Paolo Bonzini [Thu, 20 Oct 2011 11:16:25 +0000 (13:16 +0200)]
block: change discard to co_discard
Since coroutine operation is now mandatory, convert both bdrv_discard
implementations to coroutines. For qcow2, this means taking the lock
around the operation. raw-posix remains synchronous.
The bdrv_discard callback is then unused and can be eliminated.
Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Thu, 20 Oct 2011 11:16:24 +0000 (13:16 +0200)]
block: change flush to co_flush
Since coroutine operation is now mandatory, convert all bdrv_flush
implementations to coroutines. For qcow2, this means taking the lock.
Other implementations are simpler and just forward bdrv_flush to the
underlying protocol, so they can avoid the lock.
The bdrv_flush callback is then unused and can be eliminated.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Thu, 20 Oct 2011 11:16:23 +0000 (13:16 +0200)]
block: take lock around bdrv_write implementations
This does the first part of the conversion to coroutines, by
wrapping bdrv_write implementations to take the mutex.
Drivers that implement bdrv_write rather than bdrv_co_writev can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Thu, 20 Oct 2011 11:16:22 +0000 (13:16 +0200)]
block: take lock around bdrv_read implementations
This does the first part of the conversion to coroutines, by
wrapping bdrv_read implementations to take the mutex.
Drivers that implement bdrv_read rather than bdrv_co_readv can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.
raw-win32 does not need the lock, because it cannot yield.
nbd also doesn't probably, but better be safe.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Thu, 20 Oct 2011 11:16:21 +0000 (13:16 +0200)]
block: add a CoMutex to synchronous read drivers
The big conversion of bdrv_read/write to coroutines caused the two
homonymous callbacks in BlockDriver to become reentrant. It goes
like this:
1) bdrv_read is now called in a coroutine, and calls bdrv_read or
bdrv_pread.
2) the nested bdrv_read goes through the fast path in bdrv_rw_co_entry;
3) in the common case when the protocol is file, bdrv_co_do_readv calls
bdrv_co_readv_em (and from here goes to bdrv_co_io_em), which yields
until the AIO operation is complete;
4) if bdrv_read had been called from a bottom half, the main loop
is free to iterate again: a device model or another bottom half
can then come and call bdrv_read again.
This applies to all four of read/write/flush/discard. It would also
apply to is_allocated, but it is not used from within coroutines:
besides qemu-img.c and qemu-io.c, which operate synchronously, the
only user is the monitor. Copy-on-read will introduce a use in the
block layer, and will require converting it.
The solution is "simply" to convert all drivers to coroutines! We
just need to add a CoMutex that is taken around affected operations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If that can happen, however, the code is bogus. vmdk_parent_open
reads from bs->file:
if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
but it is always called with s->desc_offset == 0 and with the same
bs->file. So the data that vmdk_parent_open reads comes always from the
same place, and anyway there is only one place where it can write it,
namely bs->backing_file.
So, if it cannot happen, the patched code is okay.
It is also possible that the recursive call can happen, but only once. In
that case there would still be a bug in vmdk_open_desc_file setting
s->desc_offset = 0, but the patched code is okay.
Finally, in the case where multiple recursive calls can happen the code
would need to be rewritten anyway. It is likely that this would anyway
involve adding several parameters to vmdk_parent_open, and calling it from
vmdk_open_vmdk4.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Thu, 20 Oct 2011 14:37:26 +0000 (16:37 +0200)]
pc: Fix floppy drives with if=none
Commit 63ffb564 broke floppy devices specified on the command line like
-drive file=...,if=none,id=floppy -global isa-fdc.driveA=floppy because it
relies on drive_get() which works only with -fda/-drive if=floppy.
This patch resembles what we're already doing for IDE, i.e. remember the floppy
device that was created and use that to extract the BlockDriverStates where
needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
Kevin Wolf [Tue, 18 Oct 2011 15:12:44 +0000 (17:12 +0200)]
qcow2: Fix bdrv_write_compressed error handling
If during allocation of compressed clusters the cluster was already allocated
uncompressed, fail and properly release the l2_table (the latter avoids a
failed assertion).
While at it, make it return some real error numbers instead of -1.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Kevin Wolf [Tue, 18 Oct 2011 14:41:45 +0000 (16:41 +0200)]
fdc: Fix floppy port I/O
The floppy device was broken by commit 212ec7ba (fdc: Convert to
isa_register_portio_list). While the old interface provided the port number
relative to the floppy drive's io_base, the new one provides the real port
number, so we need to apply a bitmask now to get the register number.
Stefan Hajnoczi [Mon, 17 Oct 2011 10:32:13 +0000 (12:32 +0200)]
block: drop redundant bdrv_flush implementation
Block drivers now only need to provide either of .bdrv_co_flush,
.bdrv_aio_flush() or for legacy drivers .bdrv_flush(). Remove
the redundant .bdrv_flush() implementations.
[Paolo Bonzini: change raw driver to bdrv_co_flush]
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Mon, 17 Oct 2011 10:32:12 +0000 (12:32 +0200)]
block: unify flush implementations
Add coroutine support for flush and apply the same emulation that
we already do for read/write. bdrv_aio_flush is simplified to always
go through a coroutine.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Fri, 21 Oct 2011 10:16:44 +0000 (12:16 +0200)]
xen_disk: Always set feature-barrier = 1
The synchronous .bdrv_flush callback doesn't exist any more and a device really
shouldn't poke into the block layer internals anyway. All drivers are supposed
to have a correctly working bdrv_flush, so let's just hard-code this.
Peter Maydell [Tue, 18 Oct 2011 15:12:54 +0000 (16:12 +0100)]
hw/omap2: Wire up the IRQ for the 2430's fifth GPIO module
The OMAP2430 version of the omap-gpio device has five GPIO modules,
not four like the other OMAP2 versions; wire up the fifth module's
IRQ line correctly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
In file included from trace.c:2:0:
trace.h: In function ‘trace_v9fs_attach’:
trace.h:2850:9: error: too many arguments for format [-Werror=format-extra-args]
trace.h: In function ‘trace_v9fs_wstat’:
trace.h:3039:9: error: too many arguments for format [-Werror=format-extra-args]
trace.h: In function ‘trace_v9fs_mkdir’:
trace.h:3088:9: error: too many arguments for format [-Werror=format-extra-args]
trace.h: In function ‘trace_v9fs_mkdir_return’:
trace.h:3095:9: error: too many arguments for format [-Werror=format-extra-args]
Fix the format strings and also use %u instead of %d for unsigned values
in the changed strings. There are more minor errors of this kind
which I did not fix because that would make the review more difficult.
v2: Fixed position of } for v9fs_mkdir_return.
Cc: Harsh Prateek Bora <harsh@linux.vnet.ibm.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Juan Quintela [Wed, 23 Feb 2011 18:56:52 +0000 (19:56 +0100)]
migration: propagate error correctly
unix and tcp outgoing migration have error values, but didn't returned
it. Make them return the error. Notice that EINPROGRESS & EWOULDBLOCK
are not considered errors as call will be retry later.
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Juan Quintela [Tue, 11 May 2010 21:18:34 +0000 (23:18 +0200)]
migration: Our release callback was just free
We called it from a single place, and always with state !=
MIG_STATE_ACTIVE. Just remove the whole callback. For users of the
notifier, notice that this is exactly the case where they don't care,
we are just freeing the state from previous failed migration (it can't
be a sucessful one, otherwise we would not be running on that machine
in the first place).
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Juan Quintela [Tue, 22 Feb 2011 22:32:54 +0000 (23:32 +0100)]
migration: Introduce migrate_fd_completed() for consistency
This function is a bit different of the others that change the state,
in the sense that if migrate_fd_cleanup() returns an error, it set the
status to error, not completed.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Juan Quintela [Wed, 21 Sep 2011 20:32:08 +0000 (22:32 +0200)]
buffered_file: reuse QEMUFile has_error field
Instead of having two has_error fields in QEMUFile & QEMUBufferedFile,
reuse the 1st one. Notice that the one in buffered_file is only set
after a file operation.
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Juan Quintela [Wed, 21 Sep 2011 16:12:58 +0000 (18:12 +0200)]
buffered_file: Use right "opaque"
buffered_close 's' variable is of type QEMUFileBuffered, and
wait_for_unfreeze() expect to receive a MigrationState, that
'coincidentaly' is s->opaque.
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Yoshiaki Tamura [Tue, 22 Feb 2011 15:01:24 +0000 (00:01 +0900)]
migration: add error handling to migrate_fd_put_notify().
Although migrate_fd_put_buffer() sets MIG_STATE_ERROR if it failed,
since migrate_fd_put_notify() isn't checking error of underlying
QEMUFile, those resources are kept open. This patch checks it and
calls migrate_fd_error() in case of error.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Juan Quintela [Tue, 4 Oct 2011 13:28:31 +0000 (15:28 +0200)]
savevm: improve subsections detection on load
We add qemu_peek_buffer, that is identical to qemu_get_buffer, just
that it don't update f->buf_index.
We add a paramenter to qemu_peek_byte() to be able to peek more than
one byte.
Once this is done, to see if we have a subsection we look:
- 1st byte is QEMU_VM_SUBSECTION
- 2nd byte is a length, and is bigger than section name
- 3rd element is a string that starts with section_name
So, we shouldn't have false positives (yes, content could still get us
wrong but probabilities are really low).
v2:
- Alex Williamsom found that we could get negative values on index.
- Rework code to fix that part.
- Rewrite qemu_get_buffer() using qemu_peek_buffer()
v3:
- return "done" on error case
v4:
- fix qemu_file_skip() off by one.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Luiz Capitulino [Fri, 14 Oct 2011 14:18:09 +0000 (11:18 -0300)]
runstate: Allow user to migrate twice
It should be a matter of allowing the transition POSTMIGRATE ->
FINISH_MIGRATE, but it turns out that the VM won't do the
transition the second time because it's already stopped.
So this commit also adds vm_stop_force_state() which performs
the transition even if the VM is already stopped.
Luiz Capitulino [Thu, 13 Oct 2011 14:36:40 +0000 (11:36 -0300)]
savevm: qemu_savevm_state(): Drop stop VM logic
qemu_savevm_state() has some logic to stop the VM and to (or not to)
resume it. But this seems to be a big noop, as qemu_savevm_state()
is only called by do_savevm() when the VM is already stopped.
So, let's drop qemu_savevm_state()'s stop VM logic.
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Luiz Capitulino [Thu, 13 Oct 2011 17:39:59 +0000 (14:39 -0300)]
runstate: Allow to transition from paused to postmigrate
The user may already have paused the VM before starting the
migration process. If s/he does that, then the state will be
'paused' when we finish the migration process. In that case
we want to transition from 'paused' to 'postmigrate' as the
latter is now the real reason why the VM is stopped.
Jan Kiszka [Fri, 7 Oct 2011 07:19:53 +0000 (09:19 +0200)]
i8259: Convert to qdev
This key cleanup step requires to move the IRQ debugging bit from
i8259_set_irq directly to the per-PIC pic_set_irq, to pass the PIC
parameters (I/O base, ELCR address and mask, master/slave mode) as
qdev properties, and to interconnect the PICs with their environment via
GPIO pins.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Jan Kiszka [Fri, 7 Oct 2011 07:19:51 +0000 (09:19 +0200)]
i8259: Eliminate PicState2
Introduce a reference to the slave PIC for the few cases we need to
access it without a proper pointer at hand and drop PicState2. We could
even live without slave_pic if we had a better way of modeling the
cascade bus the PICs are attached to (in addition to the ISA bus).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Jan Kiszka [Fri, 7 Oct 2011 07:19:50 +0000 (09:19 +0200)]
i8259: Replace PicState::pics_state with master flag
This reflects how real PICs indentify their role (in non-buffered mode):
Pass the state of the /SP input on pic_init and use it instead of
pics_state to differentiate between master and slave mode.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Jan Kiszka [Fri, 7 Oct 2011 07:19:49 +0000 (09:19 +0200)]
i8259: PREP: Replace pic_intack_read with pic_read_irq
There is nothing in the i8259 spec that justifies the special
pic_intack_read. At least the Linux PREP kernels configure the PICs
properly so that pic_read_irq returns identical values, and setting
read_reg_select in PIC0 cannot be derived from any special i8259 mode.
So switch ppc_prep to pic_read_irq and drop the now unused PIC code.
CC: Andreas Färber <andreas.faerber@web.de> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Jan Kiszka [Fri, 7 Oct 2011 07:19:47 +0000 (09:19 +0200)]
i8259: Fix poll command
This was probably never used so far: According to the spec, polling
means ack'ing the pending IRQ and setting its corresponding bit in isr.
Moreover, we have to signal a pending IRQ via bit 7 of the returned
value, and we must not return a spurious IRQ if none is pending.
This implements the poll command without the help of pic_poll_read which
is left untouched as pic_intack_read is still using it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>