ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem
ivshmem can be configured with and without interrupt capability
(a.k.a. "doorbell"). The two configurations have largely disjoint
options, which makes for a confusing (and badly checked) user
interface. Moreover, the device can't tell the guest whether its
doorbell is enabled.
Create two new device models ivshmem-plain and ivshmem-doorbell, and
deprecate the old one.
Changes from ivshmem:
* PCI revision is 1 instead of 0. The new revision is fully backwards
compatible for guests. Guests may elect to require at least
revision 1 to make sure they're not exposed to the funny "no shared
memory, yet" state.
* Property "role" replaced by "master". role=master becomes
master=on, role=peer becomes master=off. Default is off instead of
auto.
* Property "use64" is gone. The new devices always have 64 bit BARs.
Changes from ivshmem to ivshmem-plain:
* The Interrupt Pin register in PCI config space is zero (does not use
an interrupt pin) instead of one (uses INTA).
* Property "x-memdev" is renamed to "memdev".
* Properties "shm" and "size" are gone. Use property "memdev"
instead.
* Property "msi" is gone. The new device can't have MSI-X capability.
It can't interrupt anyway.
* Properties "ioeventfd" and "vectors" are gone. They're meaningless
without interrupts anyway.
Changes from ivshmem to ivshmem-doorbell:
* Property "msi" is gone. The new device always has MSI-X capability.
* Property "ioeventfd" defaults to on instead of off.
* Property "size" is gone. The new device can only map all the shared
memory received from the server.
Guests can easily find out whether the device is configured for
interrupts by checking for MSI-X capability.
Note: some code added in sub-optimal places to make the diff easier to
review. The next commit will move it to more sensible places.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
ivshmem: Simplify memory regions for BAR 2 (shared memory)
ivshmem_realize() puts the shared memory region in a container region.
Used to be necessary to permit delayed mapping of the shared memory.
However, we recently moved to synchronous mapping, in "ivshmem:
Receive shared memory synchronously in realize()" and the commit
following it. The container is redundant since then. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-33-git-send-email-armbru@redhat.com>
ivshmem has its very own code to create and map shared memory.
Replace that with an implicitly created memory backend. Reduces the
number of ways we create BAR 2 from three to two.
The memory-backend-file is currently available only with CONFIG_LINUX,
so this adds a second Linuxism to ivshmem (the other one is eventfd).
Should we ever need to make it portable to systems where
memory-backend-file can't be made to serve, we could create a
memory-backend-shmem that allocates memory with shm_open().
Bonus fix: shared memory files are now created with permissions 0655
instead of 0777.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-32-git-send-email-armbru@redhat.com>
ivshmem: Simplify how we cope with short reads from server
Short reads from a UNIX domain sockets are exceedingly unlikely when
the other side always sends eight bytes and we always read eight
bytes. We cope with them anyway. However, the code doing that is
rather convoluted. Dumb it down radically.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-30-git-send-email-armbru@redhat.com>
ivshmem: Drop the hackish test for UNIX domain chardev
The chardev must be capable of transmitting SCM_RIGHTS ancillary
messages. We check it by comparing CharDriverState member filename to
"unix:". That's almost as brittle as it is disgusting.
When the actual transmission all happened asynchronously, this check
was all we could do in realize(), and thus better than nothing. But
now we receive at least one SCM_RIGHTS synchronously in realize(),
it's not worth its keep anymore. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-29-git-send-email-armbru@redhat.com>
ivshmem: Rely on server sending the ID right after the version
The protocol specification (ivshmem-spec.txt, formerly
ivshmem_device_spec.txt) has always required the ID message to be sent
right at the beginning, and ivshmem-server has always complied. The
device, however, accepts it out of order. If an interrupt setup
arrived before it, though, it would be misinterpreted as connect
notification. Fix the latent bug by relying on the spec and
ivshmem-server's actual behavior.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-28-git-send-email-armbru@redhat.com>
ivshmem: Receive shared memory synchronously in realize()
When configured for interrupts (property "chardev" given), we receive
the shared memory from an ivshmem server. We do so asynchronously
after realize() completes, by setting up callbacks with
qemu_chr_add_handlers().
Keeping server I/O out of realize() that way avoids delays due to a
slow server. This is probably relevant only for hot plug.
However, this funny "no shared memory, yet" state of the device also
causes a raft of issues that are hard or impossible to work around:
* The guest is exposed to this state: when we enter and leave it its
shared memory contents is apruptly replaced, and device register
IVPosition changes.
This is a known issue. We document that guests should not access
the shared memory after device initialization until the IVPosition
register becomes non-negative.
For cold plug, the funny state is unlikely to be visible in
practice, because we normally receive the shared memory long before
the guest gets around to mess with the device.
For hot plug, the timing is tighter, but the relative slowness of
PCI device configuration has a good chance to hide the funny state.
In either case, guests complying with the documented procedure are
safe.
* Migration becomes racy.
If migration completes before the shared memory setup completes on
the source, shared memory contents is silently lost. Fortunately,
migration is rather unlikely to win this race.
If the shared memory's ramblock arrives at the destination before
shared memory setup completes, migration fails.
There is no known way for a management application to wait for
shared memory setup to complete.
All you can do is retry failed migration. You can improve your
chances by leaving more time between running the destination QEMU
and the migrate command.
To mitigate silent memory loss, you need to ensure the server
initializes shared memory exactly the same on source and
destination.
These issues are entirely undocumented so far.
I'd expect the server to be almost always fast enough to hide these
issues. But then rare catastrophic races are in a way the worst kind.
This is way more trouble than I'm willing to take from any device.
Kill the funny state by receiving shared memory synchronously in
realize(). If your hot plug hangs, go kill your ivshmem server.
For easier review, this commit only makes the receive synchronous, it
doesn't add the necessary error propagation. Without that, the funny
state persists. The next commit will do that, and kill it off for
real.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
ivshmem: Plug leaks on unplug, fix peer disconnect
close_peer_eventfds() cleans up three things: ioeventfd triggers if
they exist, eventfds, and the array to store them.
Commit 98609cd (v1.2.0) fixed it not to clean up ioeventfd triggers
when they don't exist (property ioeventfd=off, which is the default).
Unfortunately, the fix also made it skip cleanup of the eventfds and
the array then. This is a memory and file descriptor leak on unplug.
Additionally, the reset of nb_eventfds is skipped. Doesn't matter on
unplug. On peer disconnect, however, this permanently wedges the
interrupt vectors used for that peer's ID. The eventfds stay behind,
but aren't connected to a peer anymore. When the ID gets recycled for
a new peer, the new peer's eventfds get assigned to vectors after the
old ones. Commonly, the device's number of vectors matches the
server's, so the new ones get dropped with a "Too many eventfd
received" message. Interrupts either don't work (common case) or go
to the wrong vector.
Fix by narrowing the conditional to just the ioeventfd trigger
cleanup.
While there, move the "invalid" peer check to the only caller where it
can actually happen, and tighten it to reject own ID.
Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-25-git-send-email-armbru@redhat.com>
ivshmem: Simplify rejection of invalid peer ID from server
ivshmem_read() processes server messages. These are 64 bit signed
integers. -1 is shared memory setup, 16 bit unsigned is a peer ID,
anything else is invalid.
ivshmem_read() rejects invalid negative messages right away, silently.
Invalid positive messages get rejected only in resize_peers(), and
ivshmem_read() then prints the rather cryptic message "failed to
resize peers array".
Extend the first check to cover all invalid messages, make it report
"server sent invalid message", and drop the second check.
Now resize_peers() can't fail anymore; simplify.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-23-git-send-email-armbru@redhat.com>
An interrupt is set up when the interrupt's file descriptor is
received. Each message applies to the next interrupt vector.
Therefore, each vector cannot be set up more than once.
ivshmem_add_kvm_msi_virq() half-heartedly tries not to rely on this by
doing nothing then, but that's not going to recover from this error
should it become possible in the future. watch_vector_notifier()
doesn't even try.
Simply assert what is the case, so we get alerted if we ever screw it
up.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-22-git-send-email-armbru@redhat.com>
The ivshmem device can either use MSI-X or legacy INTx for interrupts.
With MSI-X enabled, peer interrupt events trigger an MSI as they
should. But software can still raise INTx via interrupt status and
mask register in BAR 0. This is explicitly prohibited by PCI Local
Bus Specification Revision 3.0, section 6.8.3.3:
While enabled for MSI or MSI-X operation, a function is prohibited
from using its INTx# pin (if implemented) to request service (MSI,
MSI-X, and INTx# are mutually exclusive).
Fix the device model to leave INTx alone when using MSI-X.
Document that we claim to use INTx in config space even when we don't.
Unlike other devices, ivshmem does *not* use INTx when configured for
MSI-X and MSI-X isn't enabled by software.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-21-git-send-email-armbru@redhat.com>
* ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X
variant of the device is selected with msi=off.
* msix_present() is true when the device has the PCI capability MSI-X.
It's initially false, and becomes true during successful realize of
the MSI-X variant of the device. Thus, it's the same as
ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices.
* msix_enabled() is true when msix_present() is true and guest software
has enabled MSI-X.
Code that differs between the non-MSI-X and the MSI-X variant of the
device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or
by msix_present(), except the latter works only for realized devices.
Code that depends on whether MSI-X is in use needs to be guarded with
msix_enabled().
Code review led me to two minor messes:
* ivshmem_vector_notify() calls msix_notify() even when
!msix_enabled(), unlike most other MSI-X-capable devices. As far as
I can tell, msix_notify() does nothing when !msix_enabled(). Add
the guard anyway.
* Most callers of ivshmem_use_msix() guard it with
ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because
ivshmem_use_msix() does nothing when !msix_present(). That's
ivshmem's only use of msix_present(), though. Guard it
consistently, and drop the now redundant msix_present() check.
While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
We reuse errp after passing it host_memory_backend_get_memory(). If
both host_memory_backend_get_memory() and the reuse set an error, the
reuse will fail the assertion in error_setv(). Fortunately,
host_memory_backend_get_memory() can't fail.
Pass it &error_abort to make our assumption explicit, and to get the
assertion failure in the right place should it become invalid.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-17-git-send-email-armbru@redhat.com>
ivshmem: Don't destroy the chardev on version mismatch
Yes, the chardev is commonly useless after we read a bad version from
it, but destroying it is inappropriate anyway: the user created it, so
the user should be able to hold on to it as long as he likes. We
don't destroy it on other errors. Screwed up in commit 5105b1d.
Stop reading instead.
Also note QEMU's behavior in ivshmem-spec.txt.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-16-git-send-email-armbru@redhat.com>
This started as an attempt to update ivshmem_device_spec.txt for
clarity, accuracy and completeness while working on its code, and
quickly became a full rewrite. Since the diff would be useless
anyway, I'm using the opportunity to rename the file to
ivshmem-spec.txt.
I tried hard to ensure the new text contradicts neither the old text
nor the code. If the new text contradicts the old text but not the
code, it's probably a bug in the old text. If the new text
contradicts both, its probably a bug in the new text.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-11-git-send-email-armbru@redhat.com>
ivshmem-test: Improve test cases /ivshmem/server-*
Document missing test: behavior with MSI-X present but not enabled.
For MSI-X, we test and clear the interrupt pending bit before testing
the interrupt. For INTx, we only clear. Change to test and clear for
consistency.
Test MSI-X vector 1 in addition to vector 0.
Improve comments.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-10-git-send-email-armbru@redhat.com>
ivshmem-test: Clean up wait for devices to become operational
test_ivshmem_server() waits until the first byte in BAR 2 contains the
0x42 we put into shared memory. Works because the byte reads zero
until the device maps the shared memory gotten from the server.
Check the IVPosition register instead: it's initially -1, and becomes
non-negative right when the device maps the share memory, so no
change, just cleaner, because it's what guest software is supposed to
do.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-9-git-send-email-armbru@redhat.com>
tests/libqos/pci-pc: Fix qpci_pc_iomap() to map BARs aligned
qpci_pc_iomap() maps BARs one after the other, without padding. This
is wrong. PCI Local Bus Specification Revision 3.0, 6.2.5.1. Address
Maps: "all address spaces used are a power of two in size and are
naturally aligned". That's because the size of a BAR is given by the
number of address bits the device decodes, and the BAR needs to be
mapped at a multiple of that size to ensure the address decoding
works.
Fix qpci_pc_iomap() accordingly. This takes care of a FIXME in
ivshmem-test.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-7-git-send-email-armbru@redhat.com>
event_notifier: Make event_notifier_init_fd() #ifdef CONFIG_EVENTFD
Event notifiers are designed for eventfd(2). They can fall back to
pipes, but according to Paolo, event_notifier_init_fd() really
requires the real thing, and should therefore be under #ifdef
CONFIG_EVENTFD. Do that.
Its only user is ivshmem, which is currently CONFIG_POSIX. Narrow it
to CONFIG_EVENTFD.
Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-6-git-send-email-armbru@redhat.com>
ivshmem-server: Don't overload POSIX shmem and file name
Option -m NAME is interpreted as directory name if we can statfs() it
and its on hugetlbfs. Else it's interpreted as POSIX shared memory
object name. This is nuts.
Always interpret -m as directory. Create new -M for POSIX shared
memory. Last of -m or -M wins.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-4-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The code to find the minimum page size is is vulnerable to TOCTTOU.
Added in commit 2d103aa "target-ppc: fix hugepage support when using
memory-backend-file" (v2.4.0). Since I can't fix it myself right now,
add a FIXME comment.
Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-2-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Peter Maydell [Thu, 17 Mar 2016 15:59:42 +0000 (15:59 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Thu 17 Mar 2016 15:49:29 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (29 commits)
iotests: Test QUORUM_REPORT_BAD in fifo mode
quorum: Emit QUORUM_REPORT_BAD for reads in fifo mode
block: Use blk_co_pwritev() in blk_co_write_zeroes()
block: Use blk_aio_prwv() for aio_read/write/write_zeroes
block: Use blk_prw() in blk_pread()/blk_pwrite()
block: Use blk_co_pwritev() in blk_write_zeroes()
block: Pull up blk_read_unthrottled() implementation
block: Use blk_co_pwritev() for blk_write()
block: Use blk_co_preadv() for blk_read()
block: Use BdrvChild in BlockBackend
block: Remove bdrv_states list
block: Use bdrv_next() instead of bdrv_states
block: Rewrite bdrv_next()
block: Add blk_next_root_bs()
block: Add bdrv_next_monitor_owned()
block: Move some bdrv_*_all() functions to BB
blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()
blockdev: Split monitor reference from BB creation
blockdev: Separate BB name management
blockdev: Add list of all BlockBackends
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alberto Garcia [Tue, 15 Mar 2016 09:41:36 +0000 (11:41 +0200)]
iotests: Test QUORUM_REPORT_BAD in fifo mode
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: c0a8dbfdbe939520cda5f661af6f1cd7b6b4df9d.1458034554.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
Alberto Garcia [Tue, 15 Mar 2016 09:41:35 +0000 (11:41 +0200)]
quorum: Emit QUORUM_REPORT_BAD for reads in fifo mode
If there's an I/O error in one of Quorum children then QEMU
should emit QUORUM_REPORT_BAD. However this is not working with
read-pattern=fifo. This patch fixes this problem.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: d57e39e8d3e8564003a1e2aadbd29c97286eb2d2.1458034554.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Tue, 8 Mar 2016 12:47:47 +0000 (13:47 +0100)]
block: Use blk_co_preadv() for blk_read()
This patch introduces blk_co_preadv() as a central function on the
BlockBackend level that is supposed to handle all read requests from the
BB to its root BDS eventually.
Max Reitz [Wed, 16 Mar 2016 18:54:43 +0000 (19:54 +0100)]
block: Rewrite bdrv_next()
Instead of using the bdrv_states list, iterate over all the
BlockDriverStates attached to BlockBackends, and over all the
monitor-owned BDSs afterwards (except for those attached to a BB).
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:38 +0000 (19:54 +0100)]
blockdev: Split monitor reference from BB creation
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.
In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".
If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:37 +0000 (19:54 +0100)]
blockdev: Separate BB name management
Introduce separate functions (monitor_add_blk() and
monitor_remove_blk()) which set or unset a BB name. Since the name is
equivalent to the monitor's reference to a BB, adding a name the same as
declaring the BB to be monitor-owned and removing it revokes this
status, hence the function names.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:35 +0000 (19:54 +0100)]
blockdev: Rename blk_backends
The blk_backends list does not contain all BlockBackends but only the
ones which are referenced by the monitor, and that is not necessarily
true for every BlockBackend. Rename the list to monitor_block_backends
to make that fact clear.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:34 +0000 (19:54 +0100)]
block: Drop BB name from bad option error
The information which BB is concerned does not seem useful enough to
justify its existence in most other place (which may be related to qemu
printing the -drive parameter in question anyway, and for blockdev-add
the attribution is naturally unambiguous). Furthermore, as of a future
patch, bdrv_get_device_name(bs) will always return the empty string
before bdrv_open_inherit() returns.
Therefore, just dropping that information seems to be the best course of
action.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:33 +0000 (19:54 +0100)]
qapi: Drop QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
Just specifying a custom string is simpler in basically all places that
used it, and in addition, specifying the BB or node name is something we
generally do not do in other error messages when opening a BDS, so we
should not do it here.
This changes the output for iotest 036 (to the better, in my opinion),
so the reference output needs to be changed accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:31 +0000 (19:54 +0100)]
block: Add blk_commit_all()
Later, we will remove bdrv_commit_all() and move its contents here, and
in order to replace bdrv_commit_all() calls by calls to blk_commit_all()
before doing so, we need to add it as an alias now.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 16 Mar 2016 18:54:30 +0000 (19:54 +0100)]
block: Use blk_next() in block-backend.c
Instead of iterating directly through blk_backends, we can use
blk_next() instead. This gives us some abstraction from the list itself
which we can use to rename it, for example.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Peter Maydell [Thu, 17 Mar 2016 11:27:54 +0000 (11:27 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Thu 17 Mar 2016 11:08:28 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
Note that commit df9a681dc9ad41c9cdeb9ecc5d060ba9abd27e01 included some
unrelated hunks, possibly due to a merge failure or an overlooked
squash. This only reverts the qed .bdrv_drain() implementation.
The qed .bdrv_drain() implementation is unsafe and can lead to a double
request completion.
Paolo Bonzini reports:
"The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
unconditionally, but this is not correct if an allocating write is
queued. In this case, qed_unplug_allocating_write_reqs will restart the
allocating write and possibly cause it to complete. The aiocb however
is still in use for the L2/L1 table writes, and will then be completed
again as soon as the table writes are stable."
For QEMU 2.6 we can simply revert this commit. A full solution for the
qed need check timer may be added if the bdrv_drain() implementation is
extended.
Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1457431876-8475-1-git-send-email-stefanha@redhat.com
Eduardo Habkost [Tue, 16 Feb 2016 20:59:07 +0000 (18:59 -0200)]
module: Rename machine_init() to opts_init()
The only remaining users of machine_init() only call
qemu_add_opts(). Rename machine_init() to opts_init() and move it
closer to the qemu_add_opts() calls on vl.c.
Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Peter Maydell [Wed, 16 Mar 2016 17:43:37 +0000 (17:43 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160316-1' into staging
target-arm queue:
* loader: Fix incorrect parameter name in load_image_mr()
* Implement MRS (banked) and MSR (banked) instructions
* virt: Implement versioning for machine model
* i.MX: some initial patches preparing for i.MX6 support
* new ASPEED AST2400 SoC and palmetto-bmc machine
* bcm2835: add some more raspi2 devices
* sd: fix segfault running "info qtree"
# gpg: Signature made Wed 16 Mar 2016 17:42:43 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
* remotes/pmaydell/tags/pull-target-arm-20160316-1: (21 commits)
sd: Fix "info qtree" on boards with SD cards
bcm2835_dma: add emulation of Raspberry Pi DMA controller
bcm2835_property: implement framebuffer control/configuration properties
bcm2835_fb: add framebuffer device for Raspberry Pi
bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block
bcm2835_peripherals: enable sdhci pending-insert quirk for raspberry pi
hw/arm: Add palmetto-bmc machine
hw/arm: Add ASPEED AST2400 SoC model
hw/intc: Add (new) ASPEED VIC device model
hw/timer: Add ASPEED timer device model
i.MX: Add missing descriptions in devices.
i.MX: Add i.MX6 CCM and ANALOG device.
i.MX: Add the CLK_IPG_HIGH clock
i.MX: Remove CCM useless clock computation handling.
i.MX: Rename CCM NOCLK to CLK_NONE for naming consistency.
i.MX: Allow GPT timer to rollover.
arm: virt: Move machine class init code to the abstract machine type
arm: virt: Add an abstract ARM virt machine type
target-arm: Fix translation level on early translation faults
target-arm: Implement MRS (banked) and MSR (banked) instructions
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 16 Mar 2016 17:06:02 +0000 (17:06 +0000)]
sd: Fix "info qtree" on boards with SD cards
The SD card object is not a SysBusDevice, so don't create it with
qdev_create() if we're not assigning it to a specific bus; use
object_new() instead.
This was causing 'info qtree' to segfault on boards with SD cards,
because qdev_create(NULL, TYPE_FOO) puts the created object on the
system bus, and then we may try to run functions like sysbus_dev_print()
on it, which fail when casting the object to SysBusDevice.
(This is the same mistake that we made with the NAND device
and fixed in commit 6749695eaaf346c1.)
Grégory ESTRADE [Wed, 16 Mar 2016 17:06:02 +0000 (17:06 +0000)]
bcm2835_dma: add emulation of Raspberry Pi DMA controller
At present, all DMA transfers complete inline (so a looping descriptor
queue will lock up the device). We also do not model pause/abort,
arbitrarion/priority, or debug features.
Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-6-git-send-email-Andrew.Baumann@microsoft.com
[AB: implement 2D mode, cleanup/refactoring for upstream submission] Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The property channel driver now interfaces with the framebuffer device
to query and set framebuffer parameters. As a result of this, the "get
ARM RAM size" query now correctly returns the video RAM base address
(not total RAM size), and the ram-size property is no longer relevant
here.
Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-5-git-send-email-Andrew.Baumann@microsoft.com
[AB: cleanup/refactoring for upstream submission] Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Grégory ESTRADE [Wed, 16 Mar 2016 17:06:01 +0000 (17:06 +0000)]
bcm2835_fb: add framebuffer device for Raspberry Pi
The framebuffer occupies the upper portion of memory (64MiB by
default), but it can only be controlled/configured via a system
mailbox or property channel (to be added by a subsequent patch).
Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-4-git-send-email-Andrew.Baumann@microsoft.com
[AB: added Windows (BGR) support and cleanup/refactoring for upstream submission] Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andrew Baumann [Wed, 16 Mar 2016 17:06:01 +0000 (17:06 +0000)]
bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block
At present only the core UART functions (data path for tx/rx) are
implemented, which is enough for UEFI to boot. The following
features/registers are unimplemented:
* Line/modem control
* Scratch register
* Extra control
* Baudrate
* SPI interfaces
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1457467526-8840-3-git-send-email-Andrew.Baumann@microsoft.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andrew Baumann [Wed, 16 Mar 2016 17:06:01 +0000 (17:06 +0000)]
bcm2835_peripherals: enable sdhci pending-insert quirk for raspberry pi
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1457467526-8840-2-git-send-email-Andrew.Baumann@microsoft.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andrew Jeffery [Wed, 16 Mar 2016 17:06:01 +0000 (17:06 +0000)]
hw/arm: Add palmetto-bmc machine
The new machine is a thin layer over the AST2400 ARM926-based SoC[1].
Between the minimal machine and the current SoC implementation there is
enough functionality to boot an aspeed_defconfig Linux kernel to
userspace. Nothing yet is specific to the Palmetto's BMC (other than
using an AST2400 SoC), but creating specific machine types is preferable
to a generic machine that doesn't match any particular hardware.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-5-git-send-email-andrew@aj.id.au Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andrew Jeffery [Wed, 16 Mar 2016 17:06:00 +0000 (17:06 +0000)]
hw/arm: Add ASPEED AST2400 SoC model
While the ASPEED AST2400 SoC[1] has a broad range of capabilities this
implementation is minimal, comprising an ARM926 processor, ASPEED VIC
and timer devices, and a 8250 UART.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-4-git-send-email-andrew@aj.id.au Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andrew Jeffery [Wed, 16 Mar 2016 17:06:00 +0000 (17:06 +0000)]
hw/intc: Add (new) ASPEED VIC device model
Implement a basic ASPEED VIC device model for the AST2400 SoC[1], with
enough functionality to boot an aspeed_defconfig Linux kernel. The model
implements the 'new' (revised) register set: While the hardware exposes
both the new and legacy register sets, accesses to the model's legacy
register set will not be serviced (however the access will be logged).
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-3-git-send-email-andrew@aj.id.au Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andrew Jeffery [Wed, 16 Mar 2016 17:06:00 +0000 (17:06 +0000)]
hw/timer: Add ASPEED timer device model
Implement basic ASPEED timer functionality for the AST2400 SoC[1]: Up to
8 timers can independently be configured, enabled, reset and disabled.
Some hardware features are not implemented, namely clock value matching
and pulse generation, but the implementation is enough to boot the Linux
kernel configured with aspeed_defconfig.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-2-git-send-email-andrew@aj.id.au Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wei Huang [Wed, 16 Mar 2016 17:05:59 +0000 (17:05 +0000)]
arm: virt: Move machine class init code to the abstract machine type
This patch moves the common class initialization code from
"virt-2.6" to the new abstract class. An empty property is added to
"virt-2.6" machine. In the meanwhile, related funtions are renamed
to "virt_2_6_*" for consistency.
Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1457717778-17727-3-git-send-email-wei@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wei Huang [Wed, 16 Mar 2016 17:05:59 +0000 (17:05 +0000)]
arm: virt: Add an abstract ARM virt machine type
In preparation for future ARM virt machine types, this patch creates
an abstract type for all ARM machines. The current machine type in
QEMU (i.e. "virt") is renamed to "virt-2.6", whose naming scheme is
similar to other architectures. For the purpose of backward compatibility,
"virt" is converted to an alias, pointing to "virt-2.6". With this patch,
"qemu -M ?" lists the following virtual machine types along with others:
virt QEMU 2.6 ARM Virtual Machine (alias of virt-2.6)
virt-2.6 QEMU 2.6 ARM Virtual Machine
Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1457717778-17727-2-git-send-email-wei@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Sergey Sorokin [Wed, 16 Mar 2016 17:05:58 +0000 (17:05 +0000)]
target-arm: Fix translation level on early translation faults
Qemu reports translation fault on 1st level instead of 0th level in case of
AArch64 address translation if the translation table walk is disabled or
the address is in the gap between the two regions.
Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-id: 1457527503-25958-1-git-send-email-afarallax@yandex.ru Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 16 Mar 2016 17:05:58 +0000 (17:05 +0000)]
target-arm: Implement MRS (banked) and MSR (banked) instructions
Starting with the ARMv7 Virtualization Extensions, the A32 and T32
instruction sets provide instructions "MSR (banked)" and "MRS
(banked)" which can be used to access registers for a mode other
than the current one:
* R<m>_<mode>
* ELR_hyp
* SPSR_<mode>
Implement the missing instructions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1456762734-23939-1-git-send-email-peter.maydell@linaro.org
Jens Wiklander [Wed, 16 Mar 2016 17:05:58 +0000 (17:05 +0000)]
loader: Fix incorrect parameter name in load_image_mr() macro
Fix a typo in the load_image_mr() macro: 'mr' was written when
the parameter name is '_mr'. (This had no visible effects since
the single use of the macro used 'mr' as the argument.)
Peter Maydell [Tue, 23 Feb 2016 14:18:32 +0000 (14:18 +0000)]
util/base64.c: Clean includes
Remove unnecessary include of config-host.h.
(This was missed by the clean-includes script because of the
incorrect use of <> for a QEMU header.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-5-git-send-email-peter.maydell@linaro.org
Peter Maydell [Tue, 23 Feb 2016 14:18:31 +0000 (14:18 +0000)]
update-linux-headers.sh: Fake types.h doesn't need to include anything
We have a fake linux/types.h which we create in update-linux-headers.h.
Now that every QEMU source file includes osdep.h, this fake header
doesn't need to include anything at all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-4-git-send-email-peter.maydell@linaro.org