Roger Pau Monne [Thu, 24 Aug 2017 15:07:03 +0000 (16:07 +0100)]
xen/pt: allow QEMU to request MSI unmasking at bind time
When a MSI interrupt is bound to a guest using
xc_domain_update_msi_irq (XEN_DOMCTL_bind_pt_irq) the interrupt is
left masked by default.
This causes problems with guests that first configure interrupts and
clean the per-entry MSIX table mask bit and afterwards enable MSIX
globally. In such scenario the Xen internal msixtbl handlers would not
detect the unmasking of MSIX entries because vectors are not yet
registered since MSIX is not enabled, and vectors would be left
masked.
Introduce a new flag in the gflags field to signal Xen whether a MSI
interrupt should be unmasked after being bound.
This also requires to track the mask register for MSI interrupts, so
QEMU can also notify to Xen whether the MSI interrupt should be bound
masked or unmasked
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reported-by: Andreas Kinzler <hfp@posteo.de> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit a8036336609d2e184fc3543a4c439c0ba7d7f3a2)
Gerd Hoffmann [Mon, 28 Aug 2017 12:29:06 +0000 (14:29 +0200)]
vga: stop passing pointers to vga_draw_line* functions
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).
Impact: DoS for privileged guest users. qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.
Jan Beulich [Wed, 21 Jun 2017 15:42:41 +0000 (16:42 +0100)]
xen/disk: don't leak stack data via response ring
Rather than constructing a local structure instance on the stack, fill
the fields directly on the shared ring, just like other (Linux)
backends do. Build on the fact that all response structure flavors are
actually identical (the old code did make this assumption too).
This is XSA-216.
Reported-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Gerd Hoffmann [Tue, 14 Feb 2017 18:09:59 +0000 (19:09 +0100)]
cirrus/vnc: zap bitblit support from console code.
There is a special code path (dpy_gfx_copy) to allow graphic emulation
notify user interface code about bitblit operations carryed out by
guests. It is supported by cirrus and vnc server. The intended purpose
is to optimize display scrolls and just send over the scroll op instead
of a full display update.
This is rarely used these days though because modern guests simply don't
use the cirrus blitter any more. Any linux guest using the cirrus drm
driver doesn't. Any windows guest newer than winxp doesn't ship with a
cirrus driver any more and thus uses the cirrus as simple framebuffer.
So this code tends to bitrot and bugs can go unnoticed for a long time.
See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV"
which fixes a bug lingering in the code for almost a year, added by
commit "c7628bf vnc: only alloc server surface with clients connected".
Also the vnc server will throttle the frame rate in case it figures the
network can't keep up (send buffers are full). This doesn't work with
dpy_gfx_copy, for any copy operation sent to the vnc client we have to
send all outstanding updates beforehand, otherwise the vnc client might
run the client side blit on outdated data and thereby corrupt the
display. So this dpy_gfx_copy "optimization" might even make things
worse on slow network links.
Gerd Hoffmann [Tue, 21 Feb 2017 18:54:59 +0000 (10:54 -0800)]
cirrus: add blit_is_unsafe call to cirrus_bitblt_cputovideo
CIRRUS_BLTMODE_MEMSYSSRC blits do NOT check blit destination
and blit width, at all. Oops. Fix it.
Security impact: high.
The missing blit destination check allows to write to host memory.
Basically same as CVE-2014-8106 for the other blit variants.
The missing blit width check allows to overflow cirrus_bltbuf,
with the attractive target cirrus_srcptr (current cirrus_bltbuf write
position) being located right after cirrus_bltbuf in CirrusVGAState.
Due to cirrus emulation writing cirrus_bltbuf bytewise the attacker
hasn't full control over cirrus_srcptr though, only one byte can be
changed. Once the first byte has been modified further writes land
elsewhere.
Bruce Rogers [Tue, 21 Feb 2017 18:54:38 +0000 (10:54 -0800)]
display: cirrus: ignore source pitch value as needed in blit_is_unsafe
Commit 4299b90 added a check which is too broad, given that the source
pitch value is not required to be initialized for solid fill operations.
This patch refines the blit_is_unsafe() check to ignore source pitch in
that case. After applying the above commit as a security patch, we
noticed the SLES 11 SP4 guest gui failed to initialize properly.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 20170109203520.5619-1-brogers@suse.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Li Qiang [Thu, 9 Feb 2017 22:36:41 +0000 (14:36 -0800)]
cirrus: fix oob access issue (CVE-2017-2615)
When doing bitblt copy in backward mode, we should minus the
blt width first just like the adding in the forward mode. This
can avoid the oob access of the front of vga's vram.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
{ kraxel: with backward blits (negative pitch) addr is the topmost
address, so check it as-is against vram size ]
Cc: qemu-stable@nongnu.org Cc: P J P <ppandit@redhat.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Wolfgang Bumiller <w.bumiller@proxmox.com> Fixes: d3532a0db02296e687711b8cdc7791924efccea0 (CVE-2014-8106) Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485938101-26602-1-git-send-email-kraxel@redhat.com Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
P J P [Mon, 25 Jul 2016 12:07:18 +0000 (17:37 +0530)]
virtio: error out if guest exceeds virtqueue size
A broken or malicious guest can submit more requests than the virtqueue
size permits.
The guest can submit requests without bothering to wait for completion
and is therefore not bound by virtqueue size. This requires reusing
vring descriptors in more than one request, which is incorrect but
possible. Processing a request allocates a VirtQueueElement and
therefore causes unbounded memory allocation controlled by the guest.
Exit with an error if the guest provides more requests than the
virtqueue size permits. This bounds memory allocation and makes the
buggy guest visible to the user.
upstream-commit-id: afd9096eb1882f23929f5b5c177898ed231bac66 Reported-by: Zhenhao Hong <zhenhaohong@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Ian Jackson [Thu, 26 May 2016 15:21:56 +0000 (16:21 +0100)]
main loop: Big hammer to fix logfile disk DoS in Xen setups
Each time round the main loop, we now fstat stderr. If it is too big,
we dup2 /dev/null onto it. This is not a very pretty patch but it is
very simple, easy to see that it's correct, and has a low risk of
collateral damage.
There is no limit by default but can be adjusted by setting a new
environment variable.
This fixes CVE-2014-3672.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Set the default to 0 so that it won't affect non-xen installation. The
limit will be set by Xen toolstack.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Gerd Hoffmann [Tue, 17 May 2016 08:54:54 +0000 (10:54 +0200)]
vga: add sr_vbe register set
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression. The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.
This patch introduces a new sr_vbe register set. The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[]. Normal vga register reads and
writes go to sr[]. Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.
This way we can allow guests update sr[] registers as they want, without
allowing them disrupt vbe video modes that way.
vga: make sure vga register setup for vbe stays intact (CVE-2016-3712).
Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode. This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.
Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.
Which is good for one or another buffer overflow. Not that
critical as they typically read overflows happening somewhere
in the display code. So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.
When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer. Move that code to a separate function so we can call it
from other places too.
vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.
The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register. The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.
Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.
Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.
Wei Liu [Tue, 12 Apr 2016 10:43:14 +0000 (11:43 +0100)]
xenfb: use the correct condition to avoid excessive looping
In commit ac0487e1 ("xenfb.c: avoid expensive loops when prod <=
out_cons"), ">=" was used. In fact, a full ring is a legit state.
Correct the test to use ">".
Don Slutz [Mon, 30 Nov 2015 22:11:04 +0000 (17:11 -0500)]
exec: Stop using memory after free
memory_region_unref(mr) can free memory.
For example I got:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f43280d4700 (LWP 4462)]
0x00007f43323283c0 in phys_section_destroy (mr=0x7f43259468b0)
at /home/don/xen/tools/qemu-xen-dir/exec.c:1023
1023 if (mr->subpage) {
(gdb) bt
at /home/don/xen/tools/qemu-xen-dir/exec.c:1023
at /home/don/xen/tools/qemu-xen-dir/exec.c:1034
at /home/don/xen/tools/qemu-xen-dir/exec.c:2205
(gdb) p mr
$1 = (MemoryRegion *) 0x7f43259468b0
And this change prevents this.
Signed-off-by: Don Slutz <Don.Slutz@Gmail.com>
Message-Id: <1448921464-21845-1-git-send-email-Don.Slutz@Gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On Xen MSIs can be remapped into pirqs, which are a type of event
channels. It's mostly for the benefit of PCI passthrough devices, to
avoid the overhead of interacting with the emulated lapic.
However remapping interrupts and MSIs is also supported for emulated
devices, such as the e1000 and virtio-net.
When an interrupt or an MSI is remapped into a pirq, masking and
unmasking is done by masking and unmasking the event channel. The
masking bit on the PCI config space or MSI-X table should be ignored,
but it isn't at the moment.
As a consequence emulated devices which use MSI or MSI-X, such as
virtio-net, don't work properly (the guest doesn't receive any
notifications). The mechanism was working properly when xen_apic was
introduced, but I haven't narrowed down which commit in particular is
causing the regression.
Fix the issue by ignoring the masking bit for MSI and MSI-X which have
been remapped into pirqs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
net: pcnet: add check to validate receive data size(CVE-2015-7504)
In loopback mode, pcnet_receive routine appends CRC code to the
receive buffer. If the data size given is same as the buffer size,
the appended CRC code overwrites 4 bytes after s->buffer. Added a
check to avoid that.
Reported by: Qinghao Tang <luodalongde@gmail.com> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
Jason Wang [Mon, 30 Nov 2015 07:00:06 +0000 (15:00 +0800)]
pcnet: fix rx buffer overflow(CVE-2015-7512)
Backends could provide a packet whose length is greater than buffer
size. Check for this and truncate the packet to avoid rx buffer
overflow in this case.
Cc: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
Gerd Hoffmann [Mon, 14 Dec 2015 08:21:23 +0000 (09:21 +0100)]
ehci: make idt processing more robust
Make ehci_process_itd return an error in case we didn't do any actual
iso transfer because we've found no active transaction. That'll avoid
ehci happily run in circles forever if the guest builds a loop out of
idts.
This is CVE-2015-8558.
Cc: qemu-stable@nongnu.org Reported-by: Qinghao Tang <luodalongde@gmail.com> Tested-by: P J P <ppandit@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
While checking r/w access in 'memory_access_is_direct' routine
a glitch in the expression leads to segmentation fault while
performing dma read operation.
net: cadence_gem: check packet size in gem_recieve
While receiving packets in 'gem_receive' routine, if Frame Check
Sequence(FCS) is enabled, it copies the packet into a local
buffer without checking its size. Add check to validate packet
length against the buffer size to avoid buffer overflow.
Reported-by: Ling Liu <liuling-it@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
When processing NCQ commands, AHCI device emulation prepares a
NCQ transfer object; To which an aio control block(aiocb) object
is assigned in 'execute_ncq_command'. In case, when the NCQ
command is invalid, the 'aiocb' object is not assigned, and NCQ
transfer object is left as 'used'. This leads to a use after
free kind of error in 'bdrv_aio_cancel_async' via 'ahci_reset_port'.
Reset NCQ transfer object to 'unused' to avoid it.
[Maintainer edit: s/ACHI/AHCI/ in the commit message. --js]
Reported-by: Qinghao Tang <luodalongde@gmail.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1452282511-4116-1-git-send-email-ppandit@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
net: ne2000: fix bounds check in ioport operations
While doing ioport r/w operations, ne2000 device emulation suffers
from OOB r/w errors. Update respective array bounds check to avoid
OOB access.
Reported-by: Ling Liu <liuling-it@360.cn> Cc: qemu-stable@nongnu.org Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
While processing transmit(tx) descriptors in 'tx_consume' routine
the switch emulator suffers from an off-by-one error, if a
descriptor was to have more than allowed(ROCKER_TX_FRAGS_MAX=16)
fragments. Fix an incorrect bounds check to avoid it.
Reported-by: Qinghao Tang <luodalongde@gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
P J P [Mon, 21 Dec 2015 09:43:13 +0000 (15:13 +0530)]
scsi: initialise info object with appropriate size
While processing controller 'CTRL_GET_INFO' command, the routine
'megasas_ctrl_get_info' overflows the '&info' object size. Use its
appropriate size to null initialise it.
Reported-by: Qinghao Tang <luodalongde@gmail.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512211501420.22471@wniryva> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: P J P <ppandit@redhat.com>
P J P [Tue, 15 Dec 2015 06:57:54 +0000 (12:27 +0530)]
net: vmxnet3: avoid memory leakage in activate_device
Vmxnet3 device emulator does not check if the device is active
before activating it, also it did not free the transmit & receive
buffers while deactivating the device, thus resulting in memory
leakage on the host. This patch fixes both these issues to avoid
host memory leakage.
Reported-by: Qinghao Tang <luodalongde@gmail.com> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com>
While sending 'SetPixelFormat' messages to a VNC server,
the client could set the 'red-max', 'green-max' and 'blue-max'
values to be zero. This leads to a floating point exception in
write_png_palette while doing frame buffer updates.
Reported-by: Lian Yihan <lianyihan@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Laszlo Ersek [Tue, 19 Jan 2016 13:17:20 +0000 (14:17 +0100)]
e1000: eliminate infinite loops on out-of-bounds transfer start
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:
- the TDLEN and RDLEN registers store the total size of the descriptor
area,
- while the TDH and RDH registers store the offset (in whole tx / rx
descriptors) into the area where the transfer is supposed to start.
Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).
QEMU already contains logic to deal with bogus transfers submitted by the
guest:
- Normally, the transmit case wants to increase TDH from its initial value
to TDT. (TDT is allowed to be numerically smaller than the initial TDH
value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
that QEMU currently has here is a check against reaching the original
TDH value again -- a complete wraparound, which should never happen.
- In the receive case RDH is increased from its initial value until
"total_size" bytes have been received; preferably in a single step, or
in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
RX descriptors are skipped without receiving data, while RDH is
incremented just the same. QEMU tries to prevent an infinite loop
(processing only null RX descriptors) by detecting whether RDH assumes
its original value during the loop. (Again, wrapping from RDLEN to 0 is
normal.)
What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.
The condition that expresses this is:
xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)
i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.
This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.
Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Prasad Pandit <ppandit@redhat.com> Cc: Michael Roth <mdroth@linux.vnet.ibm.com> Cc: Jason Wang <jasowang@redhat.com> Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
xenfb.c: avoid expensive loops when prod <= out_cons
If the frontend sets out_cons to a value higher than out_prod, it will
cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
better checks at the beginning of the function.
Jan Beulich [Wed, 9 Dec 2015 15:46:57 +0000 (15:46 +0000)]
xen/MSI-X: really enforce alignment
The way the generic infrastructure works the intention of not allowing
unaligned accesses can't be achieved by simply setting .unaligned to
false. The benefit is that we can now replace the conditionals in
{get,set}_entry_value() by assert()-s.
Jan Beulich [Wed, 9 Dec 2015 15:45:29 +0000 (15:45 +0000)]
xen/MSI-X: latch MSI-X table writes
The remaining log message in pci_msix_write() is wrong, as there guest
behavior may only appear to be wrong: For one, the old logic didn't
take the mask-all bit into account. And then this shouldn't depend on
host device state (i.e. the host may have masked the entry without the
guest having done so). Plus these writes shouldn't be dropped even when
an entry gets unmasked. Instead, if they can't be made take effect
right away, they should take effect on the next unmasking or enabling
operation - the specification explicitly describes such caching
behavior.
The Xen toolstack uses "vhd" to specify a disk in VHD format, however
the name of the driver in QEMU is "vpc". Replace "vhd" with "vpc", so
that QEMU can find the right driver to use for it.
xenfb: avoid reading twice the same fields from the shared page
Reading twice the same field could give the guest an attack of
opportunity. In the case of event->type, gcc could compile the switch
statement into a jump table, effectively ending up reading the type
field multiple times.
xen/blkif: Avoid double access to src->nr_segments
src is stored in shared memory and src->nr_segments is dereferenced
twice at the end of the function. If a compiler decides to compile this
into two separate memory accesses then the size limitation could be
bypassed.
Fix it by removing the double access to src->nr_segments.
xen/pt: Don't slurp wholesale the PCI configuration registers
Instead we have the emulation registers ->init functions which
consult the host values to see what the initial value should be
and they are responsible for populating the dev.config.
xen/pt: Check for return values for xen_host_pci_[get|set] in init
and if we have failures we call xen_pt_destroy introduced in
'xen/pt: Move bulk of xen_pt_unregister_device in its own routine.'
and free all of the allocated structures.
To deal with xen_host_pci_[set|get]_ functions returning error values
and clearing ourselves in the init function we should make the
.exit (xen_pt_unregister_device) function be idempotent in case
the generic code starts calling .exit (or for fun does it before
calling .init!).
We do not want to have two entries to cache the guest configuration
registers: XenPTReg->data and dev.config. Instead we want to use
only the dev.config.
To do without much complications we rip out the ->data field
and replace it with an pointer to the dev.config. This way we
have the type-checking (uint8_t, uint16_t, etc) and as well
and pre-computed location.
Alternatively we could compute the offset in dev.config by
using the XenPTRRegInfo and XenPTRegGroup every time but
this way we have the pre-computed values.
This change also exposes some mis-use:
- In 'xen_pt_status_reg_init' we used u32 for the Capabilities Pointer
register, but said register is an an u16.
- In 'xen_pt_msgdata_reg_write' we used u32 but should have only use u16.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
xen/pt: Check if reg->init function sets the 'data' past the reg->size
It should never happen, but in case it does (an developer adds
a new register and the 'init_val' expands past the register
size) we want to report. The code will only write up to
reg->size so there is no runtime danger of the register spilling
across other ones - however to catch this sort of thing
we still return an error.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
For a passthrough device we maintain a state of emulated
registers value contained within d->config. We also consult
the host registers (and apply ro and write masks) whenever
the guest access the registers. This is done in xen_pt_pci_write_config
and xen_pt_pci_read_config.
Also in this picture we call pci_default_write_config which
updates the d->config and if the d->config[PCI_COMMAND] register
has PCI_COMMAND_MEMORY (or PCI_COMMAND_IO) acts on those changes.
On startup the d->config[PCI_COMMAND] are the host values, not
what the guest initial values should be, which is exactly what
we do _not_ want to do for 64-bit BARs when the guest just wants
to read the size of the BAR. Huh you say?
To get the size of 64-bit memory space BARs, the guest has
to calculate ((BAR[x] & 0xFFFFFFF0) + ((BAR[x+1] & 0xFFFFFFFF) << 32))
which means it has to do two writes of ~0 to BARx and BARx+1.
prior to this patch and with XSA120-addendum patch (Linux kernel)
the PCI_COMMAND register is copied from the host it can have
PCI_COMMAND_MEMORY bit set which means that QEMU will try to
update the hypervisor's P2M with BARx+1 value to ~0 (0xffffffff)
(to sync the guest state to host) instead of just having
xen_pt_pci_write_config and xen_pt_bar_reg_write apply the
proper masks and return the size to the guest.
To thwart this, this patch syncs up the host values with the
guest values taking into account the emu_mask (bit set means
we emulate, PCI_COMMAND_MEMORY and PCI_COMMAND_IO are set).
That is we copy the host values - masking out any bits which
we will emulate. Then merge it with the initial emulation register
values. Lastly this value is then copied both in
dev.config _and_ XenPTReg->data field.
There is also reg->size accounting taken into consideration
that ends up being used in patch.
xen/pt: Check if reg->init function sets the 'data' past the reg->size
[The DEBUG is to illustate what the hvmloader was doing]
Also we swap from xen_host_pci_long to using xen_host_pci_get_[byte,word,long].
Otherwise we get:
xen_pt_config_reg_init: Offset 0x0004 mismatch! Emulated=0x0000, host=0x2300017, syncing to 0x2300014.
xen_pt_config_reg_init: Error: Offset 0x0004:0x2300014 expands past register size(2)!
which is not surprising. We read the value as an 32-bit (from host),
then operate it as a 16-bit - and the remainder is left unchanged.
We end up writing the value as 16-bit (so 0014) to dev.config
(as we use proper xen_set_host_[byte,word,long] so we don't spill
to other registers) but in XenPTReg->data it is as 32-bit (0x2300014)!
It is harmless as the read/write functions end up using an size mask
and never modify the bits past 16-bit (reg->size is 2).
This patch fixes the warnings by reading the value using the
proper size.
Note that the check for size is still left in-case the developer
sets bits past the reg->size in the ->init routines. The author
tried to fiddle with QEMU_BUILD_BUG to make this work but failed.
xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config
During init time we treat the dev.config area as a cache
of the host view. However during execution time we treat it
as guest view (by the generic PCI API). We need to sync Xen's
code to the generic PCI API view. This is the first step
by replacing all of the code that uses dev.config or
pci_get_[byte|word] to get host value to actually use the
xen_host_pci_get_[byte|word] functions.
Interestingly in 'xen_pt_ptr_reg_init' we also needed to swap
reg_field from uint32_t to uint8_t - since the access is only
for one byte not four bytes. We can split this as a seperate
patch however we would have to use a cast to thwart compiler
warnings in the meantime.
We also truncated 'flags' to 'flag' to make the code fit within
the 80 characters.
xen/pt: Use XEN_PT_LOG properly to guard against compiler warnings.
If XEN_PT_LOGGING_ENABLED is enabled the XEN_PT_LOG macros start
using the first argument. Which means if within the function there
is only one user of the argument ('d') and XEN_PT_LOGGING_ENABLED
is not set, we get compiler warnings. This is not the case now
but with the "xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config"
we will hit - so this sync up the function to the rest of them.
xen/pt/msi: Add the register value when printing logging and error messages
We would like to know what the MSI register value is to help
in troubleshooting in the field. As such modify the logging
logic to include such details in xen_pt_msgctrl_reg_write.
xen/pt: xen_host_pci_config_read returns -errno, not -1 on failure
However the init routines assume that on errors the return
code is -1 (as the libxc API is) - while those xen_host_* routines follow
another paradigm - negative errno on return, 0 on success.
Currently IGD drivers always need to access PCH by 1f.0. But we
don't want to poke that directly to get ID, and although in real
world different GPU should have different PCH. But actually the
different PCH DIDs likely map to different PCH SKUs. We do the
same thing for the GPU. For PCH, the different SKUs are going to
be all the same silicon design and implementation, just different
features turn on and off with fuses. The SW interfaces should be
consistent across all SKUs in a given family (eg LPT). But just
same features may not be supported.
Most of these different PCH features probably don't matter to the
Gfx driver, but obviously any difference in display port connections
will so it should be fine with any PCH in case of passthrough.
So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
scenarios, 0x9cc3 for BDW(Broadwell).
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
Now we retrieve VGA bios like kvm stuff in qemu but we need to
fix Device Identification in case if its not matched with the
real IGD device since Seabios is always trying to compare this
ID to work out VGA BIOS.
We will try to reuse assign_dev_load_option_rom in xen side, and
especially its a good beginning to unify pci assign codes both on
kvm and xen in the future.
[Fix build for Windows]
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one. And we also just expose
a minimal real host bridge pci configuration subset.
[Replace pread with lseek and read to fix Windows build]
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
Roger Pau Monne [Fri, 13 Nov 2015 17:38:06 +0000 (17:38 +0000)]
xen: fix usage of xc_domain_create in domain builder
Due to the addition of HVMlite and the requirement to always provide a
valid xc_domain_configuration_t, xc_domain_create now always takes an arch
domain config, which can be NULL in order to mimic previous behaviour.
Add a small stub called xen_domain_create that encapsulates the correct
call to xc_domain_create depending on the libxc version detected.
xen: use errno instead of rc for xc_domain_add_to_physmap
In Xen 4.6 commit cd2f100f0f61b3f333d52d1737dd73f02daee592
"libxc: Fix do_memory_op to return negative value on errors"
made the libxc API less odd-ball: On errors, return value is
-1 and error code is in errno. On success the return value
is either 0 or an positive value.
Since we could be running with an old toolstack in which the
Exx value is in rc or the newer, we add an wrapper around
the xc_domain_add_to_physmap (called xen_xc_domain_add_to_physmap)
which will always return the EXX.
Xen 4.6 did not change the libxc functions mentioned (same parameters)
so we piggyback on the fact that Xen 4.6 has a new function:
commit 504ed2053362381ac01b98db9313454488b7db40 "tools/libxc: Expose
new hypercall xc_reserved_device_memory_map" and check for that.
Lan Tianyu [Sun, 11 Oct 2015 15:19:24 +0000 (23:19 +0800)]
Qemu/Xen: Fix early freeing MSIX MMIO memory region
msix->mmio is added to XenPCIPassthroughState's object as property.
object_finalize_child_property is called for XenPCIPassthroughState's
object, which calls object_property_del_all, which is going to try to
delete msix->mmio. object_finalize_child_property() will access
msix->mmio's obj. But the whole msix struct has already been freed
by xen_pt_msix_delete. This will cause segment fault when msix->mmio
has been overwritten.
spice: surface switch fast path requires same format too.
Commit "555e72f spice: rework mirror allocation, add no-resize fast path"
adds a fast path for surface switches which does't go through the full
primary surface destroy and re-recreation in case the new surface is
identical to the old one (page-flip). It checks the size only though,
but the format must be identical too. This patch adds the format check.
Commit "0002a51 ui/spice: Support shared surface for most pixman
formats" increases the chance to actually trigger this.
Jan Beulich [Fri, 24 Jul 2015 09:38:28 +0000 (03:38 -0600)]
xen/HVM: atomically access pointers in bufioreq handling
The number of slots per page being 511 (i.e. not a power of two) means
that the (32-bit) read and write indexes going beyond 2^32 will likely
disturb operation. The hypervisor side gets I/O req server creation
extended so we can indicate that we're using suitable atomic accesses
where needed, allowing it to atomically canonicalize both pointers when
both have gone through at least one cycle.
The Xen side counterpart (which is not a functional prereq to this
change, albeit a build one) went in already (commit b7007bc6f9).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Pavel Butsykin [Mon, 26 Oct 2015 11:42:57 +0000 (14:42 +0300)]
virtio: sync the dataplane vring state to the virtqueue before virtio_save
When creating snapshot with the dataplane enabled, the snapshot file gets
not the actual state of virtqueue, because the current state is stored in
VirtIOBlockDataPlane. Therefore, before saving snapshot need to sync
the dataplane vring state to the virtqueue. The dataplane will resume its
work at the next notify virtqueue.
When snapshot loads with loadvm we get a message:
VQ 0 size 0x80 Guest index 0x15f5 inconsistent with Host index 0x0:
delta 0x15f5
error while loading state for instance 0x0 of device
'0000:00:08.0/virtio-blk'
Error -1 while loading VM state
to reproduce the error I used the following hmp commands:
savevm snap1
loadvm snap1
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1445859777-2982-1-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 10a06fd65f667a972848ebbbcac11bdba931b544) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Filippov [Sun, 19 Jul 2015 07:02:37 +0000 (10:02 +0300)]
target-xtensa: add window overflow check to L32E/S32E
Despite L32E and S32E primary use is for window underflow and overflow
exception handlers they are just normal instructions, and thus need to
check for window overflow.
Cc: qemu-stable@nongnu.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit f822b7e497fa6a662094b491f86441015f363362) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit 5be7d9f1b1452613b95c6ba70b8d7ad3d0797991
vhost-net: tell tap backend about the vnet endianness
makes vhost net always try to set LE - even if that matches the
native endian-ness.
This makes it fail on older kernels on x86 without TUNSETVNETLE support.
To fix, make qemu_set_vnet_le/qemu_set_vnet_be skip the
ioctl if it matches the host endian-ness.
Reported-by: Marcel Apfelbaum <marcel@redhat.com> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Cc: qemu-stable@nongnu.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
(cherry picked from commit 052bd52fa978d3f04bc476137ad6e1b9a697f9bd) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The test doesn't check that the output makes any sense, only that QEMU
survives. Useful since we've had an astounding number of crash bugs
around there.
In fact, we have a bunch of them right now: a few devices crash or
hang, and some leave dangling pointers behind. The test skips testing
the broken parts. The next commits will fix them up, and drop the
skipping.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1443689999-12182-8-git-send-email-armbru@redhat.com>
(cherry picked from commit 2d1abb850fd15fd6eb75a92290be5f93b2772ec5) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
New convenience function hmp() to facilitate use of
human-monitor-command in tests. Use it to simplify its existing uses.
To blend into existing libqtest code, also add qtest_hmpv() and
qtest_hmp(). That, and the egregiously verbose GTK-Doc comment format
make this patch look bigger than it is.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1443689999-12182-7-git-send-email-armbru@redhat.com>
(cherry picked from commit 5fb48d9673b76fc53507a0e717a12968e57d846e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We want to run qom-test for every architecture, without having to
manually add it to every architecture's list of tests. Commit 3687d53
accomplished this by adding it to every architecture's list
automatically.
However, some architectures inherit their tests from others, like this:
For such architectures, we ended up running the (slow!) test twice.
Commit 2b8419c attempted to avoid this by adding the test only when
it's not already present. Works only as long as we consider adding
the test to the architectures on the left hand side *after* the ones
on the right hand side: x86_64 after i386, microblazeel after
microblaze, xtensaeb after xtensa.
Turns out we consider them in $(SYSEMU_TARGET_LIST) order. Defined as
On my machine, this results in the oder xtensa, x86_64, microblazeel,
microblaze, i386. Consequently, qom-test runs twice for microblazeel
and x86_64.
Replace this complex and flawed machinery with a much simpler one: add
generic tests (currently just qom-test) to check-qtest-generic-y
instead of check-qtest-$(target)-y for every target, then run
$(check-qtest-generic-y) for every target.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-Id: <1443689999-12182-5-git-send-email-armbru@redhat.com>
(cherry picked from commit e253c287153c6f3ce4177686ac12c196f9bd8292) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Paolo Bonzini [Thu, 1 Oct 2015 08:59:52 +0000 (10:59 +0200)]
macio: move DBDMA_init from instance_init to realize
DBDMA_init is not idempotent, and calling it from instance_init
breaks a simple object_new/object_unref pair. Work around this,
pending qdev-ification of DBDMA, by moving the call to realize.
Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1443689999-12182-4-git-send-email-armbru@redhat.com>
(cherry picked from commit c7104402353bf32ac1d3a276e3619a20e910506b) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Paolo Bonzini [Thu, 1 Oct 2015 08:59:51 +0000 (10:59 +0200)]
hw: do not pass NULL to memory_region_init from instance_init
This causes the region to outlive the object, because it attaches the
region to /machine. This is not nice for the "realize" method, but
much worse for "instance_init" because it can cause dangling pointers
after a simple object_new/object_unref pair.
Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1443689999-12182-3-git-send-email-armbru@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 81e0ab48dda611e9571dc2e166840205a4208567)
Conflicts:
hw/display/cg3.c
hw/display/tcx.c
* removed context dependencies on &error_fatal/&error_abort
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Paolo Bonzini [Thu, 1 Oct 2015 08:59:50 +0000 (10:59 +0200)]
memory: allow destroying a non-empty MemoryRegion
This is legal; the MemoryRegion will simply unreference all the
existing subregions and possibly bring them down with it as well.
However, it requires a bit of care to avoid an infinite loop.
Finalizing a memory region cannot trigger an address space update,
but memory_region_del_subregion errs on the side of caution and
might trigger a spurious update: avoid that by resetting mr->enabled
first.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1443689999-12182-2-git-send-email-armbru@redhat.com>
(cherry picked from commit 2e2b8eb70fdb7dfbec39f3a19b20f9a73f2f813e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The next commit will compile hw/input/virtio-input.c and
hw/input/virtio-input-hid.c even when CONFIG_LINUX is off. These
files include both "include/standard-headers/linux/input.h" and
<windows.h> then. Doesn't work, because both define SW_MAX. We don't
actually use it. Patch input.h to define SW_MAX_ instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444320700-26260-2-git-send-email-armbru@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ac98fa849e834f48e5a64cf4b22218ba4047e142)
Paolo Bonzini [Wed, 16 Sep 2015 15:38:44 +0000 (17:38 +0200)]
trace: remove malloc tracing
The malloc vtable is not supported anymore in glib, because it broke
when constructors called g_malloc. Remove tracing of g_malloc,
g_realloc and g_free calls.
Note that, for systemtap users, glib also provides tracepoints
glib.mem_alloc, glib.mem_free, glib.mem_realloc, glib.slice_alloc
and glib.slice_free.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1442417924-25831-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 98cf48f60aa4999f5b2808569a193a401a390e6a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Jason Wang [Fri, 25 Sep 2015 05:21:30 +0000 (13:21 +0800)]
virtio-net: correctly drop truncated packets
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:
- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated
In order to be consistent with vhost, fix by discarding the descriptor
in this case.
Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0cf33fb6b49a19de32859e2cdc6021334f448fb3) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Jason Wang [Fri, 25 Sep 2015 05:21:29 +0000 (13:21 +0800)]
virtio: introduce virtqueue_discard()
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that will discard
descriptor when packet is truncated.
Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 29b9f5efd78ae0f9cc02dd169b6e80d2c404bade) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Jason Wang [Fri, 25 Sep 2015 05:21:28 +0000 (13:21 +0800)]
virtio: introduce virtqueue_unmap_sg()
Factor out sg unmapping logic. This will be reused by the patch that
can discard descriptor.
Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Andrew James <andrew.james@hpe.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ce317461573bac12b10d67699b4ddf1f97cf066c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Migration: Generate the completed event only when we complete
The current migration-completed event is generated a bit too early,
which means that an eager libvirt that's ready to go as soon
as it sees the event ends up racing with the actual end of migration.
This corresponds to RH bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1271145
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com>
xSigned-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ed1f3e0090069dcb9458aa9e450df12bf8eba0b0) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tony Krowiak [Mon, 12 Oct 2015 15:36:21 +0000 (11:36 -0400)]
util/qemu-config: fix missing machine command line options
Commit 0a7cf217 ("util/qemu-config: fix regression of
qmp_query_command_line_options") aimed to restore parsing of global
machine options, but missed two: "aes-key-wrap" and
"dea-key-wrap" (which were present in the initial version of that
patch). Let's add them to the machine_opts again.
Fixes: 0a7cf217 ("util/qemu-config: fix regression of
qmp_query_command_line_options") CC: Marcel Apfelbaum <marcel@redhat.com> CC: qemu-stable@nongnu.org Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1444664181-28023-1-git-send-email-akrowiak@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 5bcfa0c543b42a560673cafd3b5225900ef617e1) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
s390x/kvm: Fix vector validity bit in device machine checks
Device hotplugs trigger a crw machine check. All machine checks
have validity bits for certain register types. With vector support
we also have to claim that vector registers are valid.
This is a band-aid suitable for stable. Long term we should
create the full mcic value dynamically depending on the active
features in the kernel interrupt handler.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: qemu-stable@nongnu.org Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 2ab75df38e34fe9bc271b5115ab52114e6e63a89) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The /4 for offset calculation in MMIO writes was happening twice giving
wrong write offsets. Fix.
While touching the code, change the if-else to be a short returning if
and convert the debug message to a GUEST_ERROR, which is more accurate
for this condition.
Cc: qemu-stable@nongnu.org Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c209b0537203c58a051e5d837320335cea23e494) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The reverted commit changed qdev_device_help() to reject abstract
devices and devices that have cannot_instantiate_with_device_add_yet
set, to fix crash bugs like -device x86_64-cpu,help.
Rejecting abstract devices makes sense: they're purely internal, and
the implementation of the help feature can't cope with them.
Rejecting non-pluggable devices makes less sense: even though you
can't use them with -device, the help may still be useful elsewhere,
for instance with -global. This is a regression: -device FOO,help
used to help even for FOO that aren't pluggable.
The previous two commits fixed the crash bug at a lower layer, so
reverting this one is now safe. Fixes the -device FOO,help
regression, except for the broken devices marked
cannot_even_create_with_object_new_yet. For those, the error message
is improved.
Example of a device where the regression is fixed:
$ qemu-system-x86_64 -device PIIX4_PM,help
PIIX4_PM.command_serr_enable=bool (on/off)
PIIX4_PM.multifunction=bool (on/off)
PIIX4_PM.rombar=uint32
PIIX4_PM.romfile=str
PIIX4_PM.addr=int32 (Slot and optional function number, example: 06.0 or 06)
PIIX4_PM.memory-hotplug-support=bool
PIIX4_PM.acpi-pci-hotplug-with-bridge-support=bool
PIIX4_PM.s4_val=uint8
PIIX4_PM.disable_s4=uint8
PIIX4_PM.disable_s3=uint8
PIIX4_PM.smb_io_base=uint32
Example of a device where it isn't fixed:
$ qemu-system-x86_64 -device host-x86_64-cpu,help
Can't list properties of device 'host-x86_64-cpu'
Both failed with "Parameter 'driver' expects pluggable device type"
before.
Cc: qemu-stable@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1443689999-12182-11-git-send-email-armbru@redhat.com>
(cherry picked from commit 33fe96833015cf15f4c0aa5bf8d34f60526e0732) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qdev: Protect device-list-properties against broken devices
Several devices don't survive object_unref(object_new(T)): they crash
or hang during cleanup, or they leave dangling pointers behind.
This breaks at least device-list-properties, because
qmp_device_list_properties() needs to create a device to find its
properties. Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1. Example reproducer:
Unfortunately, I can't fix the problems in these devices right now.
Instead, add DeviceClass member cannot_destroy_with_object_finalize_yet
to mark them:
* Hang during cleanup (didn't debug, so I can't say why):
"realview_pci", "versatile_pci".
* Dangling pointer in cpus: most CPUs, plus "allwinner-a10", "digic",
"fsl,imx25", "fsl,imx31", "xlnx,zynqmp", because they create such
CPUs
* Assert kvm_enabled(): "host-x86_64-cpu", host-i386-cpu",
"host-powerpc64-cpu", "host-embedded-powerpc-cpu",
"host-powerpc-cpu" (the powerpc ones can't currently reach the
assertion, because the CPUs are only registered when KVM is enabled,
but the assertion is arguably in the wrong place all the same)
Make qmp_device_list_properties() fail cleanly when the device is so
marked. This improves device-list-properties from "crashes, hangs or
leaves dangling pointers behind" to "fails". Not a complete fix, just
a better-than-nothing work-around. In the above reproducer,
device-list-properties now fails with "Can't list properties of device
'pxa2xx-pcmcia'".
This also protects -device FOO,help, which uses the same machinery
since commit ef52358 "qdev-monitor: include QOM properties in -device
FOO, help output", v2.2. Example reproducer:
Commit 6e99c63 ("net/socket: Drop net_socket_can_send") changed the
semantics around .can_receive for sockets to now require the device to
flush queued pkts when transitioning to a .can_receive=true state. But
it's OK to drop incoming packets when the link is not active.
Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2734a20b8161831ba68c9166014e00522599d1e2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Jason Wang [Fri, 11 Sep 2015 08:01:56 +0000 (16:01 +0800)]
virtio-net: unbreak self announcement and guest offloads after migration
After commit 019a3edbb25f1571e876f8af1ce4c55412939e5d ("virtio: make
features 64bit wide"). Device's guest_features was actually set after
vdc->load(). This breaks the assumption that device specific load()
function can check guest_features. For virtio-net, self announcement
and guest offloads won't work after migration.
Fixing this by defer them to virtio_net_load() where guest_features
were guaranteed to be set. Other virtio devices looks fine.
Fixes: 019a3edbb25f1571e876f8af1ce4c55412939e5d
("virtio: make features 64bit wide") Cc: qemu-stable@nongnu.org Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 1f8828ef573c83365b4a87a776daf8bcef1caa21) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cornelia Huck [Mon, 17 Aug 2015 09:48:29 +0000 (11:48 +0200)]
virtio: avoid leading underscores for helpers
Commit ef546f1275f6563e8934dd5e338d29d9f9909ca6 ("virtio: add
feature checking helpers") introduced a helper __virtio_has_feature.
We don't want to use reserved identifiers, though, so let's
rename __virtio_has_feature to virtio_has_feature and virtio_has_feature
to virtio_vdev_has_feature.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 95129d6fc9ead97155627a4ca0cfd37282883658)
* prereq for 1f8828e Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The xscmpodp and xscmpudp instructions only have the AX, BX bits in
there encoding, the lowest bit (usually TX) is marked as an invalid
bit. We therefore can't decode them with GEN_XX2FORM, which decodes
the two lowest bit.
Introduce a new form GEN_XX2FORM, which decodes AX and BX and mark
the lowest bit as invalid.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8f60f8e2e574f341709128ff7637e685fd640254) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>