]> xenbits.xensource.com Git - qemu-xen.git/log
qemu-xen.git
3 years agoxen-mapcache: Avoid entry->lock overflow stable-4.14 staging-4.14 qemu-xen-4.14.5
Ross Lagerwall [Mon, 24 Jan 2022 10:44:50 +0000 (10:44 +0000)]
xen-mapcache: Avoid entry->lock overflow

In some cases, a particular mapcache entry may be mapped 256 times
causing the lock field to wrap to 0. For example, this may happen when
using emulated NVME and the guest submits a large scatter-gather write.
At this point, the entry map be remapped causing QEMU to write the wrong
data or crash (since remap is not atomic).

Avoid this overflow by increasing the lock field to a uint32_t and also
detect it and abort rather than continuing regardless.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20220124104450.152481-1-ross.lagerwall@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit a021a2dd8b790437d27db95774969349632f856a)

3 years agoxen-hvm: Allow disabling buffer_io_timer
Jason Andryuk [Fri, 10 Dec 2021 19:34:34 +0000 (14:34 -0500)]
xen-hvm: Allow disabling buffer_io_timer

commit f37f29d31488 "xen: slightly simplify bufioreq handling" hard
coded setting req.count = 1 during initial field setup before the main
loop.  This missed a subtlety that an early exit from the loop when
there are no ioreqs to process, would have req.count == 0 for the return
value.  handle_buffered_io() would then remove state->buffered_io_timer.
Instead handle_buffered_iopage() is basically always returning true and
handle_buffered_io() always re-setting the timer.

Restore the disabling of the timer by introducing a new handled_ioreq
boolean and use as the return value.  The named variable will more
clearly show the intent of the code.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20211210193434.75566-1-jandryuk@gmail.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 9288e803e61e8d56d1c6c6aa8beb58596fb84ed9)
[perard: fix context]
(cherry picked from commit 29d18e3c0713d3b81e7a6fa05c524786969c612b)

4 years agobuild: -no-pie is no functional linker flag qemu-xen-4.14.2 qemu-xen-4.14.3 qemu-xen-4.14.4
Christian Ehrhardt [Mon, 14 Dec 2020 15:09:38 +0000 (16:09 +0100)]
build: -no-pie is no functional linker flag

Recent binutils changes dropping unsupported options [1] caused a build
issue in regard to the optionroms.

  ld -m elf_i386 -T /<<PKGBUILDDIR>>/pc-bios/optionrom//flat.lds -no-pie \
    -s -o multiboot.img multiboot.o
  ld.bfd: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)

This isn't really a regression in ld.bfd, filing the bug upstream
revealed that this never worked as a ld flag [2] - in fact it seems we
were by accident setting --nmagic).

Since it never had the wanted effect this usage of LDFLAGS_NOPIE, should be
droppable without any effect. This also is the only use-case of LDFLAGS_NOPIE
in .mak, therefore we can also remove it from being added there.

[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=983d925d
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=27050#c5

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Message-Id: <20201214150938.1297512-1-christian.ehrhardt@canonical.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit bbd2d5a8120771ec59b86a80a1f51884e0a26e53)
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
4 years agoMerge tag 'v5.0.1' into staging-4.14 qemu-xen-4.14.1
Anthony PERARD [Fri, 16 Oct 2020 10:53:28 +0000 (11:53 +0100)]
Merge tag 'v5.0.1' into staging-4.14

5.0.1

4 years agoUpdate version for 5.0.1 release
Michael Roth [Tue, 15 Sep 2020 16:27:07 +0000 (11:27 -0500)]
Update version for 5.0.1 release

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoriscv: sifive_test: Allow 16-bit writes to memory region
Nathan Chancellor [Tue, 1 Sep 2020 05:58:23 +0000 (22:58 -0700)]
riscv: sifive_test: Allow 16-bit writes to memory region

When shutting down the machine running a mainline Linux kernel, the
following error happens:

$ build/riscv64-softmmu/qemu-system-riscv64 -bios default -M virt \
    -display none -initrd rootfs.cpio -kernel Image -m 512m \
    -nodefaults -serial mon:stdio
...
Requesting system poweroff
[    4.999630] reboot: Power down
sbi_trap_error: hart0: trap handler failed (error -2)
sbi_trap_error: hart0: mcause=0x0000000000000007 mtval=0x0000000000100000
sbi_trap_error: hart0: mepc=0x000000008000d4cc mstatus=0x0000000000001822
sbi_trap_error: hart0: ra=0x000000008000999e sp=0x0000000080015c78
sbi_trap_error: hart0: gp=0xffffffe000e76610 tp=0xffffffe0081b89c0
sbi_trap_error: hart0: s0=0x0000000080015c88 s1=0x0000000000000040
sbi_trap_error: hart0: a0=0x0000000000000000 a1=0x0000000080004024
sbi_trap_error: hart0: a2=0x0000000080004024 a3=0x0000000080004024
sbi_trap_error: hart0: a4=0x0000000000100000 a5=0x0000000000005555
sbi_trap_error: hart0: a6=0x0000000000004024 a7=0x0000000080011158
sbi_trap_error: hart0: s2=0x0000000000000000 s3=0x0000000080016000
sbi_trap_error: hart0: s4=0x0000000000000000 s5=0x0000000000000000
sbi_trap_error: hart0: s6=0x0000000000000001 s7=0x0000000000000000
sbi_trap_error: hart0: s8=0x0000000000000000 s9=0x0000000000000000
sbi_trap_error: hart0: s10=0x0000000000000000 s11=0x0000000000000008
sbi_trap_error: hart0: t0=0x0000000000000000 t1=0x0000000000000000
sbi_trap_error: hart0: t2=0x0000000000000000 t3=0x0000000000000000
sbi_trap_error: hart0: t4=0x0000000000000000 t5=0x0000000000000000
sbi_trap_error: hart0: t6=0x0000000000000000

The kernel does a 16-bit write when powering off the machine, which
was allowed before commit 5d971f9e67 ("memory: Revert "memory: accept
mismatching sizes in memory_region_access_valid""). Make min_access_size
match reality so that the machine can shut down properly now.

Cc: qemu-stable@nongnu.org
Fixes: 88a07990fa ("SiFive RISC-V Test Finisher")
Fixes: 5d971f9e67 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200901055822.2721209-1-natechancellor@gmail.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit ab3d207fe89bc0c63739db19e177af49179aa457)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-ccw: fix virtio_set_ind_atomic
Halil Pasic [Tue, 16 Jun 2020 04:50:34 +0000 (06:50 +0200)]
virtio-ccw: fix virtio_set_ind_atomic

The atomic_cmpxchg() loop is broken because we occasionally end up with
old and _old having different values (a legit compiler can generate code
that accessed *ind_addr again to pick up a value for _old instead of
using the value of old that was already fetched according to the
rules of the abstract machine). This means the underlying CS instruction
may use a different old (_old) than the one we intended to use if
atomic_cmpxchg() performed the xchg part.

Let us use volatile to force the rules of the abstract machine for
accesses to *ind_addr. Let us also rewrite the loop so, we that the
new old is used to compute the new desired value if the xchg part
is not performed.

Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
Reported-by: Andre Wild <Andre.Wild1@ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200616045035.51641-2-pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 1a8242f7c3f53341dd66253b142ecd06ce1d2a97)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonvram: Exit QEMU if NVRAM cannot contain all -prom-env data
Greg Kurz [Thu, 13 Aug 2020 23:12:19 +0000 (01:12 +0200)]
nvram: Exit QEMU if NVRAM cannot contain all -prom-env data

Since commit 61f20b9dc5b7 ("spapr_nvram: Pre-initialize the NVRAM to
support the -prom-env parameter"), pseries machines can pre-initialize
the "system" partition in the NVRAM with the data passed to all -prom-env
parameters on the QEMU command line.

In this case it is assumed that all the data fits in 64 KiB, but the user
can easily pass more and crash QEMU:

$ qemu-system-ppc64 -M pseries $(for ((x=0;x<128;x++)); do \
  echo -n " -prom-env " ; printf "%0.sx" {1..1024}; \
  done) # this requires ~128 Kib
malloc(): corrupted top size
Aborted (core dumped)

This happens because we don't check if all the prom-env data fits in
the NVRAM and chrp_nvram_set_var() happily memcpy() it passed the
buffer.

This crash affects basically all ppc/ppc64 machine types that use -prom-env:
- pseries (all versions)
- g3beige
- mac99

and also sparc/sparc64 machine types:
- LX
- SPARCClassic
- SPARCbook
- SS-10
- SS-20
- SS-4
- SS-5
- SS-600MP
- Voyager
- sun4u
- sun4v

Add a max_len argument to chrp_nvram_create_system_partition() so that
it can check the available size before writing to memory.

Since NVRAM is populated at machine init, it seems reasonable to consider
this error as fatal. So, instead of reporting an error when we detect that
the NVRAM is too small and adapt all machine types to handle it, we simply
exit QEMU in all cases. This is still better than crashing. If someone
wants another behavior, I guess this can be reworked later.

Tested with:

$ yes q | \
  (for arch in ppc ppc64 sparc sparc64; do \
       echo == $arch ==; \
       qemu=${arch}-softmmu/qemu-system-$arch; \
       for mach in $($qemu -M help | awk '! /^Supported/ { print $1 }'); do \
           echo $mach; \
           $qemu -M $mach -monitor stdio -nodefaults -nographic \
           $(for ((x=0;x<128;x++)); do \
                 echo -n " -prom-env " ; printf "%0.sx" {1..1024}; \
             done) >/dev/null; \
        done; echo; \
   done)

Without the patch, affected machine types cause QEMU to report some
memory corruption and crash:

malloc(): corrupted top size

free(): invalid size

*** stack smashing detected ***: terminated

With the patch, QEMU prints the following message and exits:

NVRAM is too small. Try to pass less data to -prom-env

It seems that the conditions for the crash have always existed, but it
affects pseries, the machine type I care for, since commit 61f20b9dc5b7
only.

Fixes: 61f20b9dc5b7 ("spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter")
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1867739
Reported-by: John Snow <jsnow@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159736033937.350502.12402444542194031035.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 37035df51eaabb8d26b71da75b88a1c6727de8fa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years ago9p: null terminate fs driver options list
Prasad J Pandit [Thu, 9 Jul 2020 17:58:48 +0000 (23:28 +0530)]
9p: null terminate fs driver options list

NULL terminate fs driver options' list, validate_opt() looks for
a null entry to terminate the loop.

Fixes: aee7f3ecd8b7 ("fsdev: Error out when unsupported option is passed")
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <20200709175848.650400-1-ppandit@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 353b5a91ccf2789b85967d19a8795816b8865562)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agousb: fix setup_len init (CVE-2020-14364)
Gerd Hoffmann [Tue, 25 Aug 2020 05:36:36 +0000 (07:36 +0200)]
usb: fix setup_len init (CVE-2020-14364)

Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.

This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.

Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200825053636.29648-1-kraxel@redhat.com
(cherry picked from commit b946434f2659a182afc17e155be6791ebfb302eb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/arm/sbsa-ref: fix typo breaking PCIe IRQs
Graeme Gregory [Fri, 28 Aug 2020 09:02:43 +0000 (10:02 +0100)]
hw/arm/sbsa-ref: fix typo breaking PCIe IRQs

Fixing a typo in a previous patch that translated an "i" to a 1
and therefore breaking the allocation of PCIe interrupts. This was
discovered when virtio-net-pci devices ceased to function correctly.

Cc: qemu-stable@nongnu.org
Fixes: 48ba18e6d3f3 ("hw/arm/sbsa-ref: Simplify by moving the gic in the machine state")
Signed-off-by: Graeme Gregory <graeme@nuviainc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200821083853.356490-1-graeme@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 870f0051b4ada9a361f7454f833432ae8c06c095)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-net: align RSC fields with updated virtio-net header
Yuri Benditovich [Fri, 8 May 2020 12:59:34 +0000 (15:59 +0300)]
virtio-net: align RSC fields with updated virtio-net header

Removal of duplicated RSC definitions. Changing names of the
fields to ones defined in the Linux header.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dd3d85e89123c907be7628957457af3d03e3b85b)
 Conflicts:
hw/net/virtio-net.c
*drop context dep. on 590790297c0
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonbd: Fix large trim/zero requests
Eric Blake [Wed, 22 Jul 2020 21:22:31 +0000 (16:22 -0500)]
nbd: Fix large trim/zero requests

Although qemu as NBD client limits requests to <2G, the NBD protocol
allows clients to send requests almost all the way up to 4G.  But
because our block layer is not yet 64-bit clean, we accidentally wrap
such requests into a negative size, and fail with EIO instead of
performing the intended operation.

The bug is visible in modern systems with something as simple as:

$ qemu-img create -f qcow2 /tmp/image.img 5G
$ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
$ sudo blkdiscard /dev/nbd0

or with user-space only:

$ truncate --size=3G file
$ qemu-nbd -f raw file
$ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)'

Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0
on success, this is also a good time to fix our code to a more robust
paradigm that treats all non-negative values as success.

Alas, our iotests do not currently make it easy to add external
dependencies on blkdiscard or nbdsh, so we have to rely on manual
testing for now.

This patch can be reverted when we later improve the overall block
layer to be 64-bit clean, but for now, a minimal fix was deemed less
risky prior to release.

CC: qemu-stable@nongnu.org
Fixes: 1f4d6d18ed
Fixes: 1c6c4bb7f0
Fixes: https://github.com/systemd/systemd/issues/16242
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200722212231.535072-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: rework success tests to use >=0]
(cherry picked from commit 890cbccb089db9e646cc1baea3be9dc060e3917b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoiotests/028: Add test for cross-base-EOF reads
Max Reitz [Tue, 28 Jul 2020 12:08:05 +0000 (14:08 +0200)]
iotests/028: Add test for cross-base-EOF reads

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200728120806.265916-3-mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Claudio Fontana <cfontana@suse.de>
(cherry picked from commit ae159450e161b3e1e2c5b815d19632abbbbcd1a1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoblock: Fix bdrv_aligned_p*v() for qiov_offset != 0
Max Reitz [Tue, 28 Jul 2020 12:08:04 +0000 (14:08 +0200)]
block: Fix bdrv_aligned_p*v() for qiov_offset != 0

Since these functions take a @qiov_offset, they must always take it into
account when working with @qiov.  There are a couple of places where
they do not, but they should.

Fixes: 65cd4424b9df03bb5195351c33e04cbbecc0705c
       ("block/io: bdrv_aligned_preadv: use and support qiov_offset")
Fixes: 28c4da28695bdbe04b336b2c9c463876cc3aaa6d
       ("block/io: bdrv_aligned_pwritev: use and support qiov_offset")
Reported-by: Claudio Fontana <cfontana@suse.de>
Reported-by: Bruce Rogers <brogers@suse.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200728120806.265916-2-mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Claudio Fontana <cfontana@suse.de>
Tested-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 134b7dec6ec2d90616d7986afb3b3b7ca7a4c383)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agomigration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
Vladimir Sementsov-Ogievskiy [Mon, 27 Jul 2020 19:42:22 +0000 (22:42 +0300)]
migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start

Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking
is wrong as we do not already own the mutex.  Moreover, the adjacent
call to bdrv_dirty_bitmap_enable_successor grabs the mutex.

Fixes: 58f72b965e9e1q
Cc: qemu-stable@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-8-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit e6ce5e92248be5547daaee3eb6cd226e9820cf7b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoUpdate OpenBIOS images to 7f28286f built from submodule.
Mark Cave-Ayland [Mon, 27 Jul 2020 08:15:42 +0000 (09:15 +0100)]
Update OpenBIOS images to 7f28286f built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 54414d0fb11314ede939ec80238787c5b2079f4e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agolibvhost-user: Report descriptor index on panic
Philippe Mathieu-Daudé [Thu, 23 Jul 2020 17:19:35 +0000 (19:19 +0200)]
libvhost-user: Report descriptor index on panic

We want to report the index of the descriptor,
not its pointer.

Fixes: 7b2e5c65f4 ("contrib: add libvhost-user")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200723171935.18535-1-philmd@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 8fe9805c73c277dc2feeaa83de73d6a58bf23f39)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-pci: Changed vdev to proxy for VirtIO PCI BAR callbacks.
Andrew Melnychenko [Mon, 6 Jul 2020 11:21:23 +0000 (14:21 +0300)]
virtio-pci: Changed vdev to proxy for VirtIO PCI BAR callbacks.

There is an issue when callback may be called with invalid vdev.
It happens on unplug when vdev already deleted and VirtIOPciProxy is not.
So now, callbacks accept proxy device, and vdev retrieved from it.
Technically memio callbacks should be removed during the flatview update,
but memoryregions remain til PCI device(and it's address space) completely deleted.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1716352
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20200706112123.971087-1-andrew@daynix.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ccec7e9603f446fe75c6c563ba335c00cfda6a06)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agointel_iommu: Use correct shift for 256 bits qi descriptor
Liu Yi L [Sat, 4 Jul 2020 08:07:15 +0000 (01:07 -0700)]
intel_iommu: Use correct shift for 256 bits qi descriptor

In chapter 10.4.23 of VT-d spec 3.0, Descriptor Width bit was introduced
in VTD_IQA_REG. Software could set this bit to tell VT-d the QI descriptor
from software would be 256 bits. Accordingly, the VTD_IQH_QH_SHIFT should
be 5 when descriptor size is 256 bits.

This patch adds the DW bit check when deciding the shift used to update
VTD_IQH_REG.

Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Message-Id: <1593850035-35483-1-git-send-email-yi.l.liu@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a4544c45e109ceee87ee8c19baff28be3890d788)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: Replace free page hinting references to 'report' with 'hint'
Alexander Duyck [Mon, 20 Jul 2020 17:51:28 +0000 (10:51 -0700)]
virtio-balloon: Replace free page hinting references to 'report' with 'hint'

Recently a feature named Free Page Reporting was added to the virtio
balloon. In order to avoid any confusion we should drop the use of the word
'report' when referring to Free Page Hinting. So what this patch does is go
through and replace all instances of 'report' with 'hint" when we are
referring to free page hinting.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200720175128.21935.93927.stgit@localhost.localdomain>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3219b42f025d4d7a9c463235e9f937ab38067de3)
 Conflicts:
hw/virtio/virtio-balloon.c
include/hw/virtio/virtio-balloon.h
*drop context deps on 91b867191d and 7483cbbaf8
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agolinux-headers: update against Linux 5.7-rc3
Cornelia Huck [Mon, 27 Apr 2020 10:24:14 +0000 (12:24 +0200)]
linux-headers: update against Linux 5.7-rc3

commit 6a8b55ed4056ea5559ebe4f6a4b247f627870d4c

Reviewed-by: Michael S. Tsirkin <mst@redhat.com> # virtio/vhost parts
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200427102415.10915-3-cohuck@redhat.com>
(cherry picked from commit dc6f8d458a4ccc360723993f31d310d06469f55f)
*dep for 3219b42f02
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: always indicate S_DONE when migration fails
David Hildenbrand [Mon, 29 Jun 2020 08:06:15 +0000 (10:06 +0200)]
virtio-balloon: always indicate S_DONE when migration fails

If something goes wrong during precopy, before stopping the VM, we will
never send a S_DONE indication to the VM, resulting in the hinted pages
not getting released to be used by the guest OS (e.g., Linux).

Easy to reproduce:
1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
2. Cancel migration (e.g., HMP "migrate_cancel")
3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
   no free memory left.

While at it, add similar locking to virtio_balloon_free_page_done() as
done in virtio_balloon_free_page_stop. Locking is still weird, but that
has to be sorted out separately.

There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
comments regarding S_DONE handling.

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200629080615.26022-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit dd8eeb9671fc881e613008bd20035b85fe45383d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: Add locking to prevent possible race when starting hinting
Alexander Duyck [Mon, 20 Jul 2020 17:51:22 +0000 (10:51 -0700)]
virtio-balloon: Add locking to prevent possible race when starting hinting

There is already locking in place when we are stopping free page hinting
but there is not similar protections in place when we start. I can only
assume this was overlooked as in most cases the page hinting should not be
occurring when we are starting the hinting, however there is still a chance
we could be processing hints by the time we get back around to restarting
the hinting so we are better off making sure to protect the state with the
mutex lock rather than just updating the value with no protections.

Based on feedback from Peter Maydell this issue had also been spotted by
Coverity: CID 1430269

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200720175122.21935.78013.stgit@localhost.localdomain>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1a83e0b9c492a0eaeacd6fbb858fc81d04ab9c3e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: Prevent guest from starting a report when we didn't request one
Alexander Duyck [Mon, 20 Jul 2020 17:51:15 +0000 (10:51 -0700)]
virtio-balloon: Prevent guest from starting a report when we didn't request one

Based on code review it appears possible for the driver to force the device
out of a stopped state when hinting by repeating the last ID it was
provided.

Prevent this by only allowing a transition to the start state when we are
in the requested state. This way the driver is only allowed to send one
descriptor that will transition the device into the start state. All others
will leave it in the stop state once it has finished.

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200720175115.21935.99563.stgit@localhost.localdomain>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 20a4da0f23078deeff5ea6d1e12f47d968d7c3c9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoqdev: Fix device_add DRIVER,help to print to monitor
Markus Armbruster [Tue, 14 Jul 2020 16:01:58 +0000 (18:01 +0200)]
qdev: Fix device_add DRIVER,help to print to monitor

Help on device properties gets printed to stdout instead of the
monitor.  If you have the monitor anywhere else, no help for you.
Broken when commit e1043d674d "qdev: use object_property_help()"
accidentally switched from qemu_printf() to printf().  Switch right
back.

Fixes: e1043d674d792ff64aebae1a3eafc08b38a8a085
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200714160202.3121879-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 029afc4e76041e1a320530d97f99122a1b3d5da2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotests: tpm: Skip over pcrUpdateCounter byte in result comparison
Stefan Berger [Tue, 7 Jul 2020 20:16:25 +0000 (16:16 -0400)]
tests: tpm: Skip over pcrUpdateCounter byte in result comparison

The TPM 2 code in libtpms was fixed to handle the PCR 'TCB group' according
to the PCClient profile. The change of the PCRs belonging to the 'TCB group'
now affects the pcrUpdateCounter in the TPM2_PCRRead() responses where its
value is now different (typically lower by '1') than what it was before. To
not fail the tests, we skip the comparison of the 14th byte, which
represents the pcrUpdateCounter.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200707201625.4177419-3-stefanb@linux.vnet.ibm.com
(cherry picked from commit df8a7568932e4c3c930fdfeb228dd72b4bb71a1f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotpm: tpm_spapr: Exit on TPM backend failures
Stefan Berger [Tue, 7 Jul 2020 20:16:24 +0000 (16:16 -0400)]
tpm: tpm_spapr: Exit on TPM backend failures

Exit on TPM backend failures in the same way as the TPM CRB and TIS device
models do. With this change we now get an error report when the backend
did not start up properly:

error: internal error: qemu unexpectedly closed the monitor:
2020-07-07T12:49:28.333928Z qemu-system-ppc64: tpm-emulator: \
  TPM result for CMD_INIT: 0x101 operation failed

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200707201625.4177419-2-stefanb@linux.vnet.ibm.com
(cherry picked from commit f8b332a1ff107dc014a52eaf9bf547995205f18a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotarget/hppa: Free some temps in do_sub
Richard Henderson [Mon, 20 Jul 2020 17:35:00 +0000 (10:35 -0700)]
target/hppa: Free some temps in do_sub

Two temps allocated but not freed.  Do enough subtractions
within a single TB and one can run out of temps entirely.

Fixes: b2167459ae ("target-hppa: Implement basic arithmetic")
Buglink: https://bugs.launchpad.net/qemu/+bug/1880287
Tested-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200720174039.517902-1-richard.henderson@linaro.org>
(cherry picked from commit 79826f99feb7222b7804058f0b4ace9ee0546361)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/sd/sdcard: Do not switch to ReceivingData if address is invalid
Philippe Mathieu-Daudé [Thu, 4 Jun 2020 17:22:29 +0000 (19:22 +0200)]
hw/sd/sdcard: Do not switch to ReceivingData if address is invalid

Only move the state machine to ReceivingData if there is no
pending error. This avoids later OOB access while processing
commands queued.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.3 Data Read

  Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

  4.3.4 Data Write

  Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

WP_VIOLATION errors are not modified: the error bit is set, we
stay in receive-data state, wait for a stop command. All further
data transfer is ignored. See the check on sd->card_status at the
beginning of sd_read_data() and sd_write_data().

Fixes: CVE-2020-13253
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Buglink: https://bugs.launchpad.net/qemu/+bug/1880822
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200630133912.9428-6-f4bug@amsat.org>
(cherry picked from commit 790762e5487114341cccc5bffcec4cb3c022c3cd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/sd/sdcard: Update coding style to make checkpatch.pl happy
Philippe Mathieu-Daudé [Mon, 13 Jul 2020 07:27:35 +0000 (09:27 +0200)]
hw/sd/sdcard: Update coding style to make checkpatch.pl happy

To make the next commit easier to review, clean this code first.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200630133912.9428-3-f4bug@amsat.org>
(cherry picked from commit 794d68de2f021a6d3874df41d6bbe8590ec05207)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/sd/sdcard: Do not allow invalid SD card sizes
Philippe Mathieu-Daudé [Tue, 7 Jul 2020 11:02:34 +0000 (13:02 +0200)]
hw/sd/sdcard: Do not allow invalid SD card sizes

QEMU allows to create SD card with unrealistic sizes. This could
work, but some guests (at least Linux) consider sizes that are not
a power of 2 as a firmware bug and fix the card size to the next
power of 2.

While the possibility to use small SD card images has been seen as
a feature, it became a bug with CVE-2020-13253, where the guest is
able to do OOB read/write accesses past the image size end.

In a pair of commits we will fix CVE-2020-13253 as:

    Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
    occurred and no data transfer is performed.

    Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
    occurred and no data transfer is performed.

    WP_VIOLATION errors are not modified: the error bit is set, we
    stay in receive-data state, wait for a stop command. All further
    data transfer is ignored. See the check on sd->card_status at the
    beginning of sd_read_data() and sd_write_data().

While this is the correct behavior, in case QEMU create smaller SD
cards, guests still try to access past the image size end, and QEMU
considers this is an invalid address, thus "all further data transfer
is ignored". This is wrong and make the guest looping until
eventually timeouts.

Fix by not allowing invalid SD card sizes (suggesting the expected
size as a hint):

  $ qemu-system-arm -M orangepi-pc -drive file=rootfs.ext2,if=sd,format=raw
  qemu-system-arm: Invalid SD card size: 60 MiB
  SD card size has to be a power of 2, e.g. 64 MiB.
  You can resize disk images with 'qemu-img resize <imagefile> <new-size>'
  (note that this will lose data if you make the image smaller than it currently is).

Cc: qemu-stable@nongnu.org
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200713183209.26308-8-f4bug@amsat.org>
(cherry picked from commit a9bcedd15a5834ca9ae6c3a97933e85ac7edbd36)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/sd/sdcard: Simplify realize() a bit
Philippe Mathieu-Daudé [Wed, 6 Jun 2018 01:28:51 +0000 (22:28 -0300)]
hw/sd/sdcard: Simplify realize() a bit

We don't need to check if sd->blk is set twice.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200630133912.9428-18-f4bug@amsat.org>
(cherry picked from commit 6dd3a164f5b31c703c7d8372841ad3bd6a57de6d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/sd/sdcard: Restrict Class 6 commands to SCSD cards
Philippe Mathieu-Daudé [Wed, 3 Jun 2020 17:59:16 +0000 (19:59 +0200)]
hw/sd/sdcard: Restrict Class 6 commands to SCSD cards

Only SCSD cards support Class 6 (Block Oriented Write Protection)
commands.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.14 Command Functional Difference in Card Capacity Types

  * Write Protected Group

  SDHC and SDXC do not support write-protected groups. Issuing
  CMD28, CMD29 and CMD30 generates the ILLEGAL_COMMAND error.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200630133912.9428-7-f4bug@amsat.org>
(cherry picked from commit 9157dd597d293ab7f599f4d96c3fe8a6e07c633d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotests/acceptance/boot_linux: Expand SD card image to power of 2
Philippe Mathieu-Daudé [Tue, 7 Jul 2020 13:05:27 +0000 (15:05 +0200)]
tests/acceptance/boot_linux: Expand SD card image to power of 2

In few commits we won't allow SD card images with invalid size
(not aligned to a power of 2). Prepare the tests: add the
pow2ceil() and image_pow2ceil_expand() methods and resize the
images (expanding) of the tests using SD cards.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Message-Id: <20200713183209.26308-5-f4bug@amsat.org>
(cherry picked from commit 6a289a5ba3383e17fb47029720425bef42e424d7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotests/acceptance: refactor boot_linux_console test to allow code reuse
Pavel Dovgalyuk [Fri, 29 May 2020 07:04:44 +0000 (10:04 +0300)]
tests/acceptance: refactor boot_linux_console test to allow code reuse

This patch splits code in BootLinuxConsole class into two different
classes to allow reusing it by record/replay tests.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <159073588490.20809.13942096070255577558.stgit@pasha-ThinkPad-X280>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit 12121c496fcc609e23033c4a36399b54f98bcd56)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotests/acceptance: refactor boot_linux to allow code reuse
Pavel Dovgalyuk [Fri, 29 May 2020 07:05:31 +0000 (10:05 +0300)]
tests/acceptance: refactor boot_linux to allow code reuse

This patch moves image downloading functions to the separate class to allow
reusing them from record/replay tests.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <159073593167.20809.17582679291556188984.stgit@pasha-ThinkPad-X280>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit 1c80c87c8c2489e4318c93c844aa29bc1d014146)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotests/acceptance: allow console interaction with specific VMs
Pavel Dovgalyuk [Fri, 29 May 2020 07:04:39 +0000 (10:04 +0300)]
tests/acceptance: allow console interaction with specific VMs

Console interaction in avocado scripts was possible only with single
default VM.
This patch modifies the function parameters to allow passing a specific
VM as a parameter to interact with it.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <159073587933.20809.5122618715976660635.stgit@pasha-ThinkPad-X280>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit a5ba86d423c2b071894d86c60487f2317c7ffb60)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotests/acceptance/boot_linux: Tag tests using a SD card with 'device:sd'
Philippe Mathieu-Daudé [Mon, 13 Jul 2020 07:45:33 +0000 (09:45 +0200)]
tests/acceptance/boot_linux: Tag tests using a SD card with 'device:sd'

Avocado tags are handy to automatically select tests matching
the tags. Since these tests use a SD card, tag them.

We can run all the tests using a SD card at once with:

  $ avocado --show=app run -t u-boot tests/acceptance/
  $ AVOCADO_ALLOW_LARGE_STORAGE=ok \
    avocado --show=app \
      run -t device:sd tests/acceptance/
  Fetching asset from tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_sd
  Fetching asset from tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic
  Fetching asset from tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_uboot_netbsd9
   (1/3) tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_sd: PASS (19.56 s)
   (2/3) tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic: PASS (49.97 s)
   (3/3) tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_uboot_netbsd9: PASS (20.06 s)
  RESULTS    : PASS 3 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
  JOB TIME   : 90.02 s

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
Message-Id: <20200713183209.26308-4-f4bug@amsat.org>
(cherry picked from commit b7dcbf1395da960ec3c313300dc0030674de8cd1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agodocs/orangepi: Add instructions for resizing SD image to power of two
Niek Linnenbank [Sun, 12 Jul 2020 18:37:08 +0000 (20:37 +0200)]
docs/orangepi: Add instructions for resizing SD image to power of two

SD cards need to have a size of a power of two.
Update the Orange Pi machine documentation to include
instructions for resizing downloaded images using the
qemu-img command.

Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200712183708.15450-1-nieklinnenbank@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit 1c2329b5d644bad16e888d095e2021ad682201d9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoqga: Use qemu_get_host_name() instead of g_get_host_name()
Michal Privoznik [Mon, 22 Jun 2020 18:19:36 +0000 (20:19 +0200)]
qga: Use qemu_get_host_name() instead of g_get_host_name()

Problem with g_get_host_name() is that on the first call it saves
the hostname into a global variable and from then on, every
subsequent call returns the saved hostname. Even if the hostname
changes. This doesn't play nicely with guest agent, because if
the hostname is acquired before the guest is set up (e.g. on the
first boot, or before DHCP) we will report old, invalid hostname.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1845127
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 0d3a8f32b1e0eca279da1b0cc793efc7250c3daf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoutil: Introduce qemu_get_host_name()
Michal Privoznik [Mon, 22 Jun 2020 18:19:35 +0000 (20:19 +0200)]
util: Introduce qemu_get_host_name()

This function offers operating system agnostic way to fetch host
name. It is implemented for both POSIX-like and Windows systems.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit e47f4765afcab2b78dfa5b0115abf64d1d49a5d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoqga: fix assert regression on guest-shutdown
Marc-André Lureau [Thu, 4 Jun 2020 09:44:25 +0000 (11:44 +0200)]
qga: fix assert regression on guest-shutdown

Since commit 781f2b3d1e ("qga: process_event() simplification"),
send_response() is called unconditionally, but will assert when "rsp" is
NULL. This may happen with QCO_NO_SUCCESS_RESP commands, such as
"guest-shutdown".

Fixes: 781f2b3d1e5ef389b44016a897fd55e7a780bf35
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 844bd70b5652f30bbace89499f513e3fbbb6457a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agochardev/tcp: Fix error message double free error
lichun [Sun, 21 Jun 2020 21:30:17 +0000 (05:30 +0800)]
chardev/tcp: Fix error message double free error

Errors are already freed by error_report_err, so we only need to call
error_free when that function is not called.

Cc: qemu-stable@nongnu.org
Signed-off-by: lichun <lichun@ruijie.com.cn>
Message-Id: <20200621213017.17978-1-lichun@ruijie.com.cn>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message improved, cc: qemu-stable]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit ed4e0d2ef140aef255d67eec30767e5fcd949f58)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonbd: Avoid off-by-one in long export name truncation
Eric Blake [Mon, 22 Jun 2020 21:03:55 +0000 (16:03 -0500)]
nbd: Avoid off-by-one in long export name truncation

When snprintf returns the same value as the buffer size, the final
byte was truncated to ensure a NUL terminator.  Fortunately, such long
export names are unusual enough, with no real impact other than what
is displayed to the user.

Fixes: 5c86bdf12089
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200622210355.414941-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit 00d69986da83a74f6f5731c80f8dd09fde95d19a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agousb/dev-mtp: Fix Error double free after inotify failure
Markus Armbruster [Tue, 30 Jun 2020 09:03:31 +0000 (11:03 +0200)]
usb/dev-mtp: Fix Error double free after inotify failure

error_report_err() frees its first argument.  Freeing it again is
wrong.  Don't.

Fixes: 47287c27d0c367a89f7b2851e23a7f8b2d499dd6
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-7-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 562a558647be6fe43e60f8bf3601e5b6122c0599)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoerror: Use error_reportf_err() where appropriate
Markus Armbruster [Tue, 5 May 2020 10:19:03 +0000 (12:19 +0200)]
error: Use error_reportf_err() where appropriate

Replace

    error_report("...: %s", ..., error_get_pretty(err));

by

    error_reportf_err(err, "...: ", ...);

One of the replaced messages lacked a colon.  Add it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200505101908.6207-6-armbru@redhat.com>
(cherry picked from commit 5217f1887a8041c51495fbd5d3f767d96a242000)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonet/virtio: Fix failover_replug_primary() return value regression
Markus Armbruster [Tue, 30 Jun 2020 09:03:26 +0000 (11:03 +0200)]
net/virtio: Fix failover_replug_primary() return value regression

Commit 150ab54aa6 "net/virtio: fix re-plugging of primary device"
fixed failover_replug_primary() to return false on failure.  Commit
5a0948d36c "net/virtio: Fix failover error handling crash bugs" broke
it again for hotplug_handler_plug() failure.  Unbreak it.

Commit 5a0948d36c4cbc1c5534afac6fee99de55245d12

Fixes: 5a0948d36c4cbc1c5534afac6fee99de55245d12
Cc: Jens Freimann <jfreimann@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200630090351.1247703-2-armbru@redhat.com>
(cherry picked from commit ca72efccbe33373810341a0d8a10f5698b8fbc87)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/audio/gus: Fix registers 32-bit access
Allan Peramaki [Thu, 18 Jun 2020 10:36:23 +0000 (12:36 +0200)]
hw/audio/gus: Fix registers 32-bit access

Fix audio on software that accesses DRAM above 64k via register
peek/poke and some cases when more than 16 voices are used.

Cc: qemu-stable@nongnu.org
Fixes: 135f5ae1974c ("audio: GUSsample is int16_t")
Signed-off-by: Allan Peramaki <aperamak@pp1.inet.fi>
Tested-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200618103623.6031-1-philmd@redhat.com
Message-Id: <20200615201757.16868-1-aperamak@pp1.inet.fi>
[PMD: Removed unrelated style changes]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 586803455b3fa44d949ecd42cd9c87e5a6287aef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtiofsd: Whitelist fchmod
Max Reitz [Mon, 8 Jun 2020 09:31:11 +0000 (11:31 +0200)]
virtiofsd: Whitelist fchmod

lo_setattr() invokes fchmod() in a rarely used code path, so it should
be whitelisted or virtiofsd will crash with EBADSYS.

Said code path can be triggered for example as follows:

On the host, in the shared directory, create a file with the sticky bit
set and a security.capability xattr:
(1) # touch foo
(2) # chmod u+s foo
(3) # setcap '' foo

Then in the guest let some process truncate that file after it has
dropped all of its capabilities (at least CAP_FSETID):

int main(int argc, char *argv[])
{
    capng_setpid(getpid());
    capng_clear(CAPNG_SELECT_BOTH);
    capng_updatev(CAPNG_ADD, CAPNG_PERMITTED | CAPNG_EFFECTIVE, 0);
    capng_apply(CAPNG_SELECT_BOTH);

    ftruncate(open(argv[1], O_RDWR), 0);
}

This will cause the guest kernel to drop the sticky bit (i.e. perform a
mode change) as part of the truncate (where FATTR_FH is set), and that
will cause virtiofsd to invoke fchmod() instead of fchmodat().

(A similar configuration exists further below with futimens() vs.
utimensat(), but the former is not a syscall but just a wrapper for the
latter, so no further whitelisting is required.)

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1842667
Reported-by: Qian Cai <caiqian@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200608093111.14942-1-mreitz@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 63659fe74e76f5c5285466f0c5cfbdca65b3688e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/net/e1000e: Do not abort() on invalid PSRCTL register value
Philippe Mathieu-Daudé [Mon, 25 May 2020 12:23:30 +0000 (14:23 +0200)]
hw/net/e1000e: Do not abort() on invalid PSRCTL register value

libFuzzer found using 'qemu-system-i386 -M q35':

qemu: hardware error: e1000e: PSRCTL.BSIZE0 cannot be zero
CPU #0:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000663
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
==1988== ERROR: libFuzzer: deadly signal
    #6 0x7fae4d3ea894 in __GI_abort (/lib64/libc.so.6+0x22894)
    #7 0x563f4cc59a1d in hw_error (qemu-fuzz-i386+0xe8ca1d)
    #8 0x563f4d7c93f2 in e1000e_set_psrctl (qemu-fuzz-i386+0x19fc3f2)
    #9 0x563f4d7b798f in e1000e_core_write (qemu-fuzz-i386+0x19ea98f)
    #10 0x563f4d7afc46 in e1000e_mmio_write (qemu-fuzz-i386+0x19e2c46)
    #11 0x563f4cc9a0a7 in memory_region_write_accessor (qemu-fuzz-i386+0xecd0a7)
    #12 0x563f4cc99c13 in access_with_adjusted_size (qemu-fuzz-i386+0xeccc13)
    #13 0x563f4cc987b4 in memory_region_dispatch_write (qemu-fuzz-i386+0xecb7b4)

It simply sent the following 2 I/O command to the e1000e
PCI BAR #2 I/O region:

  writew 0x0100 0x0c00 # RCTL =   E1000_RCTL_DTYP_MASK
  writeb 0x2170 0x00   # PSRCTL = 0

2813 static void
2814 e1000e_set_psrctl(E1000ECore *core, int index, uint32_t val)
2815 {
2816     if (core->mac[RCTL] & E1000_RCTL_DTYP_MASK) {
2817
2818         if ((val & E1000_PSRCTL_BSIZE0_MASK) == 0) {
2819             hw_error("e1000e: PSRCTL.BSIZE0 cannot be zero");
2820         }

Instead of calling hw_error() which abort the process (it is
meant for CPU fatal error condition, not for device logging),
log the invalid request with qemu_log_mask(LOG_GUEST_ERROR)
and return, ignoring the request.

Cc: qemu-stable@nongnu.org
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fda43b1204aecd1db158b3255c591d227fbdd629)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/display/artist: Unbreak size mismatch memory accesses
Helge Deller [Sat, 8 Aug 2020 20:29:01 +0000 (22:29 +0200)]
hw/display/artist: Unbreak size mismatch memory accesses

Commit 5d971f9e6725 ("memory: Revert "memory: accept mismatching sizes
in memory_region_access_valid") broke the artist driver in a way that
the dtwm window manager on HP-UX rendered wrong.

Fixes: 5d971f9e6725 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit e0cf02ce680f11893aca9642e76d6ae68b9375af)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoacpi: accept byte and word access to core ACPI registers
Michael Tokarev [Mon, 20 Jul 2020 16:06:27 +0000 (19:06 +0300)]
acpi: accept byte and word access to core ACPI registers

All ISA registers should be accessible as bytes, words or dwords
(if wide enough).  Fix the access constraints for acpi-pm-evt,
acpi-pm-tmr & acpi-cnt registers.

Fixes: 5d971f9e67 (memory: Revert "memory: accept mismatching sizes in memory_region_access_valid")
Fixes: afafe4bbe0 (apci: switch cnt to memory api)
Fixes: 77d58b1e47 (apci: switch timer to memory api)
Fixes: b5a7c024d2 (apci: switch evt to memory api)
Buglink: https://lore.kernel.org/xen-devel/20200630170913.123646-1-anthony.perard@citrix.com/T/
Buglink: https://bugs.debian.org/964793
BugLink: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964247
BugLink: https://bugs.launchpad.net/bugs/1886318
Reported-By: Simon John <git@the-jedi.co.uk>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20200720160627.15491-1-mjt@msgid.tls.msk.ru>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit dba04c3488c4699f5afe96f66e448b1d447cf3fb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoxhci: fix valid.max_access_size to access address registers
Laurent Vivier [Tue, 21 Jul 2020 08:33:22 +0000 (10:33 +0200)]
xhci: fix valid.max_access_size to access address registers

QEMU XHCI advertises AC64 (64-bit addressing) but doesn't allow
64-bit mode access in "runtime" and "operational" MemoryRegionOps.

Set the max_access_size based on sizeof(dma_addr_t) as AC64 is set.

XHCI specs:
"If the xHC supports 64-bit addressing (AC64 = ‘1’), then software
should write 64-bit registers using only Qword accesses.  If a
system is incapable of issuing Qword accesses, then writes to the
64-bit address fields shall be performed using 2 Dword accesses;
low Dword-first, high-Dword second.  If the xHC supports 32-bit
addressing (AC64 = ‘0’), then the high Dword of registers containing
64-bit address fields are unused and software should write addresses
using only Dword accesses"

The problem has been detected with SLOF, as linux kernel always accesses
registers using 32-bit access even if AC64 is set and revealed by
5d971f9e6725 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"")

Suggested-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 20200721083322.90651-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8e67fda2dd6202ccec093fda561107ba14830a17)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohw/riscv: Allow 64 bit access to SiFive CLINT
Alistair Francis [Tue, 30 Jun 2020 20:12:11 +0000 (13:12 -0700)]
hw/riscv: Allow 64 bit access to SiFive CLINT

Commit 5d971f9e672507210e77d020d89e0e89165c8fc9
"memory: Revert "memory: accept mismatching sizes in
memory_region_access_valid"" broke most RISC-V boards as they do 64 bit
accesses to the CLINT and QEMU would trigger a fault. Fix this failure
by allowing 8 byte accesses.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei<zhiwei_liu@c-sky.com>
Message-Id: <122b78825b077e4dfd39b444d3a46fe894a7804c.1593547870.git.alistair.francis@wdc.com>
(cherry picked from commit 70b78d4e71494c90d2ccb40381336bc9b9a22f79)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agomemory: Revert "memory: accept mismatching sizes in memory_region_access_valid"
Michael S. Tsirkin [Wed, 10 Jun 2020 13:47:49 +0000 (09:47 -0400)]
memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"

Memory API documentation documents valid .min_access_size and .max_access_size
fields and explains that any access outside these boundaries is blocked.

This is what devices seem to assume.

However this is not what the implementation does: it simply
ignores the boundaries unless there's an "accepts" callback.

Naturally, this breaks a bunch of devices.

Revert to the documented behaviour.

Devices that want to allow any access can just drop the valid field,
or add the impl field to have accesses converted to appropriate
length.

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fixes: CVE-2020-13754
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1842363
Fixes: a014ed07bd5a ("memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200610134731.1514409-1-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5d971f9e672507210e77d020d89e0e89165c8fc9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agolibqos: pci-pc: use 32-bit write for EJ register
Paolo Bonzini [Tue, 23 Jun 2020 16:17:59 +0000 (12:17 -0400)]
libqos: pci-pc: use 32-bit write for EJ register

The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4b7c06837ae0b1ff56473202a42e7e386f53d6db)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agolibqos: usb-hcd-ehci: use 32-bit write for config register
Paolo Bonzini [Tue, 23 Jun 2020 16:18:24 +0000 (12:18 -0400)]
libqos: usb-hcd-ehci: use 32-bit write for config register

The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 89ed83d8b23c11d250c290593cad3ca839d5b053)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agolinux-user/strace.list: fix epoll_create{,1} -strace output
Sergei Trofimovich [Thu, 16 Apr 2020 17:59:57 +0000 (18:59 +0100)]
linux-user/strace.list: fix epoll_create{,1} -strace output

Fix syscall name and parameters priinter.

Before the change:

```
$ alpha-linux-user/qemu-alpha -strace -L /usr/alpha-unknown-linux-gnu/ /tmp/a
...
1274697 %s(%d)(2097152,274903156744,274903156760,274905840712,274877908880,274903235616) = 3
1274697 exit_group(0)
```

After the change:

```
$ alpha-linux-user/qemu-alpha -strace -L /usr/alpha-unknown-linux-gnu/ /tmp/a
...
1273719 epoll_create1(2097152) = 3
1273719 exit_group(0)
```

Fixes: 9cbc0578cb6 ("Improve output of various syscalls")
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Riku Voipio <riku.voipio@iki.fi>
CC: Laurent Vivier <laurent@vivier.eu>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200416175957.1274882-1-slyfox@gentoo.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit fd568660b7ae9b9e45cbb616acc91ae4c065c32d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoaio-posix: disable fdmon-io_uring when GSource is used
Stefan Hajnoczi [Mon, 11 May 2020 18:36:30 +0000 (19:36 +0100)]
aio-posix: disable fdmon-io_uring when GSource is used

The glib event loop does not call fdmon_io_uring_wait() so fd handlers
waiting to be submitted build up in the list. There is no benefit is
using io_uring when the glib GSource is being used, so disable it
instead of implementing a more complex fix.

This fixes a memory leak where AioHandlers would build up and increasing
amounts of CPU time were spent iterating them in aio_pending(). The
symptom is that guests become slow when QEMU is built with io_uring
support.

Buglink: https://bugs.launchpad.net/qemu/+bug/1877716
Fixes: 73fd282e7b6dd4e4ea1c3bbb3d302c8db51e4ccf ("aio-posix: add io_uring fd monitoring implementation")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Message-id: 20200511183630.279750-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ba607ca8bff4d2c2062902f8355657c865ac7c29)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoaio-posix: don't duplicate fd handler deletion in fdmon_io_uring_destroy()
Stefan Hajnoczi [Mon, 11 May 2020 18:36:29 +0000 (19:36 +0100)]
aio-posix: don't duplicate fd handler deletion in fdmon_io_uring_destroy()

The io_uring file descriptor monitoring implementation has an internal
list of fd handlers that are pending submission to io_uring.
fdmon_io_uring_destroy() deletes all fd handlers on the list.

Don't delete fd handlers directly in fdmon_io_uring_destroy() for two
reasons:
1. This duplicates the aio-posix.c AioHandler deletion code and could
   become outdated if the struct changes.
2. Only handlers with the FDMON_IO_URING_REMOVE flag set are safe to
   remove. If the flag is not set then something still has a pointer to
   the fd handler. Let aio-posix.c and its user worry about that. In
   practice this isn't an issue because fdmon_io_uring_destroy() is only
   called when shutting down so all users have removed their fd
   handlers, but the next patch will need this!

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Message-id: 20200511183630.279750-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit de137e44f75d9868f5b548638081850f6ac771f2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoKVM: x86: believe what KVM says about WAITPKG
Paolo Bonzini [Tue, 30 Jun 2020 13:49:27 +0000 (09:49 -0400)]
KVM: x86: believe what KVM says about WAITPKG

Currently, QEMU is overriding KVM_GET_SUPPORTED_CPUID's answer for
the WAITPKG bit depending on the "-overcommit cpu-pm" setting.  This is a
bad idea because it does not even check if the host supports it, but it
can be done in x86_cpu_realizefn just like we do for the MONITOR bit.

This patch moves it there, while making it conditional on host
support for the related UMWAIT MSR.

Cc: qemu-stable@nongnu.org
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e1e43813e7908b063938a3d01f172f88f6190c80)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonet: use peer when purging queue in qemu_flush_or_purge_queue_packets()
Jason Wang [Mon, 11 May 2020 04:04:53 +0000 (12:04 +0800)]
net: use peer when purging queue in qemu_flush_or_purge_queue_packets()

The sender of packet will be checked in the qemu_net_queue_purge() but
we use NetClientState not its peer when trying to purge the incoming
queue in qemu_flush_or_purge_packets(). This will trigger the assert
in virtio_net_reset since we can't pass the sender check:

hw/net/virtio-net.c:533: void virtio_net_reset(VirtIODevice *): Assertion
`!virtio_net_get_subqueue(nc)->async_tx.elem' failed.
#9 0x55a33fa31b78 in virtio_net_reset hw/net/virtio-net.c:533:13
#10 0x55a33fc88412 in virtio_reset hw/virtio/virtio.c:1919:9
#11 0x55a341d82764 in virtio_bus_reset hw/virtio/virtio-bus.c:95:9
#12 0x55a341dba2de in virtio_pci_reset hw/virtio/virtio-pci.c:1824:5
#13 0x55a341db3e02 in virtio_pci_common_write hw/virtio/virtio-pci.c:1252:13
#14 0x55a33f62117b in memory_region_write_accessor memory.c:496:5
#15 0x55a33f6205e4 in access_with_adjusted_size memory.c:557:18
#16 0x55a33f61e177 in memory_region_dispatch_write memory.c:1488:16

Reproducer:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg701914.html

Fix by using the peer.

Reported-by: "Alexander Bulekov" <alxndr@bu.edu>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: ca77d85e1dbf9 ("net: complete all queued packets on VM stop")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 5fe19fb81839ea42b592b409f725349cf3c73551)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtiofsd: stay below fs.file-max sysctl value (CVE-2020-10717)
Stefan Hajnoczi [Fri, 1 May 2020 14:06:44 +0000 (15:06 +0100)]
virtiofsd: stay below fs.file-max sysctl value (CVE-2020-10717)

The system-wide fs.file-max sysctl value determines how many files can
be open.  It defaults to a value calculated based on the machine's RAM
size.  Previously virtiofsd would try to set RLIMIT_NOFILE to 1,000,000
and this allowed the FUSE client to exhaust the number of open files
system-wide on Linux hosts with less than 10 GB of RAM!

Take fs.file-max into account when choosing the default RLIMIT_NOFILE
value.

Fixes: CVE-2020-10717
Reported-by: Yuval Avrahami <yavrahami@paloaltonetworks.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200501140644.220940-3-stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 8c1d353d107b4fc344e27f2f08ea7fa25de2eea2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtiofsd: add --rlimit-nofile=NUM option
Stefan Hajnoczi [Fri, 1 May 2020 14:06:43 +0000 (15:06 +0100)]
virtiofsd: add --rlimit-nofile=NUM option

Make it possible to specify the RLIMIT_NOFILE on the command-line.
Users running multiple virtiofsd processes should allocate a certain
number to each process so that the system-wide limit can never be
exhausted.

When this option is set to 0 the rlimit is left at its current value.
This is useful when a management tool wants to configure the rlimit
itself.

The default behavior remains unchanged: try to set the limit to
1,000,000 file descriptors if the current rlimit is lower.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200501140644.220940-2-stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 6dbb716877728ce4eb51619885ef6ef4ada9565f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoiotests/283: Use consistent size for source and target
Kevin Wolf [Thu, 30 Apr 2020 14:27:52 +0000 (16:27 +0200)]
iotests/283: Use consistent size for source and target

The test case forgot to specify the null-co size for the target node.
When adding a check to backup that both sizes match, this would fail
because of the size mismatch and not the behaviour that the test really
wanted to test.

Fixes: a541fcc27c98b96da187c7d4573f3270f3ddd283
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200430142755.315494-2-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 813cc2545b82409fd504509f0ba2e96fab6edb9e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoFix tulip breakage
Helge Deller [Sun, 26 Apr 2020 10:55:39 +0000 (12:55 +0200)]
Fix tulip breakage

The tulip network driver in a qemu-system-hppa emulation is broken in
the sense that bigger network packages aren't received any longer and
thus even running e.g. "apt update" inside the VM fails.

The breakage was introduced by commit 8ffb7265af ("check frame size and
r/w data length") which added checks to prevent accesses outside of the
rx/tx buffers.

But the new checks were implemented wrong. The variable rx_frame_len
counts backwards, from rx_frame_size down to zero, and the variable len
is never bigger than rx_frame_len, so accesses just can't happen and the
checks are unnecessary.
On the contrary the checks now prevented bigger packages to be moved
into the rx buffers.

This patch reverts the wrong checks and were sucessfully tested with a
qemu-system-hppa emulation.

Fixes: 8ffb7265af ("check frame size and r/w data length")
Buglink: https://bugs.launchpad.net/bugs/1874539
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit d9b69640391618045949f7c500b87fc129f862ed)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoes1370: check total frame count against current frame
Prasad J Pandit [Thu, 14 May 2020 20:06:08 +0000 (01:36 +0530)]
es1370: check total frame count against current frame

A guest user may set channel frame count via es1370_write()
such that, in es1370_transfer_audio(), total frame count
'size' is lesser than the number of frames that are processed
'cnt'.

    int cnt = d->frame_cnt >> 16;
    int size = d->frame_cnt & 0xffff;

if (size < cnt), it results in incorrect calculations leading
to OOB access issue(s). Add check to avoid it.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 369ff955a8497988d079c4e3fa1e93c2570c1c69)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoati-vga: check mm_index before recursive call (CVE-2020-13800)
Prasad J Pandit [Thu, 4 Jun 2020 09:08:30 +0000 (14:38 +0530)]
ati-vga: check mm_index before recursive call (CVE-2020-13800)

While accessing VGA registers via ati_mm_read/write routines,
a guest may set 's->regs.mm_index' such that it leads to infinite
recursion. Check mm_index value to avoid such recursion. Log an
error message for wrong values.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Yi Ren <c4tren@gmail.com>
Message-id: 20200604090830.33885-1-ppandit@redhat.com
Suggested-by: BALATON Zoltan <balaton@eik.bme.hu>
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit a98610c429d52db0937c1e48659428929835c455)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoxen/9pfs: yield when there isn't enough room on the ring
Stefano Stabellini [Thu, 21 May 2020 19:26:26 +0000 (12:26 -0700)]
xen/9pfs: yield when there isn't enough room on the ring

Instead of truncating replies, which is problematic, wait until the
client reads more data and frees bytes on the reply ring.

Do that by calling qemu_coroutine_yield(). The corresponding
qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon
receiving the next notification from the client.

We need to be careful to avoid races in case xen_9pfs_bh and the
coroutine are both active at the same time. In xen_9pfs_bh, wait until
either the critical section is over (ring->co == NULL) or until the
coroutine becomes inactive (qemu_coroutine_yield() was called) before
continuing. Then, simply wake up the coroutine if it is inactive.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20200521192627.15259-2-sstabellini@kernel.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit a4c4d462729466c4756bac8a0a8d77eb63b21ef7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoRevert "9p: init_in_iov_from_pdu can truncate the size"
Stefano Stabellini [Thu, 21 May 2020 19:26:25 +0000 (12:26 -0700)]
Revert "9p: init_in_iov_from_pdu can truncate the size"

This reverts commit 16724a173049ac29c7b5ade741da93a0f46edff7.
It causes https://bugs.launchpad.net/bugs/1877688.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20200521192627.15259-1-sstabellini@kernel.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit cf45183b718f02b1369e18c795dc51bc1821245d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoxen-9pfs: Fix log messages of reply errors
Christian Schoenebeck [Thu, 14 May 2020 06:06:43 +0000 (08:06 +0200)]
xen-9pfs: Fix log messages of reply errors

If delivery of some 9pfs response fails for some reason, log the
error message by mentioning the 9P protocol reply type, not by
client's request type. The latter could be misleading that the
error occurred already when handling the request input.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <ad0e5a9b6abde52502aa40b30661d29aebe1590a.1589132512.git.qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 9bbb7e0fe081efff2e41f8517c256b72a284fe9b)
*prereq for cf45183b718
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years ago9pfs: include linux/limits.h for XATTR_SIZE_MAX
Dan Robertson [Mon, 25 May 2020 08:38:03 +0000 (10:38 +0200)]
9pfs: include linux/limits.h for XATTR_SIZE_MAX

linux/limits.h should be included for the XATTR_SIZE_MAX definition used
by v9fs_xattrcreate.

Fixes: 3b79ef2cf488 ("9pfs: limit xattr size in xattrcreate")
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20200515203015.7090-2-dan@dlrobertson.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 03556ea920b23c466ce7c1283199033de33ee671)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years ago9pfs: local: ignore O_NOATIME if we don't have permissions
Omar Sandoval [Thu, 14 May 2020 06:06:43 +0000 (08:06 +0200)]
9pfs: local: ignore O_NOATIME if we don't have permissions

QEMU's local 9pfs server passes through O_NOATIME from the client. If
the QEMU process doesn't have permissions to use O_NOATIME (namely, it
does not own the file nor have the CAP_FOWNER capability), the open will
fail. This causes issues when from the client's point of view, it
believes it has permissions to use O_NOATIME (e.g., a process running as
root in the virtual machine). Additionally, overlayfs on Linux opens
files on the lower layer using O_NOATIME, so in this case a 9pfs mount
can't be used as a lower layer for overlayfs (cf.
https://github.com/osandov/drgn/blob/dabfe1971951701da13863dbe6d8a1d172ad9650/vmtest/onoatimehack.c
and https://github.com/NixOS/nixpkgs/issues/54509).

Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g.,
network filesystems. open(2) notes that O_NOATIME "may not be effective
on all filesystems. One example is NFS, where the server maintains the
access time." This means that we can honor it when possible but fall
back to ignoring it.

Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Message-Id: <e9bee604e8df528584693a4ec474ded6295ce8ad.1587149256.git.osandov@fb.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit a5804fcf7b22fc7d1f9ec794dd284c7d504bd16b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoblock: Call attention to truncation of long NBD exports
Eric Blake [Mon, 8 Jun 2020 18:26:38 +0000 (13:26 -0500)]
block: Call attention to truncation of long NBD exports

Commit 93676c88 relaxed our NBD client code to request export names up
to the NBD protocol maximum of 4096 bytes without NUL terminator, even
though the block layer can't store anything longer than 4096 bytes
including NUL terminator for display to the user.  Since this means
there are some export names where we have to truncate things, we can
at least try to make the truncation a bit more obvious for the user.
Note that in spite of the truncated display name, we can still
communicate with an NBD server using such a long export name; this was
deemed nicer than refusing to even connect to such a server (since the
server may not be under our control, and since determining our actual
length limits gets tricky when nbd://host:port/export and
nbd+unix:///export?socket=/path are themselves variable-length
expansions beyond the export name but count towards the block layer
name length).

Reported-by: Xueqiang Wei <xuwei@redhat.com>
Fixes: https://bugzilla.redhat.com/1843684
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200610163741.3745251-3-eblake@redhat.com>
(cherry picked from commit 5c86bdf1208916ece0b87e1151c9b48ee54faa3e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: unref the iothread when unrealizing
David Hildenbrand [Wed, 20 May 2020 10:04:39 +0000 (12:04 +0200)]
virtio-balloon: unref the iothread when unrealizing

We took a reference when realizing, so let's drop that reference when
unrealizing.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 105aef9c9479786d27c1c45c9b0b1fa03dc46be3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: fix free page hinting check on unrealize
David Hildenbrand [Wed, 20 May 2020 10:04:38 +0000 (12:04 +0200)]
virtio-balloon: fix free page hinting check on unrealize

Checking against guest features is wrong. We allocated data structures
based on host features. We can rely on "free_page_bh" as an indicator
whether to un-do stuff instead.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 49b01711b8eb3796c6904c7f85d2431572cfe54f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agovirtio-balloon: fix free page hinting without an iothread
David Hildenbrand [Wed, 20 May 2020 10:04:37 +0000 (12:04 +0200)]
virtio-balloon: fix free page hinting without an iothread

In case we don't have an iothread, we mark the feature as abscent but
still add the queue. 'free_page_bh' remains set to NULL.

qemu-system-i386 \
        -M microvm \
        -nographic \
        -device virtio-balloon-device,free-page-hint=true \
        -nographic \
        -display none \
        -monitor none \
        -serial none \
        -qtest stdio

Doing a "write 0xc0000e30 0x24
0x030000000300000003000000030000000300000003000000030000000300000003000000"

We will trigger a SEGFAULT. Let's move the check and bail out.

While at it, move the static initializations to instance_init().
free_page_report_status and block_iothread are implicitly set to the
right values (0/false) already, so drop the initialization.

Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 12fc8903a8ee09fb5f642de82699a0b211e1b5a7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonbd/server: Avoid long error message assertions CVE-2020-10761
Eric Blake [Mon, 8 Jun 2020 18:26:37 +0000 (13:26 -0500)]
nbd/server: Avoid long error message assertions CVE-2020-10761

Ever since commit 36683283 (v2.8), the server code asserts that error
strings sent to the client are well-formed per the protocol by not
exceeding the maximum string length of 4096.  At the time the server
first started sending error messages, the assertion could not be
triggered, because messages were completely under our control.
However, over the years, we have added latent scenarios where a client
could trigger the server to attempt an error message that would
include the client's information if it passed other checks first:

- requesting NBD_OPT_INFO/GO on an export name that is not present
  (commit 0cfae925 in v2.12 echoes the name)

- requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
  not present (commit e7b1948d in v2.12 echoes the name)

At the time, those were still safe because we flagged names larger
than 256 bytes with a different message; but that changed in commit
93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
string limit.  (That commit also failed to change the magic number
4096 in nbd_negotiate_send_rep_err to the just-introduced named
constant.)  So with that commit, long client names appended to server
text can now trigger the assertion, and thus be used as a denial of
service attack against a server.  As a mitigating factor, if the
server requires TLS, the client cannot trigger the problematic paths
unless it first supplies TLS credentials, and such trusted clients are
less likely to try to intentionally crash the server.

We may later want to further sanitize the user-supplied strings we
place into our error messages, such as scrubbing out control
characters, but that is less important to the CVE fix, so it can be a
later patch to the new nbd_sanitize_name.

Consideration was given to changing the assertion in
nbd_negotiate_send_rep_verr to instead merely log a server error and
truncate the message, to avoid leaving a latent path that could
trigger a future CVE DoS on any new error message.  However, this
merely complicates the code for something that is already (correctly)
flagging coding errors, and now that we are aware of the long message
pitfall, we are less likely to introduce such errors in the future,
which would make such error handling dead code.

Reported-by: Xueqiang Wei <xuwei@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761
Fixes: 93676c88d7
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200610163741.3745251-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit 5c4fe018c025740fef4a0a4421e8162db0c3eefd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agonet: Do not include a newline in the id of -nic devices
Thomas Huth [Mon, 18 May 2020 07:43:52 +0000 (09:43 +0200)]
net: Do not include a newline in the id of -nic devices

The '\n' sneaked in by accident here, an "id" string should really
not contain a newline character at the end.

Fixes: 78cd6f7bf6b ('net: Add a new convenience option "--nic" ...')
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200518074352.23125-1-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 0561dfac082becdd9e89110249a27b309b62aa9f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years ago9p: Lock directory streams with a CoMutex
Greg Kurz [Mon, 25 May 2020 08:38:03 +0000 (10:38 +0200)]
9p: Lock directory streams with a CoMutex

Locking was introduced in QEMU 2.7 to address the deprecation of
readdir_r(3) in glibc 2.24. It turns out that the frontend code is
the worst place to handle a critical section with a pthread mutex:
the code runs in a coroutine on behalf of the QEMU mainloop and then
yields control, waiting for the fsdev backend to process the request
in a worker thread. If the client resends another readdir request for
the same fid before the previous one finally unlocked the mutex, we're
deadlocked.

This never bit us because the linux client serializes readdir requests
for the same fid, but it is quite easy to demonstrate with a custom
client.

A good solution could be to narrow the critical section in the worker
thread code and to return a copy of the dirent to the frontend, but
this causes quite some changes in both 9p.c and codir.c. So, instead
of that, in order for people to easily backport the fix to older QEMU
versions, let's simply use a CoMutex since all the users for this
sit in coroutines.

Fixes: 7cde47d4a89d ("9p: add locking to V9fsDir")
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <158981894794.109297.3530035833368944254.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ed463454efd0ac3042ff772bfe1b1d846dc281a5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoqemu-nbd: Close inherited stderr
Raphael Pour [Fri, 15 May 2020 06:36:07 +0000 (08:36 +0200)]
qemu-nbd: Close inherited stderr

Close inherited stderr of the parent if fork_process is false.
Otherwise no one will close it. (introduced by e6df58a5)

This only affected 'qemu-nbd -c /dev/nbd0'.

Signed-off-by: Raphael Pour <raphael.pour@hetzner.com>
Message-Id: <d8ddc993-9816-836e-a3de-c6edab9d9c49@hetzner.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: Enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 0eaf453ebf6788885fbb5d40426b154ef8805407)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agotarget/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
Richard Henderson [Wed, 13 May 2020 16:32:43 +0000 (09:32 -0700)]
target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*

Must clear the tail for AdvSIMD when SVE is enabled.

Fixes: ca40a6e6e39
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 525d9b6d42844e187211d25b69be8b378785bc24)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agohostmem: don't use mbind() if host-nodes is empty
Igor Mammedov [Thu, 30 Apr 2020 15:46:06 +0000 (11:46 -0400)]
hostmem: don't use mbind() if host-nodes is empty

Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
The backend however calls mbind() which is typically NOP
in case of default policy/absent host-nodes bitmap.
However when runing in container with black-listed mbind()
syscall, QEMU fails to start with error
 "cannot bind memory to host NUMA nodes: Operation not permitted"
even when user hasn't provided host-nodes to pin to explictly
(which is the case with -m option)

To fix issue, call mbind() only in case when user has provided
host-nodes explicitly (i.e. host_nodes bitmap is not empty).
That should allow to run QEMU in containers with black-listed
mbind() without memory pinning. If QEMU provided memory-pinning
is required user still has to white-list mbind() in container
configuration.

Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200430154606.6421-1-imammedo@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 70b6d525dfb51d5e523d568d1139fc051bc223c5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
4 years agoxen: Actually fix build without passthrough qemu-xen-4.14.0
Anthony PERARD [Fri, 19 Jun 2020 10:31:15 +0000 (11:31 +0100)]
xen: Actually fix build without passthrough

Fix typo.

Fixes: acd0c9416d48 ("xen: fix build without pci passthrough")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20200619103115.254127-1-anthony.perard@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b00de3a51fc8e86d1678c57e65d57aa9ed052422)

4 years agoxen: fix build without pci passthrough
Anthony PERARD [Wed, 3 Jun 2020 16:04:42 +0000 (17:04 +0100)]
xen: fix build without pci passthrough

Xen PCI passthrough support may not be available and thus the global
variable "has_igd_gfx_passthru" might be compiled out. Common code
should not access it in that case.

Unfortunately, we can't use CONFIG_XEN_PCI_PASSTHROUGH directly in
xen-common.c so this patch instead move access to the
has_igd_gfx_passthru variable via function and those functions are
also implemented as stubs. The stubs will be used when QEMU is built
without passthrough support.

Now, when one will want to enable igd-passthru via the -machine
property, they will get an error message if QEMU is built without
passthrough support.

Fixes: 46472d82322d0 ('xen: convert "-machine igd-passthru" to an accelerator property')
Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20200603160442.3151170-1-anthony.perard@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit acd0c9416d4846afc541605ee0e75ca163773e6c)
[fix conflict in hw/xen/Makefile.objs]
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
5 years agomain loop: Big hammer to fix logfile disk DoS in Xen setups
Ian Jackson [Thu, 26 May 2016 15:21:56 +0000 (16:21 +0100)]
main loop: Big hammer to fix logfile disk DoS in Xen setups

Each time round the main loop, we now fstat stderr.  If it is too big,
we dup2 /dev/null onto it.  This is not a very pretty patch but it is
very simple, easy to see that it's correct, and has a low risk of
collateral damage.

There is no limit by default but can be adjusted by setting a new
environment variable.

This fixes CVE-2014-3672.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Tested-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Set the default to 0 so that it won't affect non-xen installation. The
limit will be set by Xen toolstack.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 44a072f0de0d57c95c2212bbce02888832b7b74f)
(cherry picked from commit 269381bb635692856aa8789a3f322e543e0c648d)

5 years agoMerge tag 'v5.0.0' into 'staging'
Anthony PERARD [Wed, 29 Apr 2020 09:19:37 +0000 (10:19 +0100)]
Merge tag 'v5.0.0' into 'staging'

5 years agoUpdate version for v5.0.0 release
Peter Maydell [Tue, 28 Apr 2020 16:46:57 +0000 (17:46 +0100)]
Update version for v5.0.0 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agoUpdate version for v5.0.0-rc4 release
Peter Maydell [Wed, 22 Apr 2020 16:51:35 +0000 (17:51 +0100)]
Update version for v5.0.0-rc4 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agotarget/arm: Fix ID_MMFR4 value on AArch64 'max' CPU
Peter Maydell [Wed, 22 Apr 2020 12:45:01 +0000 (13:45 +0100)]
target/arm: Fix ID_MMFR4 value on AArch64 'max' CPU

In commit 41a4bf1feab098da4cd the added code to set the CNP
field in ID_MMFR4 for the AArch64 'max' CPU had a typo
where it used the wrong variable name, resulting in ID_MMFR4
fields AC2, XNX and LSM being wrong. Fix the typo.

Fixes: 41a4bf1feab098da4cd
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20200422124501.28015-1-peter.maydell@linaro.org

5 years agoslirp: update to fix CVE-2020-1983
Marc-André Lureau [Tue, 21 Apr 2020 17:02:27 +0000 (19:02 +0200)]
slirp: update to fix CVE-2020-1983

This is an update on the stable-4.2 branch of libslirp.git:

git shortlog 55ab21c9a3..2faae0f778f81

Marc-André Lureau (1):
      Fix use-afte-free in ip_reass() (CVE-2020-1983)

CVE-2020-1983 is actually a follow up fix for commit
126c04acbabd7ad32c2b018fe10dfac2a3bc1210 ("Fix heap overflow in
ip_reass on big packet input") which was was included in qemu
v4.1 (commit e1a4a24d262ba5ac74ea1795adb3ab1cd574c7fb "slirp: update
with CVE-2019-14378 fix").

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200421170227.843555-1-marcandre.lureau@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agotarget/ppc: Fix TCG temporary leaks in gen_slbia()
Philippe Mathieu-Daudé [Fri, 17 Apr 2020 09:07:49 +0000 (11:07 +0200)]
target/ppc: Fix TCG temporary leaks in gen_slbia()

This fixes:

  $ qemu-system-ppc64 \
  -machine pseries-4.1 -cpu power9 \
  -smp 4 -m 12G -accel tcg ...
  ...
  Quiescing Open Firmware ...
  Booting Linux via __start() @ 0x0000000002000000 ...
  Opcode 1f 12 0f 00 (7ce003e4) leaked temporaries
  Opcode 1f 12 0f 00 (7ce003e4) leaked temporaries
  Opcode 1f 12 0f 00 (7ce003e4) leaked temporaries

[*] https://www.mail-archive.com/qemu-discuss@nongnu.org/msg05400.html

Fixes: 0418bf78fe8 ("Fix ISA v3.0 (POWER9) slbia implementation")
Reported-by: Dennis Clarke <dclarke@blastwave.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20200417090749.14310-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agoMerge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.0-20200417' into staging
Peter Maydell [Mon, 20 Apr 2020 18:57:18 +0000 (19:57 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.0-20200417' into staging

ppc patch queue for 2020-04-17

Here are a few late bugfixes for qemu-5.0 in the ppc target code.
Unless some really nasty last minute bug shows up, I expect this to be
the last ppc pull request for qemu-5.0.

# gpg: Signature made Fri 17 Apr 2020 06:02:13 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-5.0-20200417:
  target/ppc: Fix mtmsr(d) L=1 variant that loses interrupts
  target/ppc: Fix wrong interpretation of the disposition flag.
  linux-user/ppc: Fix padding in mcontext_t for ppc64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agoMerge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request...
Peter Maydell [Mon, 20 Apr 2020 13:43:10 +0000 (14:43 +0100)]
Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging

Fix epoll_create1() for qemu-alpha

# gpg: Signature made Thu 16 Apr 2020 16:28:15 BST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-5.0-pull-request:
  linux-user/syscall.c: add target-to-host mapping for epoll_create1()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agoblock/iscsi:fix heap-buffer-overflow in iscsi_aio_ioctl_cb
Chen Qun [Sat, 18 Apr 2020 06:26:02 +0000 (14:26 +0800)]
block/iscsi:fix heap-buffer-overflow in iscsi_aio_ioctl_cb

There is an overflow, the source 'datain.data[2]' is 100 bytes,
 but the 'ss' is 252 bytes.This may cause a security issue because
 we can access a lot of unrelated memory data.

The len for sbp copy data should take the minimum of mx_sb_len and
 sb_len_wr, not the maximum.

If we use iscsi device for VM backend storage, ASAN show stack:

READ of size 252 at 0xfffd149dcfc4 thread T0
    #0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34)
    #1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9
    #2 0xfffd1af0e2dc  (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc)
    #3 0xfffd1af0d174  (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
    #4 0xfffd1af19fac  (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
    #5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
    #6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
    #7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
    #8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
    #9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
    #10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
    #11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
    #12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
    #13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
    #14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
    #15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
    #16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
    #17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)

0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4)
allocated by thread T0 here:
    #0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70)
    #1 0xfffd1af0e254  (/usr/lib64/iscsi/libiscsi.so.8+0xe254)
    #2 0xfffd1af0d174  (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
    #3 0xfffd1af19fac  (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
    #4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
    #5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
    #6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
    #7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
    #8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
    #9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
    #10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
    #11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
    #12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
    #13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
    #14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
    #15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
    #16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 years agotarget/ppc: Fix mtmsr(d) L=1 variant that loses interrupts
Nicholas Piggin [Tue, 14 Apr 2020 11:11:31 +0000 (21:11 +1000)]
target/ppc: Fix mtmsr(d) L=1 variant that loses interrupts

If mtmsr L=1 sets MSR[EE] while there is a maskable exception pending,
it does not cause an interrupt. This causes the test case to hang:

https://lists.gnu.org/archive/html/qemu-ppc/2019-10/msg00826.html

More recently, Linux reduced the occurance of operations (e.g., rfi)
which stop translation and allow pending interrupts to be processed.
This started causing hangs in Linux boot in long-running kernel tests,
running with '-d int' shows the decrementer stops firing despite DEC
wrapping and MSR[EE]=1.

https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-April/208301.html

The cause is the broken mtmsr L=1 behaviour, which is contrary to the
architecture. From Power ISA v3.0B, p.977, Move To Machine State Register,
Programming Note states:

    If MSR[EE]=0 and an External, Decrementer, or Performance Monitor
    exception is pending, executing an mtmsrd instruction that sets
    MSR[EE] to 1 will cause the interrupt to occur before the next
    instruction is executed, if no higher priority exception exists

Fix this by handling L=1 exactly the same way as L=0, modulo the MSR
bits altered.

The confusion arises from L=0 being "context synchronizing" whereas L=1
is "execution synchronizing", which is a weaker semantic. However this
is not a relaxation of the requirement that these exceptions cause
interrupts when MSR[EE]=1 (e.g., when mtmsr executes to completion as
TCG is doing here), rather it specifies how a pipelined processor can
have multiple instructions in flight where one may influence how another
behaves.

Cc: qemu-stable@nongnu.org
Reported-by: Anton Blanchard <anton@ozlabs.org>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200414111131.465560-1-npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
5 years agotarget/ppc: Fix wrong interpretation of the disposition flag.
Ganesh Goudar [Wed, 8 Apr 2020 17:09:44 +0000 (22:39 +0530)]
target/ppc: Fix wrong interpretation of the disposition flag.

Bitwise AND with kvm_run->flags to evaluate if we recovered from
MCE or not is not correct, As disposition in kvm_run->flags is a
two-bit integer value and not a bit map, So check for equality
instead of bitwise AND.

Without the fix qemu treats any unrecoverable mce error as recoverable
and ends up in a mce loop inside the guest, Below are the MCE logs before
and after the fix.

Before fix:

[   66.775757] MCE: CPU0: Initiator CPU
[   66.775891] MCE: CPU0: Unknown
[   66.776587] MCE: CPU0: machine check (Harmless) Host UE Indeterminate [Recovered]
[   66.776857] MCE: CPU0: NIP: [c0080000000e00b8] mcetest_tlbie+0xb0/0x128 [mcetest_tlbie]

After fix:

[ 20.650577] CPU: 0 PID: 1415 Comm: insmod Tainted: G M O 5.6.0-fwnmi-arv+ #11
[ 20.650618] NIP: c0080000023a00e8 LR: c0080000023a00d8 CTR: c000000000021fe0
[ 20.650660] REGS: c0000001fffd3d70 TRAP: 0200 Tainted: G M O (5.6.0-fwnmi-arv+)
[ 20.650708] MSR: 8000000002a0b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42000222 XER: 20040000
[ 20.650758] CFAR: c00000000000b940 DAR: c0080000025e00e0 DSISR: 00000200 IRQMASK: 0
[ 20.650758] GPR00: c0080000023a00d8 c0000001fddd79a0 c0080000023a8500 0000000000000039
[ 20.650758] GPR04: 0000000000000001 0000000000000000 0000000000000000 0000000000000007
[ 20.650758] GPR08: 0000000000000007 c0080000025e00e0 0000000000000000 00000000000000f7
[ 20.650758] GPR12: 0000000000000000 c000000001900000 c00000000101f398 c0080000025c052f
[ 20.650758] GPR16: 00000000000003a8 c0080000025c0000 c0000001fddd7d70 c0000000015b7940
[ 20.650758] GPR20: 000000000000fff1 c000000000f72c28 c0080000025a0988 0000000000000000
[ 20.650758] GPR24: 0000000000000100 c0080000023a05d0 c0000000001f1d70 0000000000000000
[ 20.650758] GPR28: c0000001fde20000 c0000001fd02b2e0 c0080000023a0000 c0080000025e0000
[ 20.651178] NIP [c0080000023a00e8] mcetest_tlbie+0xe8/0xf0 [mcetest_tlbie]
[ 20.651220] LR [c0080000023a00d8] mcetest_tlbie+0xd8/0xf0 [mcetest_tlbie]
[ 20.651262] Call Trace:
[ 20.651280] [c0000001fddd79a0] [c0080000023a00d8] mcetest_tlbie+0xd8/0xf0 [mcetest_tlbie] (unreliable)
[ 20.651340] [c0000001fddd7a10] [c00000000001091c] do_one_initcall+0x6c/0x2c0
[ 20.651390] [c0000001fddd7af0] [c0000000001f7998] do_init_module+0x90/0x298
[ 20.651433] [c0000001fddd7b80] [c0000000001f61a8] load_module+0x1f58/0x27a0
[ 20.651476] [c0000001fddd7d40] [c0000000001f6c70] __do_sys_finit_module+0xe0/0x100
[ 20.651526] [c0000001fddd7e20] [c00000000000b9d0] system_call+0x5c/0x68
[ 20.651567] Instruction dump:
[ 20.651594] e8410018 3c620000 e8638020 480000cd e8410018 3c620000 e8638028 480000bd
[ 20.651646] e8410018 7be904e4 39400000 612900e0 <7d434a644bffff74 3c4c0001 38428410
[ 20.651699] ---[ end trace 4c40897f016b4340 ]---
[ 20.653310]
Bus error
[ 20.655575] MCE: CPU0: machine check (Harmless) Host UE Indeterminate [Not recovered]
[ 20.655575] MCE: CPU0: NIP: [c0080000023a00e8] mcetest_tlbie+0xe8/0xf0 [mcetest_tlbie]
[ 20.655576] MCE: CPU0: Initiator CPU
[ 20.655576] MCE: CPU0: Unknown

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Message-Id: <20200408170944.16003-1-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
5 years agolinux-user/ppc: Fix padding in mcontext_t for ppc64
Richard Henderson [Tue, 7 Apr 2020 03:21:05 +0000 (20:21 -0700)]
linux-user/ppc: Fix padding in mcontext_t for ppc64

The padding that was added in 95cda4c44ee was added to a union,
and so it had no effect.  This fixes misalignment errors detected
by clang sanitizers for ppc64 and ppc64le.

In addition, only ppc64 allocates space for VSX registers, so do
not save them for ppc32.  The kernel only has references to
CONFIG_SPE in signal_32.c, so do not attempt to save them for ppc64.

Fixes: 95cda4c44ee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200407032105.26711-1-richard.henderson@linaro.org>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
5 years agolinux-user/syscall.c: add target-to-host mapping for epoll_create1()
Sergei Trofimovich [Wed, 15 Apr 2020 22:05:08 +0000 (23:05 +0100)]
linux-user/syscall.c: add target-to-host mapping for epoll_create1()

Noticed by Barnabás Virágh as a python-3.7 failue on qemu-alpha.

The bug shows up on alpha as it's one of the targets where
EPOLL_CLOEXEC differs from other targets:
    sysdeps/unix/sysv/linux/alpha/bits/epoll.h: EPOLL_CLOEXEC  = 01000000
    sysdeps/unix/sysv/linux/bits/epoll.h:        EPOLL_CLOEXEC = 02000000

Bug: https://bugs.gentoo.org/717548
Reported-by: Barnabás Virágh
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Riku Voipio <riku.voipio@iki.fi>
CC: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200415220508.5044-1-slyfox@gentoo.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>