Kevin O'Connor [Mon, 8 Dec 2014 23:10:30 +0000 (18:10 -0500)]
sdhci: Remove class "virtual" methods
The SDHCIClass defines a series of class "methods". However, no code
in the QEMU tree overrides these methods or even uses them outside of
sdhci.c.
Remove the virtual methods and replace them with direct calls to the
underlying functions. This simplifies the process of extending the
sdhci code to support PCI devices (which have a different parent
class).
Signed-off-by: Kevin O'Connor <kevin@koconnor.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 12 Dec 2014 10:54:42 +0000 (11:54 +0100)]
serial: only resample THR interrupt on rising edge of IER.THRI
There is disagreement on whether LSR.THRE should be resampled when
IER.THRI goes from 1 to 1. Bochs only does it if IER.THRI goes from 0
to 1; PCE does it even if IER.THRI is unchanged. But the Windows driver
seems to always go from 1 to 0 and back to 1, so do things in agreement
with Bochs, because the handling of thr_ipending was reported in 2010
(https://lists.gnu.org/archive/html/qemu-devel/2010-03/msg01914.html)
as breaking DR-DOS Plus.
Reported-by: Roy Tam <roytam@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 11 Dec 2014 18:08:14 +0000 (19:08 +0100)]
serial: update LSR on enabling/disabling FIFOs
When the transmit FIFO is emptied or enabled, the transmitter
hold register is empty. When it is disabled, it is also emptied and
in addition the previous contents of the transmitter hold register
are discarded. In either case, the THRE bit in LSR must be set and
THRI raised.
When the receive FIFO is emptied or enabled, the data ready and break
bits must be cleared in LSR. Likewise when the receive FIFO is disabled.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 11 Dec 2014 16:01:39 +0000 (17:01 +0100)]
serial: clean up THRE/TEMT handling
- assert TEMT is cleared before sending a character; we'll get one from
TSR if tsr_retry > 0, from the FIFO or THR otherwise
- assert THRE cleared and FIFO not empty (if enabled) before fetching a
character to send. This effectively reverts dffacd46, but the check
makes no sense and commit f702e62 (serial: change retry logic to avoid
concurrency, 2014-07-11) must have made it unnecessary. The commit
message for f702e62 talks about multiple calls to qemu_chr_fe_add_watch
triggering s->tsr_retry >= MAX_XMIT_RETRY, but other failures were
possible. For example, if you have multiple calls, the subsequent ones
will see s->tsr_retry == 0 and will find THRE and/or TEMT on entry.
- for clarity, raise THRI immediately after the code sets THRE
- check THRE to see if another character has to be sent. This makes
the assertions more obvious and also means TEMT has to be set as soon as
the loop ends. It makes the loop send both TSR and THR if flow-control
happens in non-FIFO mode. Previously, THR would be lost.
- clear TEMT together with THRE even in the non-FIFO case
The last two items are bugfixes, but they were just found by inspection
and do not squash known bugs.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 12 Dec 2014 09:17:08 +0000 (10:17 +0100)]
serial: reset thri_pending on IER writes with THRI=0
This is responsible for failure of migration from 2.2 to 2.1, because
thr_ipending is always one in practice.
serial.c is setting thr_ipending unconditionally. However, thr_ipending
is not used at all if THRI=0, and it will be overwritten again the next
time THRE or THRI changes. For that reason, we can set thr_ipending to
zero every time THRI is reset.
There is disagreement on whether LSR.THRE should be resampled when IER.THRI
goes from 1 to 1. This patch does not touch the code, leaving that for
QEMU 2.3+.
This has no semantic change and is enough to fix migration in the common
case where the interrupt is not pending or is reported in IIR. It does not
change the migration format, so 2.2.0 -> 2.1 will remain broken but we
can fix 2.2.1 -> 2.1 without breaking 2.2.1 <-> 2.2.0.
The case that remains broken (the one in which the subsection is strictly
necessary) is when THRE=1, the THRI interrupt has *not* been acknowledged
yet, and a higher-priority interrupt comes. In this case, you need the
subsection to tell the source that the lower-priority THRI interrupt is
pending. The subsection's breakage of migration, in this case, prevents
continuing the VM on the destination with an invalid state.
Cc: qemu-stable@nongnu.org Reported-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 11 Dec 2014 01:17:03 +0000 (02:17 +0100)]
linuxboot: fix loading old kernels
Old kernels that used high memory only allowed the initrd to be in the
first 896MB of memory. If you load the initrd above, they complain
that "initrd extends beyond end of memory".
In order to fix this, while not breaking machines with small amounts
of memory fixed by cdebec5 (linuxboot: compute initrd loading address,
2014-10-06), we need to distinguish two cases. If pc.c placed the
initrd at end of memory, use the new algorithm based on the e801
memory map. If instead pc.c placed the initrd at the maximum address
specified by the bzImage, leave it there.
The only interesting part is that the low-memory info block is now
loaded very early, in real mode, and thus the 32-bit address has
to be converted into a real mode segment. The initrd address is
also patched in the info block before entering real mode, it is
simpler that way.
This fixes booting the RHEL4.8 32-bit installation image with 1GB
of RAM.
Cc: qemu-stable@nongnu.org Cc: mst@redhat.com Cc: jsnow@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 10 Dec 2014 15:56:46 +0000 (16:56 +0100)]
kvm/apic: fix 2.2->2.1 migration
The wait_for_sipi field is set back to 1 after an INIT, so it was not
effective to reset it in kvm_apic_realize. Introduce a reset callback
and reset wait_for_sipi there.
Reported-by: Igor Mammedov <imammedo@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pavel Dovgalyuk [Wed, 26 Nov 2014 10:39:42 +0000 (13:39 +0300)]
i386: do not cross the pages boundaries in replay mode
This patch denies crossing the boundary of the pages in the replay mode,
because it can cause an exception. Do it only when boundary is
crossed by the first instruction in the block.
If current instruction already crossed the bound - it's ok,
because an exception hasn't stopped this code.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pavel Dovgalyuk [Wed, 26 Nov 2014 10:40:55 +0000 (13:40 +0300)]
cpus: make icount warp behave well with respect to stop/cont
This patch makes icount warp use the new QEMU_CLOCK_VIRTUAL_RT clock.
This way, icount's QEMU_CLOCK_VIRTUAL will never count time during which
the virtual machine is stopped.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pavel Dovgalyuk [Wed, 26 Nov 2014 10:40:50 +0000 (13:40 +0300)]
timer: introduce new QEMU_CLOCK_VIRTUAL_RT clock
This patch introduces new QEMU_CLOCK_VIRTUAL_RT clock, which
should be used for icount warping. In the next patch, it
will be used to avoid a huge icount warp when a virtual
machine is stopped for a long time.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pavel Dovgalyuk [Mon, 8 Dec 2014 07:53:45 +0000 (10:53 +0300)]
icount: introduce cpu_get_icount_raw
Separate accessing the instruction counter from the compensation for
speed and halting that are introduced by qemu_icount_bias. This
introduces new infrastructure used by the record/replay patches.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pavel Dovgalyuk [Wed, 26 Nov 2014 10:39:20 +0000 (13:39 +0300)]
cpu-exec: reset exception_index correctly
Exception index is reset at every entry at every entry into cpu_exec()
function. This may cause missing the exceptions while replaying them.
This patch moves exception_index reset to the locations where they are
processed.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pavel Dovgalyuk [Wed, 26 Nov 2014 10:38:52 +0000 (13:38 +0300)]
cpu-exec: fix cpu_exec_nocache
In icount mode cpu_exec_nocache function is used to execute part of the
existing TB. At the end of cpu_exec_nocache newly created TB is deleted.
Sometimes io_read function needs to recompile current TB and restart TB
lookup and execution. After that tb_find_fast function finds old (bigger)
TB again. This TB cannot be executed (because icount is not big enough)
and cpu_exec_nocache is called again. Such a loop continues over and over.
This patch deletes old TB and avoids finding it in the TB cache.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
scsi: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
coverity/s390x: avoid false positive in kvm_irqchip_add_adapter_route
Paolo Bonzini reported that Coverity reports an uninitialized pad value.
Let's use a designated initializer for kvm_irq_routing_entry to avoid
this false positive. This is similar to kvm_irqchip_add_msi_route and
other users of kvm_irq_routing_entry.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
valgrind/i386: avoid false positives on KVM_SET_MSRS ioctl
struct kvm_msrs contains padding bytes. Let's use a designated
initializer on the info part to avoid false positives from
valgrind/memcheck. Do the same for generic MSRS, the TSC and
feature control.
We also need to zero out the reserved fields in the entries.
We do this in kvm_msr_entry_set as suggested by Paolo. This
avoids a big memset that a designated initializer on the
full structure would do.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
valgrind: avoid false positives in KVM_GET_DIRTY_LOG ioctl
struct kvm_dirty_log contains padding fields that trigger false
positives in valgrind. Let's use a designated initializer to avoid
false positives from valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Eric Auger [Fri, 31 Oct 2014 13:38:19 +0000 (13:38 +0000)]
vfio: use kvm_resamplefds_enabled()
Use the kvm_resamplefds_enabled function
Signed-off-by: Eric Auger <eric.auger@linaro.org> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Eric Auger [Fri, 31 Oct 2014 13:38:18 +0000 (13:38 +0000)]
KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
Remove direct settings in architecture specific files.
Add a new kvm_resamplefds_allowed variable, initialized by
checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
kvm_resamplefds_enabled() function.
A special notice for s390 where KVM_CAP_IRQFD was not immediatly
advirtised when irqfd capability was introduced in the kernel.
KVM_CAP_IRQ_ROUTING was advertised instead.
This was fixed in "KVM: s390: announce irqfd capability", ebc3226202d5956a5963185222982d435378b899 whereas irqfd support
was brought in 84223598778ba08041f4297fda485df83414d57e,
"KVM: s390: irq routing for adapter interrupts". Both commits
first appear in 3.15 so there should not be any kernel
version impacted by this QEMU modification.
Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Aurelien Jarno [Fri, 20 Jun 2014 22:48:09 +0000 (00:48 +0200)]
target-i386: simplify AES emulation
This patch simplifies the AES code, by directly accessing the newly added
S-Box, InvS-Box and InvMixColumns tables instead of recreating them by
using the AES_Te and AES_Td tables.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Gerd Hoffmann [Tue, 25 Nov 2014 13:54:17 +0000 (14:54 +0100)]
input: move input-send-event into experimental namespace
Ongoing discussions on how we are going to specify the console,
so tag the command as experiental so we can refine things in
the 2.3 development cycle.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1416923657-10614-1-git-send-email-armbru@redhat.com
[Spell out "not a stable API", and x- the QAPI schema, too] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Amos Kong <akong@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 24 Nov 2014 19:31:50 +0000 (19:31 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, misc bugfixes
A bunch of bugfixes for 2.2.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 24 Nov 2014 18:59:47 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
pc: acpi: mark all possible CPUs as enabled in SRAT
pcie: fix improper use of negative value
pcie: fix typo in pcie_cap_deverr_init()
target-i386: move generic memory hotplug methods to DSDTs
acpi-build: mark RAM dirty on table update
hw/pci: fix crash on shpc error flow
pc: count in 1Gb hugepage alignment when sizing hotplug-memory container
pc: explicitly check maxmem limit when adding DIMM
pc: pc-dimm: use backend alignment during address auto allocation
pc: align DIMM's address/size by backend's alignment value
memory: expose alignment used for allocating RAM as MemoryRegion API
pc: limit DIMM address and size to page aligned values
pc: make pc_dimm_plug() more readble
pc: kvm: check if KVM has free memory slots to avoid abort()
qemu-char: fix tcp_get_fds
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Igor Mammedov [Mon, 10 Nov 2014 16:20:50 +0000 (16:20 +0000)]
pc: acpi: mark all possible CPUs as enabled in SRAT
If QEMU is started with -numa ... Windows only notices that
CPU has been hot-added but it will not online such CPUs.
It's caused by the fact that possible CPUs are flagged as
not enabled in SRAT and Windows honoring that information
doesn't use corresponding CPU.
ACPI 5.0 Spec regarding to flag says:
"
Table 5-47 Local APIC Flags
...
Enabled: if zero, this processor is unusable, and the operating system
support will not attempt to use it.
"
Fix QEMU to adhere to spec and mark possible CPUs as enabled
in SRAT.
With that Windows onlines hot-added CPUs as expected.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi build modifies internal FW CFG RAM on first access
but we forgot to mark it dirty.
If this RAM has been migrated already, it won't be
migrated again, returning corrupted tables to guest.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the pci bridge enters in error flow as part
of init process it will only delete the shpc mmio
subregion but not remove it from the properties list,
resulting in segmentation fault when the bridge runs
the exit function.
Example: add a pci bridge without specifing the chassis number:
<qemu-bin> ... -device pci-bridge,id=p1
Result:
(qemu) qemu-system-x86_64: -device pci-bridge,id=p1: Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0.
qemu-system-x86_64: -device pci-bridge,id=p1: Device
initialization failed.
Segmentation fault (core dumped)
if (child->class->unparent) {
#0 0x00005555558d629b in object_finalize_child_property (obj=0x555556d2e830, name=0x555556d30630 "shpc-mmio[0]", opaque=0x555556a42fc8) at qom/object.c:1078
#1 0x00005555558d4b1f in object_property_del_all (obj=0x555556d2e830) at qom/object.c:367
#2 0x00005555558d4ca1 in object_finalize (data=0x555556d2e830) at qom/object.c:412
#3 0x00005555558d55a1 in object_unref (obj=0x555556d2e830) at qom/object.c:720
#4 0x000055555572c907 in qdev_device_add (opts=0x5555563544f0) at qdev-monitor.c:566
#5 0x0000555555744f16 in device_init_func (opts=0x5555563544f0, opaque=0x0) at vl.c:2213
#6 0x00005555559cf5f0 in qemu_opts_foreach (list=0x555555e0f8e0 <qemu_device_opts>, func=0x555555744efa <device_init_func>, opaque=0x0, abort_on_failure=1) at util/qemu-option.c:1057
#7 0x000055555574a11b in main (argc=16, argv=0x7fffffffdde8, envp=0x7fffffffde70) at vl.c:423
Unparent the shpc mmio region as part of shpc cleanup.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amos Kong <akong@redhat.com>
Igor Mammedov [Fri, 31 Oct 2014 16:38:42 +0000 (16:38 +0000)]
pc: count in 1Gb hugepage alignment when sizing hotplug-memory container
if DIMMs with different size/alignment are interleaved
in creation order, it could lead to hotplug-memory
container fragmentation and following inability to use
all RAM upto maxmem.
For example:
-m 4G,slots=3,maxmem=7G
-object memory-backend-file,id=mem-1,size=256M,mem-path=/pagesize-2MB
-device pc-dimm,id=mem1,memdev=mem-1
-object memory-backend-file,id=mem-2,size=1G,mem-path=/pagesize-1GB
-device pc-dimm,id=mem2,memdev=mem-2
-object memory-backend-file,id=mem-3,size=256M,mem-path=/pagesize-2MB
-device pc-dimm,id=mem3,memdev=mem-3
fragments hotplug-memory container and doesn't allow
to use 1GB hugepage backend to consume remainig 1Gb.
To ease managment factor count in max 1Gb alignment for
each memory slot when sizing hotplug-memory region so
that regadless of fragmentaion it would be possible to
add max aligned DIMM.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Fri, 31 Oct 2014 16:38:41 +0000 (16:38 +0000)]
pc: explicitly check maxmem limit when adding DIMM
Currently maxmem limit is not checked and depends on
hotplug region container not being able to fit more RAM
than maxmem. Do check explicitly so that it would
be possible to change hotplug container size later
to deal with fragmentation.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Maydell [Mon, 24 Nov 2014 13:50:22 +0000 (13:50 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Three patches to fix ExtINT for the QEMU implementation of the local APIC.
# gpg: Signature made Mon 24 Nov 2014 13:38:36 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
apic: fix incorrect handling of ExtINT interrupts wrt processor priority
apic: fix loss of IPI due to masked ExtINT
apic: avoid getting out of halted state on masked PIC interrupts
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Paolo Bonzini [Tue, 11 Nov 2014 12:14:18 +0000 (13:14 +0100)]
apic: fix incorrect handling of ExtINT interrupts wrt processor priority
This fixes another failure with ExtINT, demonstrated by QNX. The failure
mode is as follows:
- IPI sent to cpu 0 (bit set in APIC irr)
- IPI accepted by cpu 0 (bit cleared in irr, set in isr)
- IPI sent to cpu 0 (bit set in both irr and isr)
- PIC interrupt sent to cpu 0
The PIC interrupt causes CPU_INTERRUPT_HARD to be set, but
apic_irq_pending observes that the highest pending APIC interrupt priority
(the IPI) is the same as the processor priority (since the IPI is still
being handled), so apic_get_interrupt returns a spurious interrupt rather
than the pending PIC interrupt. The result is an endless sequence of
spurious interrupts, since nothing will clear CPU_INTERRUPT_HARD.
Instead, ExtINT interrupts should have ignored the processor priority.
Calling apic_check_pic early in apic_get_interrupt ensures that
apic_deliver_pic_intr is called instead of delivering the spurious
interrupt. apic_deliver_pic_intr then clears CPU_INTERRUPT_HARD if needed.
Reported-by: Richard Bilson <rbilson@qnx.com> Tested-by: Richard Bilson <rbilson@qnx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 11 Nov 2014 12:14:14 +0000 (13:14 +0100)]
apic: fix loss of IPI due to masked ExtINT
This patch fixes an obscure failure of the QNX kernel on QEMU x86 SMP.
In QNX, all hardware interrupts come via the PIC, and are delivered by
the cpu 0 LAPIC in ExtINT mode, while IPIs are delivered by the LAPIC
in fixed mode.
This bug happens as follows:
- cpu 0 masks a particular PIC interrupt
- IPI sent to cpu 0 (CPU_INTERRUPT_HARD is set)
- before the IPI is accepted, the masked interrupt line is asserted by the
device
Since the interrupt is masked, apic_deliver_pic_intr will clear
CPU_INTERRUPT_HARD. The IPI will still be set in the APIC irr, but since
CPU_INTERRUPT_HARD is not set the cpu will not notice. Depending on the
scenario this can cause a system hang, i.e. if cpu 0 is expected to unmask
the interrupt.
In order to fix this, do a full check of the APIC before an EXTINT
is acknowledged. This can result in clearing CPU_INTERRUPT_HARD, but
can also result in delivering the lost IPI.
Reported-by: Richard Bilson <rbilson@qnx.com> Tested-by: Richard Bilson <rbilson@qnx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 11 Nov 2014 12:14:05 +0000 (13:14 +0100)]
apic: avoid getting out of halted state on masked PIC interrupts
After the next patch, if a masked PIC interrupts causes CPU_INTERRUPT_POLL
to be set, the CPU will spuriously get out of halted state. While this
is technically valid, we should avoid that.
Make CPU_INTERRUPT_POLL run apic_update_irq in the right thread and then
look at CPU_INTERRUPT_HARD. If CPU_INTERRUPT_HARD does not get set,
do not report the CPU as having work.
Also move the handling of software-disabled APIC from apic_update_irq
to apic_irq_pending, and always trigger CPU_INTERRUPT_POLL. This will
be important once we will add a case that resets CPU_INTERRUPT_HARD
from apic_update_irq. We want to run it even if we go through
CPU_INTERRUPT_POLL, and even if the local APIC is software disabled.
Reported-by: Richard Bilson <rbilson@qnx.com> Tested-by: Richard Bilson <rbilson@qnx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The main reason for reverting this commit before the 2.2 release is that
it adds a QAPI interface that we don't want to keep: The 'nocow' flag
doesn't generally make sense for block nodes, but only for the raw-posix
driver. It should therefore be part of ImageInfoSpecific rather than
ImageInfo.
The commit contains more problems, but unlike the API stability issue
they wouldn't justify reverting it.
Conflicts:
block/qapi.c
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Igor Mammedov [Fri, 31 Oct 2014 16:38:36 +0000 (16:38 +0000)]
pc: limit DIMM address and size to page aligned values
When running in KVM mode, kvm_set_phys_mem() will silently
fail if registered MemoryRegion address/size is not page
aligned. Causing memory hotplug failure in guest.
Mapping non aligned MemoryRegion in TCG mode 'works', but
sane guest OS still expects page aligned memory module
and fails to initialize it if it's not aligned.
So do not allow non aligned (i.e. valid) address/size
values for DIMM to avoid either KVM failure or guest
issues caused by it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Fri, 31 Oct 2014 16:38:32 +0000 (16:38 +0000)]
pc: kvm: check if KVM has free memory slots to avoid abort()
When more memory devices are used than available
KVM memory slots, QEMU crashes with:
kvm_alloc_slot: no free slot available
Aborted (core dumped)
Fix this by checking that KVM has a free slot before
attempting to map memory in guest address space.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Maydell [Fri, 21 Nov 2014 14:15:37 +0000 (14:15 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Fri 21 Nov 2014 11:12:37 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
rtl8139: fix Pointer to local outside scope
pcnet: fix Negative array index read
net/socket: fix Uninitialized scalar variable
net/slirp: fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Gonglei [Thu, 20 Nov 2014 11:35:03 +0000 (19:35 +0800)]
rtl8139: fix Pointer to local outside scope
Coverity spot:
Assigning: iov = struct iovec [3]({{buf, 12UL},
{(void *)dot1q_buf, 4UL},
{buf + 12, size - 12}})
(address of temporary variable of type struct iovec [3]).
out_of_scope: Temporary variable of type struct iovec [3] goes out of scope.
Pointer to local outside scope (RETURN_LOCAL)
use_invalid:
Using iov, which points to an out-of-scope temporary variable of type struct iovec [3].
Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Gonglei [Thu, 20 Nov 2014 11:35:02 +0000 (19:35 +0800)]
pcnet: fix Negative array index read
s->xmit_pos maybe assigned to a negative value (-1),
but in this branch variable s->xmit_pos as an index to
array s->buffer. Let's add a check for s->xmit_pos.
Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Gonglei [Thu, 20 Nov 2014 11:35:00 +0000 (19:35 +0800)]
net/slirp: fix memory leak
commit b412eb61 introduce 'cmd:' target for guestfwd,
and fwd don't be used in this scenario, and will leak
memory in true branch with 'cmd:'. Let's allocate memory
for fwd variable just in else statement.
Cc: Alexander Graf <agraf@suse.de> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Leif Lindholm [Wed, 19 Nov 2014 11:08:45 +0000 (11:08 +0000)]
hw/arm/virt: set stdout-path instead of linux,stdout-path
ePAPR 1.1 defines the stdout-path property, making the os-specific
linux,stdout-path property redundant. Change the DT setup for ARM virt
to use the generic property - supported by Linux since 3.15.
The old QEMU behaviour was not present in any released version of
QEMU, and was only added to QEMU after the kernel changed, so
this should not break any existing setups.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
[PMM: add note to commit about the old behaviour never hving been
in a released version of QEMU] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 20 Nov 2014 14:02:24 +0000 (14:02 +0000)]
Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-11-20
Hopefully the last few fixups for 2.2:
- KVM memory slot fix (should usually only occur on PPC)
- e300 fix
- Altivec mtvscr instruction fix
# gpg: Signature made Thu 20 Nov 2014 13:53:34 GMT using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg: aka "Alexander Graf <alex@csgraf.de>"
The Move to Vector Status and Control Register (mtvscr) instruction
uses VRB as the source register. Fix the code generator to correctly
decode the VRB field. That is, use "rB(ctx->opcode)" instead of
"rD(ctx->opcode)".
Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Fri, 7 Nov 2014 21:12:48 +0000 (22:12 +0100)]
kvm: Fix memory slot page alignment logic
Memory slots have to be page aligned to get entered into KVM. There
is existing logic that tries to ensure that we pad memory slots that
are not page aligned to the biggest region that would still fit in the
alignment requirements.
Unfortunately, that logic is broken. It tries to calculate the start
offset based on the region size.
Fix up the logic to do the thing it was intended to do and document it
properly in the comment above it.
With this patch applied, I can successfully run an e500 guest with more
than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
Cc: qemu-stable@nongnu.org Signed-off-by: Alexander Graf <agraf@suse.de>
Peter Maydell [Thu, 20 Nov 2014 13:00:28 +0000 (13:00 +0000)]
Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2-2' into staging
Fix from a while back that unfortunately got ignored. Dave Gilbert says
it may actually fix a case where autoconverge would break on a repeat
migration (and not just fix stats).
# gpg: Signature made Thu 20 Nov 2014 12:52:41 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-migration/tags/for-2.2-2:
migration: static variables will not be reset at second migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ChenLiang [Thu, 20 Mar 2014 12:15:03 +0000 (20:15 +0800)]
migration: static variables will not be reset at second migration
The static variables in migration_bitmap_sync will not be reset in
the case of a second attempted migration.
Signed-off-by: ChenLiang <chenliang88@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
Don Slutz [Mon, 17 Nov 2014 21:20:39 +0000 (16:20 -0500)]
hw/ide/core.c: Prevent SIGSEGV during migration
The other callers to blk_set_enable_write_cache() in this file
already check for s->blk == NULL.
Signed-off-by: Don Slutz <dslutz@verizon.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416259239-13281-1-git-send-email-dslutz@verizon.com Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 18 Nov 2014 16:17:32 +0000 (16:17 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Tue 18 Nov 2014 15:04:53 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
net: The third parameter of getsockname should be initialized
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 18 Nov 2014 15:05:36 +0000 (15:05 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 18 Nov 2014 15:04:14 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/tracing-pull-request:
Tracing: Fix simpletrace.py error on tcg enabled binary traces
Tracing docs fix configure option and description
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tracing: Fix simpletrace.py error on tcg enabled binary traces
simpletrace.py does not recognize the tcg option while reading trace-events file. In result simpletrace does not work on binary traces and tcg enabled events. Moved transformation of tcg enabled events to _read_events() which is used by simpletrace.
Signed-off-by: Christoph Seifert <christoph.seifert@posteo.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fix the example trace configure option.
Update the text to say that multiple backends are allowed and what
happens when multiple backends are enabled.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1412691161-31785-1-git-send-email-dgilbert@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Peter Maydell [Tue, 18 Nov 2014 13:43:37 +0000 (13:43 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.2.0-rc2
# gpg: Signature made Tue 18 Nov 2014 11:32:55 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream:
block/raw-posix: Catch fsync() errors
block/raw-posix: Only sync after successful preallocation
block/raw-posix: Fix preallocating write() loop
raw-posix: The SEEK_HOLE code is flawed, rewrite it
raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
raw-posix: Fix comment for raw_co_get_block_status()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 18 Nov 2014 12:29:05 +0000 (12:29 +0000)]
Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2' into staging
Fix for CVE-2014-7840, avoiding arbitrary qemu memory overwrite for
migration by Michael S. Tsirkin.
# gpg: Signature made Tue 18 Nov 2014 11:23:00 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-migration/tags/for-2.2:
migration: fix parameter validation on ram load
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During migration, the values read from migration stream during ram load
are not validated. Especially offset in host_from_stream_offset() and
also the length of the writes in the callers of said function.
To fix this, we need to make sure that the [offset, offset + length]
range fits into one of the allocated memory regions.
Validating addr < len should be sufficient since data seems to always be
managed in TARGET_PAGE_SIZE chunks.
Fixes: CVE-2014-7840
Note: follow-up patches add extra checks on each block->host access.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
Max Reitz [Tue, 18 Nov 2014 10:23:04 +0000 (11:23 +0100)]
block/raw-posix: Fix preallocating write() loop
write() may write less bytes than requested; in this case, the number of
bytes written is returned. This is the byte count we should be
subtracting from the number of bytes still to be written, and not the
byte count we requested to write.
Reported-by: László Érsek <lersek@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Peter Maydell [Sun, 16 Nov 2014 19:44:21 +0000 (19:44 +0000)]
exec: Handle multipage ranges in invalidate_and_set_dirty()
The code in invalidate_and_set_dirty() needs to handle addr/length
combinations which cross guest physical page boundaries. This can happen,
for example, when disk I/O reads large blocks into guest RAM which previously
held code that we have cached translations for. Unfortunately we were only
checking the clean/dirty status of the first page in the range, and then
were calling a tb_invalidate function which only handles ranges that don't
cross page boundaries. Fix the function to deal with multipage ranges.
The symptoms of this bug were that guest code would misbehave (eg segfault),
in particular after a guest reboot but potentially any time the guest
reused a page of its physical RAM for new code.
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
Kevin Wolf [Tue, 18 Nov 2014 10:01:05 +0000 (11:01 +0100)]
Merge remote-tracking branch 'mreitz/block' into queue-block
* mreitz/block:
raw-posix: The SEEK_HOLE code is flawed, rewrite it
raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
raw-posix: Fix comment for raw_co_get_block_status()
raw-posix: The SEEK_HOLE code is flawed, rewrite it
On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris,
but not Linux), try_seek_hole() reports trailing data instead.
Additionally, unlikely lseek() failures are treated badly:
* When SEEK_HOLE fails, try_seek_hole() reports trailing data. For
-ENXIO, there's in fact a trailing hole. Can happen only when
something truncated the file since we opened it.
* When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds,
then try_seek_hole() reports a trailing hole. This is okay only
when SEEK_DATA failed with -ENXIO (which means the non-trailing hole
found by SEEK_HOLE has since become trailing somehow). For other
failures (unlikely), it's wrong.
* When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely),
then try_seek_hole() reports bogus data [-1,start), which its caller
raw_co_get_block_status() turns into zero sectors of data. Could
theoretically lead to infinite loops in code that attempts to scan
data vs. hole forward.
Rewrite from scratch, with very careful comments.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>