David Gibson [Mon, 26 Mar 2018 04:01:22 +0000 (15:01 +1100)]
target/ppc: Add ppc_hash64_filter_pagesizes()
The paravirtualized PAPR platform sometimes needs to restrict the guest to
using only some of the page sizes actually supported by the host's MMU.
At the moment this is handled in KVM specific code, but for consistency we
want to apply the same limitations to all accelerators.
This makes a start on this by providing a helper function in the cpu code
to allow platform code to remove some of the cpu's page size definitions
via a caller supplied callback.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
David Gibson [Wed, 18 Apr 2018 04:21:45 +0000 (14:21 +1000)]
spapr: Use maximum page size capability to simplify memory backend checking
The way we used to handle KVM allowable guest pagesizes for PAPR guests
required some convoluted checking of memory attached to the guest.
The allowable pagesizes advertised to the guest cpus depended on the memory
which was attached at boot, but then we needed to ensure that any memory
later hotplugged didn't change which pagesizes were allowed.
Now that we have an explicit machine option to control the allowable
maximum pagesize we can simplify this. We just check all memory backends
against that declared pagesize. We check base and cold-plugged memory at
reset time, and hotplugged memory at pre_plug() time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
David Gibson [Fri, 16 Mar 2018 08:19:13 +0000 (19:19 +1100)]
spapr: Maximum (HPT) pagesize property
The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous. In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.
At present we handle this by changing what we advertise to the guest based
on the backing pagesizes. This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.
As a start on fixing this, we add a new capability parameter to the
pseries machine type which gives the maximum allowed pagesizes for an
HPT guest. For now we just create and validate the parameter without
making it do anything.
For backwards compatibility, on older machine types we set it to the max
available page size for the host. For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
pseries: Update SLOF firmware image to qemu-slof-20180621
The changes are:
1. fixed broken_sc1;
2. added switching between boot consoles;
3. added PXE boot.
The full list is:
> lib/libnet/pxelinux: Fix two off-by-one bugs in the pxelinux.cfg parser
> lib/libnet/pxelinux: Make the size handling for pxelinux_load_cfg more logical
> libc: Add a simple implementation of an assert() function
> libnet: Support UUID-based pxelinux.cfg file names
> slof: Add a helper function to get the contents of a property in C code
> libnet: Add support for DHCPv4 options 209 and 210
> libnet: Wire up pxelinux.cfg network booting
> libnet: Add functions for downloading and parsing pxelinux.cfg files
> libnet: Put code for determing TFTP error strings into a separate function
> libc: Add the snprintf() function
> libnet: Pass ip_version via struct filename_ip
> resolve ihandle and xt handle in the input command (like for the output)
> Fix output word
> obp-tftp: Make sure to not overwrite paflof in memory
> libnet: Get rid of unused huge_load and block_size parameters
> libc: Check for NULL pointers in free()
> libc: Implement strrchr()
> libnet: Get rid of unnecessary (char *) casts
> broken_sc1: check for H_PRIVILEGE
> OF: Use new property "stdout-path" for boot console
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
BALATON Zoltan [Tue, 19 Jun 2018 08:52:15 +0000 (10:52 +0200)]
target/ppc: Add missing opcode for icbt on PPC440
According to PPC440 User Manual PPC440 has multiple opcodes for icbt
instruction: one for compatibility with older cores and two 440
specific opcodes one of which is defined in BookE. QEMU only
implements two of these, add the missing one.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
BALATON Zoltan [Tue, 19 Jun 2018 08:52:15 +0000 (10:52 +0200)]
ppc4xx_i2c: Implement directcntl register
As well as being able to generate its own i2c transactions, the ppc4xx
i2c controller has a DIRECTCNTL register which allows explicit control
of the i2c lines.
Using this register an OS can directly bitbang i2c operations. In
order to let emulated i2c devices respond to this, we need to wire up
the DIRECTCNTL register to qemu's bitbanged i2c handling code.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
BALATON Zoltan [Tue, 19 Jun 2018 08:52:15 +0000 (10:52 +0200)]
ppc4xx_i2c: Remove unimplemented sdata and intr registers
We don't emulate slave mode so related registers are not needed.
[lh]sadr are only retained to avoid too many warnings and simplify
debugging but sdata is not even correct because device has a 4 byte
FIFO instead so just remove this unimplemented register for now.
The intr register is also not implemented correctly, it is for
diagnostics and normally not even visible on device without explicitly
enabling it. As no guests are known to need this remove it as well.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Sebastian Bauer [Mon, 18 Jun 2018 21:38:16 +0000 (23:38 +0200)]
sm501: Fix hardware cursor color conversion
According to the sm501 specs the hardware cursor colors are to be given in
the rgb565 format, but the code currently interprets them as bgr565.
Therefore, the colors of the hardware cursors are wrong in the QEMU
display, e.g., the standard mouse pointer of AmigaOS appears blue instead
of red. This change fixes this issue by replacing the existing naive
bgr565 => rgb888 conversion with a standard rgb565 => rgb888 one that also
scales the color component values properly.
Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is described as "the logical OR of all the floating-point exception
bits masked by their respective enable bits". It was not implemented
correctly. The value of FEX would stay on even when all other bits
were set to off.
The VX bit is described as "the logical OR of all of the invalid
operation exceptions". This bit was also not implemented correctly. It
too would stay on when all the other bits were set to off.
My main source of information is an IBM document called:
PowerPC Microprocessor Family:
The Programming Environments for 32-Bit Microprocessors
Page 62 is where the FPSCR information is located.
This is an older copy than the one I use but it is still very useful:
https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
I use a G3 and G5 iMac to compare bit values with QEMU. This patch
fixed all the problems I was having with these bits.
Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
[dwg: Re-wrapped commit message] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 18 Jun 2018 17:34:01 +0000 (19:34 +0200)]
spapr: remove unused spapr_irq routines
spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 18 Jun 2018 17:34:00 +0000 (19:34 +0200)]
spapr: split the IRQ allocation sequence
Today, when a device requests for IRQ number in a sPAPR machine, the
spapr_irq_alloc() routine first scans the ICSState status array to
find an empty slot and then performs the assignement of the selected
numbers. Split this sequence in two distinct routines : spapr_irq_find()
for lookups and spapr_irq_claim() for claiming the IRQ numbers.
This will ease the introduction of a static layout of IRQ numbers.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
KVM HV has a restriction that for HPT mode guests, guest pages must be hpa
contiguous as well as gpa contiguous. We have to account for that in
various places. We determine whether we're subject to this restriction
from the SMMU information exposed by KVM.
Planned cleanups to the way we handle this will require knowing whether
this restriction is in play in wider parts of the code. So, expose a
helper function which returns it.
This does mean some redundant calls to kvm_get_smmu_info(), but they'll go
away again with future cleanups.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
David Gibson [Wed, 28 Mar 2018 03:45:44 +0000 (14:45 +1100)]
spapr: Add cpu_apply hook to capabilities
spapr capabilities have an apply hook to actually activate (or deactivate)
the feature in the system at reset time. However, a number of capabilities
affect the setup of cpus, and need to be applied to each of them -
including hotplugged cpus for extra complication. To make this simpler,
add an optional cpu_apply hook that is called from spapr_cpu_reset().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
Previously, the effective values of the various spapr capability flags
were only determined at machine reset time. That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.
But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it. That lets us move the resolution of the
capability defaults much earlier.
This is going to be necessary for some future capabilities.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
David Gibson [Thu, 14 Jun 2018 06:33:58 +0000 (16:33 +1000)]
target/ppc: Allow cpu compatiblity checks based on type, not instance
ppc_check_compat() is used in a number of places to check if a cpu object
supports a certain compatiblity mode, subject to various constraints.
It takes a PowerPCCPU *, however it really only depends on the cpu's class.
We have upcoming cases where it would be useful to make compatibility
checks before we fully instantiate the cpu objects.
ppc_type_check_compat() will now make an equivalent check, but based on a
CPU's QOM typename instead of an instantiated CPU object.
We make use of the new interface in several places in spapr, where we're
essentially making a global check, rather than one specific to a particular
cpu. This avoids some ugly uses of first_cpu to grab a "representative"
instance.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
Cédric Le Goater [Mon, 18 Jun 2018 17:05:39 +0000 (19:05 +0200)]
ppc/pnv: introduce Pnv8Chip and Pnv9Chip models
It introduces a base PnvChip class from which the specific processor
chip classes, Pnv8Chip and Pnv9Chip, inherit. Each of them needs to
define an init and a realize routine which will create the controllers
of the target processor. For the moment, the base PnvChip class
handles the XSCOM bus and the cores.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Greg Kurz [Mon, 18 Jun 2018 12:26:49 +0000 (14:26 +0200)]
spapr_cpu_core: migrate VPA related state
QEMU implements the "Shared Processor LPAR" (SPLPAR) option, which allows
the hypervisor to time-slice a physical processor into multiple virtual
processor. The intent is to allow more guests to run, and to optimize
processor utilization.
The guest OS can cede idle VCPUs, so that their processing capacity may
be used by other VCPUs, with the H_CEDE hcall. The guest OS can also
optimize spinlocks, by confering the time-slice of a spinning VCPU to the
spinlock holder if it's currently notrunning, with the H_CONFER hcall.
Both hcalls depend on a "Virtual Processor Area" (VPA) to be registered
by the guest OS, generally during early boot. Other per-VCPU areas can
be registered: the "SLB Shadow Buffer" which allows a more efficient
dispatching of VCPUs, and the "Dispatch Trace Log Buffer" (DTL) which
is used to compute time stolen by the hypervisor. Both DTL and SLB Shadow
areas depend on the VPA to be registered.
The VPA/SLB Shadow/DTL are state that QEMU should migrate, but this doesn't
happen, for no apparent reason other than it was just never coded. This
causes the features listed above to stop working after migration, and it
breaks the logic of the H_REGISTER_VPA hcall in the destination.
The VPA is set at the guest request, ie, we don't have to migrate
it before the guest has actually set it. This patch hence adds an
"spapr_cpu/vpa" subsection to the recently introduced per-CPU machine
data migration stream.
Since DTL and SLB Shadow are optional and both depend on VPA, they get
their own subsections "spapr_cpu/vpa/slb_shadow" and "spapr_cpu/vpa/dtl"
hanging from the "spapr_cpu/vpa" subsection.
Note that this won't break migration to older QEMUs. Is is already handled
by only registering the vmstate handler for per-CPU data with newer machine
types.
Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Greg Kurz [Mon, 18 Jun 2018 12:26:35 +0000 (14:26 +0200)]
spapr_cpu_core: migrate per-CPU data
A per-CPU machine data pointer was recently added to PowerPCCPU. The
motivation is to to hide platform specific details from the core CPU
code. This per-CPU data can hold state which is relevant to the guest
though, eg, Virtual Processor Areas, and we should migrate this state.
This patch adds the plumbing so that we can migrate the per-CPU data
for PAPR guests. We only do this for newer machine types for the sake
of backward compatibility. No state is migrated for the moment: the
vmstate_spapr_cpu_state structure will be populated by subsequent
patches.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix some trivial spelling and spacing errors] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Fri, 15 Jun 2018 15:25:34 +0000 (17:25 +0200)]
ppc/pnv: introduce a new isa_create() operation to the chip model
This moves the details of the ISA bus creation under the LPC model but
more important, the new PnvChip operation will let us choose the chip
class to use when we introduce the different chip classes for Power9
and Power8. It hides away the processor chip controllers from the
machine.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Peter Maydell [Wed, 20 Jun 2018 08:51:30 +0000 (09:51 +0100)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180619' into staging
- cleanup in virtio-ccw
- accommodate guests using vfio-ccw without specifying unlimited
prefetch, but actually working fine
- add cpu model for the z14 Model ZR1
- add support for pxelinux.cfg-style network booting to the s390x
firmware
* remotes/cohuck/tags/s390x-20180619:
pc-bios/s390-ccw: Update the s390-netboot.img binary
pc-bios/s390-ccw: Optimize the s390-netboot.img for size
pc-bios/s390-ccw/net: Try to load pxelinux.cfg file accoring to the UUID
pc-bios/s390-ccw/net: Add support for pxelinux-style config files
pc-bios/s390-ccw/net: Update code for the latest changes in SLOF
roms: Update SLOF submodule to current status
pc-bios/s390-ccw: define loadparm length
s390x/cpumodels: add z14 Model ZR1
s390x/ipl: Try to detect Linux vs non Linux for initial IPL PSW
vfio-ccw: remove orb.c64 (64 bit data addresses) check
vfio-ccw: add force unlimited prefetch property
virtio-ccw: clean up notify
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 19 Jun 2018 15:04:43 +0000 (16:04 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- Active mirror (blockdev-mirror copy-mode=write-blocking)
- bdrv_drain_*() fixes and test cases
- Fix crash with scsi-hd and drive_del
# gpg: Signature made Mon 18 Jun 2018 17:44:10 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (35 commits)
iotests: Add test for active mirroring
block/mirror: Add copy mode QAPI interface
block/mirror: Add active mirroring
job: Add job_progress_increase_remaining()
block/mirror: Add MirrorBDSOpaque
block/dirty-bitmap: Add bdrv_dirty_iter_next_area
test-hbitmap: Add non-advancing iter_next tests
hbitmap: Add @advance param to hbitmap_iter_next()
block: Generalize should_update_child() rule
block/mirror: Use source as a BdrvChild
block/mirror: Wait for in-flight op conflicts
block/mirror: Use CoQueue to wait on in-flight ops
block/mirror: Convert to coroutines
block/mirror: Pull out mirror_perform()
block: fix QEMU crash with scsi-hd and drive_del
test-bdrv-drain: Test graph changes in drain_all section
block: Allow graph changes in bdrv_drain_all_begin/end sections
block: ignore_bds_parents parameter for drain functions
block: Move bdrv_drain_all_begin() out of coroutine context
block: Allow AIO_WAIT_WHILE with NULL ctx
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 19 Jun 2018 14:19:07 +0000 (15:19 +0100)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2018-06-18' into staging
Monitor patches for 2018-06-18
# gpg: Signature made Mon 18 Jun 2018 14:50:29 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-monitor-2018-06-18:
monitor: add lock to protect mon_fdsets
monitor: move init global earlier
monitor: remove event_clock_type
monitor: fix comment for monitor_lock
monitor: more comments on lock-free elements
monitor: protect mon->fds with mon_lock
monitor: rename out_lock to mon_lock
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/kraxel/tags/usb-20180618-pull-request:
Revert "bus: do not unref the added child bus on realize"
Revert "usb: release the created buses"
Revert "usb-ccid: fix bus leak"
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 19 Jun 2018 10:15:27 +0000 (11:15 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.0-20180618' into staging
ppc patch queue 2018-06-18
Next batch of ppc and spapr related patches for the 3.0 release.
* Improved handling of Spectre/Meltdown mitigations for POWER8
* Numerous Mac machine type cleanups and improvements
* Cleanup to cpu realize/unrealize path for spapr
* Create a place for machine-specific per-cpu information, and
start moving some things to it
* Assorted bugfixes
* remotes/dgibson/tags/ppc-for-3.0-20180618: (28 commits)
spapr: fix xics_system_init() error path
target/ppc, spapr: Move VPA information to machine_data
ppc/pnv: introduce a pnv_chip_core_realize() routine
spapr_cpu_core: introduce spapr_create_vcpu()
spapr_cpu_core: add missing rollback on realization path
spapr_cpu_core: fix potential leak in spapr_cpu_core_realize()
spapr_cpu_core: convert last snprintf() to g_strdup_printf()
pnv: Add cpu unrealize path
pnv: Clean up cpu realize path
pnv_core: Allocate cpu thread objects individually
pnv: Fix some error handling cpu realize()
spapr: Clean up cpu realize/unrealize paths
sm501: Do not clear read only bits when writing registers
mos6522: expose mos6522_update_irq() through MOS6522DeviceClass
mos6522: remove additional interrupt flag filter from mos6522_update_irq()
mos6522: only clear the shift register interrupt upon write
xics_kvm: fix a build break
mac_newworld: add PMU device
adb: add property to disable direct reg 3 writes
adb: fix read reg 3 byte ordering
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Kevin Wolf [Mon, 18 Jun 2018 15:20:41 +0000 (17:20 +0200)]
Merge remote-tracking branch 'mreitz/tags/pull-block-2018-06-18' into queue-block
Block patches:
- Active mirror (blockdev-mirror copy-mode=write-blocking)
# gpg: Signature made Mon Jun 18 17:08:19 2018 CEST
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2018-06-18:
iotests: Add test for active mirroring
block/mirror: Add copy mode QAPI interface
block/mirror: Add active mirroring
job: Add job_progress_increase_remaining()
block/mirror: Add MirrorBDSOpaque
block/dirty-bitmap: Add bdrv_dirty_iter_next_area
test-hbitmap: Add non-advancing iter_next tests
hbitmap: Add @advance param to hbitmap_iter_next()
block: Generalize should_update_child() rule
block/mirror: Use source as a BdrvChild
block/mirror: Wait for in-flight op conflicts
block/mirror: Use CoQueue to wait on in-flight ops
block/mirror: Convert to coroutines
block/mirror: Pull out mirror_perform()
Cornelia Huck [Mon, 18 Jun 2018 15:05:27 +0000 (17:05 +0200)]
Merge tag 'tags/s390x-2018-06-18' into staging
Add support for pxelinux.cfg-style network booting to the s390x firmware
# gpg: Signature made Mon 18 Jun 2018 03:59:06 PM CEST
# gpg: using RSA key 2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
* tag 'tags/s390x-2018-06-18':
pc-bios/s390-ccw: Update the s390-netboot.img binary
pc-bios/s390-ccw: Optimize the s390-netboot.img for size
pc-bios/s390-ccw/net: Try to load pxelinux.cfg file accoring to the UUID
pc-bios/s390-ccw/net: Add support for pxelinux-style config files
pc-bios/s390-ccw/net: Update code for the latest changes in SLOF
roms: Update SLOF submodule to current status
pc-bios/s390-ccw: define loadparm length
Max Reitz [Wed, 13 Jun 2018 18:18:22 +0000 (20:18 +0200)]
block/mirror: Add copy mode QAPI interface
This patch allows the user to specify whether to use active or only
background mode for mirror block jobs. Currently, this setting will
remain constant for the duration of the entire block job.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-14-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:21 +0000 (20:18 +0200)]
block/mirror: Add active mirroring
This patch implements active synchronous mirroring. In active mode, the
passive mechanism will still be in place and is used to copy all
initially dirty clusters off the source disk; but every write request
will write data both to the source and the target disk, so the source
cannot be dirtied faster than data is mirrored to the target. Also,
once the block job has converged (BLOCK_JOB_READY sent), source and
target are guaranteed to stay in sync (unless an error occurs).
Active mode is completely optional and currently disabled at runtime. A
later patch will add a way for users to enable it.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-13-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:20 +0000 (20:18 +0200)]
job: Add job_progress_increase_remaining()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180613181823.13618-12-mreitz@redhat.com Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:17 +0000 (20:18 +0200)]
test-hbitmap: Add non-advancing iter_next tests
Add a function that wraps hbitmap_iter_next() and always calls it in
non-advancing mode first, and in advancing mode next. The result should
always be the same.
By using this function everywhere we called hbitmap_iter_next() before,
we should get good test coverage for non-advancing hbitmap_iter_next().
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180613181823.13618-9-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:15 +0000 (20:18 +0200)]
block: Generalize should_update_child() rule
Currently, bdrv_replace_node() refuses to create loops from one BDS to
itself if the BDS to be replaced is the backing node of the BDS to
replace it: Say there is a node A and a node B. Replacing B by A means
making all references to B point to A. If B is a child of A (i.e. A has
a reference to B), that would mean we would have to make this reference
point to A itself -- so we'd create a loop.
bdrv_replace_node() (through should_update_child()) refuses to do so if
B is the backing node of A. There is no reason why we should create
loops if B is not the backing node of A, though. The BDS graph should
never contain loops, so we should always refuse to create them.
If B is a child of A and B is to be replaced by A, we should simply
leave B in place there because it is the most sensible choice.
A more specific argument would be: Putting filter drivers into the BDS
graph is basically the same as appending an overlay to a backing chain.
But the main child BDS of a filter driver is not "backing" but "file",
so restricting the no-loop rule to backing nodes would fail here.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:13 +0000 (20:18 +0200)]
block/mirror: Wait for in-flight op conflicts
This patch makes the mirror code differentiate between simply waiting
for any operation to complete (mirror_wait_for_free_in_flight_slot())
and specifically waiting for all operations touching a certain range of
the virtual disk to complete (mirror_wait_on_conflicts()).
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-5-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:12 +0000 (20:18 +0200)]
block/mirror: Use CoQueue to wait on in-flight ops
Attach a CoQueue to each in-flight operation so if we need to wait for
any we can use it to wait instead of just blindly yielding and hoping
for some operation to wake us.
A later patch will use this infrastructure to allow requests accessing
the same area of the virtual disk to specifically wait for each other.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:11 +0000 (20:18 +0200)]
block/mirror: Convert to coroutines
In order to talk to the source BDS (and maybe in the future to the
target BDS as well) directly, we need to convert our existing AIO
requests into coroutine I/O requests.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 13 Jun 2018 18:18:10 +0000 (20:18 +0200)]
block/mirror: Pull out mirror_perform()
When converting mirror's I/O to coroutines, we are going to need a point
where these coroutines are created. mirror_perform() is going to be
that point.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:11 +0000 (11:55 +0800)]
monitor: add lock to protect mon_fdsets
Introduce a new global big lock for mon_fdsets. Take it where needed.
The monitor_fdset_get_fd() handling is a bit tricky: now we need to call
qemu_mutex_unlock() which might pollute errno, so we need to make sure
the correct errno be passed up to the callers. To make things simpler,
we let monitor_fdset_get_fd() return the -errno directly when error
happens, then in qemu_open() we move it back into errno.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-8-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:10 +0000 (11:55 +0800)]
monitor: move init global earlier
Before this patch, monitor fd helpers might be called even earlier than
monitor_init_globals(). This can be problematic.
After previous work, now monitor_init_globals() does not depend on
accelerator initialization any more. Call it earlier (before CLI
parsing; that's where the monitor APIs might be called) to make sure it
is called before any of the monitor APIs.
Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-7-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:09 +0000 (11:55 +0800)]
monitor: remove event_clock_type
Instead, use a dynamic function to detect which clock we'll use. The
problem is that the old code will let monitor initialization depend on
configure_accelerator() (that's where qtest_enabled() start to take
effect). After this change, we don't have such a dependency any more.
We just need to make sure configure_accelerator() is called when we
start to use it. Now it's only used in monitor_qapi_event_queue() and
monitor_qapi_event_handler(), so we're good.
Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-6-peterx@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
[monitor_get_event_clock() name and comment tweaked] Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:08 +0000 (11:55 +0800)]
monitor: fix comment for monitor_lock
Fix typo in d622cb5879c. Meanwhile move these variables close to each
other. monitor_qapi_event_state can be declared static, add that.
Reported-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-5-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:07 +0000 (11:55 +0800)]
monitor: more comments on lock-free elements
Add some explicit comments for both Readline and cpu_set/cpu_get helpers
that they do not need the mon_lock protection.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-4-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:06 +0000 (11:55 +0800)]
monitor: protect mon->fds with mon_lock
mon->fds were protected by BQL. Now protect it by mon_lock so that it
can even be used in monitor iothread.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-3-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Xu [Fri, 8 Jun 2018 03:55:05 +0000 (11:55 +0800)]
monitor: rename out_lock to mon_lock
The out_lock is protecting a few Monitor fields. In the future the
monitor code will start to run in multiple threads. We are going to
turn it into a bigger lock to protect not only the out buffer but also
most of the rest.
Since at it, rearrange the Monitor struct a bit.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-2-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Thomas Huth [Thu, 14 Jun 2018 08:38:22 +0000 (10:38 +0200)]
pc-bios/s390-ccw: Optimize the s390-netboot.img for size
The -O2 optimization flag is passed via CFLAGS to the firmware Makefile,
but in netbook.mak, we've got some rules that only use QEMU_CFLAGS for
compiling the libc and libnet from SLOF, so these files get compiled
without optimization so far. Use CFLAGS here, too, to create faster
and smaller code.
We can additionally save some more bytes in the firmware images by compi-
ling the code with -fno-asynchronous-unwind-tables. This will omit some
ELF sections (used for stack unwinding for example) from the image that
we do not need in the firmware.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Tue, 22 May 2018 09:53:51 +0000 (11:53 +0200)]
pc-bios/s390-ccw/net: Try to load pxelinux.cfg file accoring to the UUID
With the STSI instruction, we can get the UUID of the current VM instance,
so we can support loading pxelinux config files via UUID in the file name,
too.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Tue, 22 May 2018 09:37:29 +0000 (11:37 +0200)]
pc-bios/s390-ccw/net: Add support for pxelinux-style config files
Since it is quite cumbersome to manually create a combined kernel with
initrd image for network booting, we now support loading via pxelinux
configuration files, too. In these files, the kernel, initrd and command
line parameters can be specified seperately, and the firmware then takes
care of glueing everything together in memory after the files have been
downloaded. See this URL for details about the config file layout:
https://www.syslinux.org/wiki/index.php?title=PXELINUX
The user can either specify a config file directly as bootfile via DHCP
(but in this case, the file has to start either with "default" or a "#"
comment so we can distinguish it from binary kernels), or a folder (i.e.
the bootfile name must end with "/") where the firmware should look for
the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Fri, 18 May 2018 09:31:27 +0000 (11:31 +0200)]
pc-bios/s390-ccw/net: Update code for the latest changes in SLOF
The ip_version information now has to be stored in the filename_ip_t
structure, and there is now a common function called tftp_get_error_info()
which can be used to get the error string for a TFTP error code.
We can also get rid of some superfluous "(char *)" casts now.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
The problem is that we should avoid making block driver graph
changes while we have in-flight requests. Let's drain all I/O
for this BB before calling bdrv_root_unref_child().
Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Wed, 28 Mar 2018 16:29:18 +0000 (18:29 +0200)]
block: Allow graph changes in bdrv_drain_all_begin/end sections
bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and
did a subtree drain for each of them. This works fine as long as the
graph is static, but sadly, reality looks different.
If the graph changes so that root nodes are added or removed, we would
have to compensate for this. bdrv_next() returns each root node only
once even if it's the root node for multiple BlockBackends or for a
monitor-owned block driver tree, which would only complicate things.
The much easier and more obviously correct way is to fundamentally
change the way the functions work: Iterate over all BlockDriverStates,
no matter who owns them, and drain them individually. Compensation is
only necessary when a new BDS is created inside a drain_all section.
Removal of a BDS doesn't require any action because it's gone afterwards
anyway.
Kevin Wolf [Tue, 29 May 2018 15:17:45 +0000 (17:17 +0200)]
block: ignore_bds_parents parameter for drain functions
In the future, bdrv_drained_all_begin/end() will drain all invidiual
nodes separately rather than whole subtrees. This means that we don't
want to propagate the drain to all parents any more: If the parent is a
BDS, it will already be drained separately. Recursing to all parents is
unnecessary work and would make it an O(n²) operation.
Prepare the drain function for the changed drain_all by adding an
ignore_bds_parents parameter to the internal implementation that
prevents the propagation of the drain to BDS parents. We still (have to)
propagate it to non-BDS parents like BlockBackends or Jobs because those
are not drained separately.
Kevin Wolf [Tue, 10 Apr 2018 14:07:55 +0000 (16:07 +0200)]
block: Move bdrv_drain_all_begin() out of coroutine context
Before we can introduce a single polling loop for all nodes in
bdrv_drain_all_begin(), we must make sure to run it outside of coroutine
context like we already do for bdrv_do_drained_begin().
Kevin Wolf [Tue, 10 Apr 2018 13:51:52 +0000 (15:51 +0200)]
block: Allow AIO_WAIT_WHILE with NULL ctx
bdrv_drain_all() wants to have a single polling loop for draining the
in-flight requests of all nodes. This means that the AIO_WAIT_WHILE()
condition relies on activity in multiple AioContexts, which is polled
from the mainloop context. We must therefore call AIO_WAIT_WHILE() from
the mainloop thread and use the AioWait notification mechanism.
Just randomly picking the AioContext of any non-mainloop thread would
work, but instead of bothering to find such a context in the caller, we
can just as well accept NULL for ctx.
Kevin Wolf [Fri, 23 Mar 2018 19:29:24 +0000 (20:29 +0100)]
block: Defer .bdrv_drain_begin callback to polling phase
We cannot allow aio_poll() in bdrv_drain_invoke(begin=true) until we're
done with propagating the drain through the graph and are doing the
single final BDRV_POLL_WHILE().
Just schedule the coroutine with the callback and increase bs->in_flight
to make sure that the polling phase will wait for it.
Kevin Wolf [Fri, 23 Mar 2018 14:57:20 +0000 (15:57 +0100)]
block: Don't poll in parent drain callbacks
bdrv_do_drained_begin() is only safe if we have a single
BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow
that parent callbacks introduce a nested polling loop that could cause
graph changes while we're traversing the graph.
Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single
node without waiting for its requests to complete. These requests will
be waited for in the BDRV_POLL_WHILE() call down the call chain.
Kevin Wolf [Fri, 23 Mar 2018 11:40:21 +0000 (12:40 +0100)]
test-bdrv-drain: Test node deletion in subtree recursion
If bdrv_do_drained_begin() polls during its subtree recursion, the graph
can change and mess up the bs->children iteration. Test that this
doesn't happen.
Kevin Wolf [Fri, 23 Mar 2018 11:40:41 +0000 (12:40 +0100)]
block: Drain recursively with a single BDRV_POLL_WHILE()
Anything can happen inside BDRV_POLL_WHILE(), including graph
changes that may interfere with its callers (e.g. child list iteration
in recursive callers of bdrv_do_drained_begin).
Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the
end of bdrv_do_drained_begin() to avoid such effects. The recursion
happens now inside the loop condition. As the graph can only change
between bdrv_drain_poll() calls, but not inside of it, doing the
recursion here is safe.
Max Reitz [Wed, 28 Feb 2018 18:04:54 +0000 (19:04 +0100)]
test-bdrv-drain: Add test for node deletion
This patch adds two bdrv-drain tests for what happens if some BDS goes
away during the drainage.
The basic idea is that you have a parent BDS with some child nodes.
Then, you drain one of the children. Because of that, the party who
actually owns the parent decides to (A) delete it, or (B) detach all its
children from it -- both while the child is still being drained.
A real-world case where this can happen is the mirror block job, which
may exit if you drain one of its children.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Thu, 22 Mar 2018 13:35:58 +0000 (14:35 +0100)]
block: Remove bdrv_drain_recurse()
For bdrv_drain(), recursively waiting for child node requests is
pointless because we didn't quiesce their parents, so new requests could
come in anyway. Letting the function work only on a single node makes it
more consistent.
For subtree drains and drain_all, we already have the recursion in
bdrv_do_drained_begin(), so the extra recursion doesn't add anything
either.
Remove the useless code.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Thu, 22 Mar 2018 13:11:20 +0000 (14:11 +0100)]
block: Really pause block jobs on drain
We already requested that block jobs be paused in .bdrv_drained_begin,
but no guarantee was made that the job was actually inactive at the
point where bdrv_drained_begin() returned.
This introduces a new callback BdrvChildRole.bdrv_drained_poll() and
uses it to make bdrv_drain_poll() consider block jobs using the node to
be drained.
For the test case to work as expected, we have to switch from
block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even
considered active and must be waited for when draining the node.
Kevin Wolf [Thu, 22 Mar 2018 09:57:14 +0000 (10:57 +0100)]
block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()
Commit 91af091f923 added an additional aio_poll() to BDRV_POLL_WHILE()
in order to make sure that all pending BHs are executed on drain. This
was the wrong place to make the fix, as it is useless overhead for all
other users of the macro and unnecessarily complicates the mechanism.
This patch effectively reverts said commit (the context has changed a
bit and the code has moved to AIO_WAIT_WHILE()) and instead polls in the
loop condition for drain.
The effect is probably hard to measure in any real-world use case
because actual I/O will dominate, but if I run only the initialisation
part of 'qemu-img convert' where it calls bdrv_block_status() for the
whole image to find out how much data there is copy, this phase actually
needs only roughly half the time after this patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Wed, 4 Apr 2018 11:26:16 +0000 (13:26 +0200)]
tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now
Since we use bdrv_do_drained_begin/end() for bdrv_drain_all_begin/end(),
coroutine context is automatically left with a BH, preventing the
deadlocks that made bdrv_drain_all*() unsafe in coroutine context. Now
that we even removed the old polling code as dead code, it's obvious
that it's compatible now.
Enable the coroutine test cases for bdrv_drain_all().
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Thu, 14 Dec 2017 10:25:16 +0000 (11:25 +0100)]
block: Don't manually poll in bdrv_drain_all()
All involved nodes are already idle, we called bdrv_do_drain_begin() on
them.
The comment in the code suggested that this was not correct because the
completion of a request on one node could spawn a new request on a
different node (which might have been drained before, so we wouldn't
drain the new request). In reality, new requests to different nodes
aren't spawned out of nothing, but only in the context of a parent
request, and they aren't submitted to random nodes, but only to child
nodes. As long as we still poll for the completion of the parent request
(which we do), draining each root node separately is good enough.
Remove the additional polling code from bdrv_drain_all_begin() and
replace it with an assertion that all nodes are already idle after we
drained them separately.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Thu, 14 Dec 2017 09:27:23 +0000 (10:27 +0100)]
block: Use bdrv_do_drain_begin/end in bdrv_drain_all()
bdrv_do_drain_begin/end() implement already everything that
bdrv_drain_all_begin/end() need and currently still do manually: Disable
external events, call parent drain callbacks, call block driver
callbacks.
It also does two more things:
The first is incrementing bs->quiesce_counter. bdrv_drain_all() already
stood out in the test case by behaving different from the other drain
variants. Adding this is not only safe, but in fact a bug fix.
The second is calling bdrv_drain_recurse(). We already do that later in
the same function in a loop, so basically doing an early first iteration
doesn't hurt.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Sat, 9 Dec 2017 23:11:13 +0000 (00:11 +0100)]
test-bdrv-drain: bdrv_drain() works with cross-AioContext events
As long as nobody keeps the other I/O thread from working, there is no
reason why bdrv_drain() wouldn't work with cross-AioContext events. The
key is that the root request we're waiting for is in the AioContext
we're polling (which it always is for bdrv_drain()) so that aio_poll()
is woken up in the end.
Add a test case that shows that it works. Remove the comment in
bdrv_drain() that claims otherwise.
liujunjie [Thu, 7 Jun 2018 08:02:37 +0000 (16:02 +0800)]
ps2: check PS2Queue wptr pointer in post_load routine
In commit 802cbcb7300, most issues have been fixed when qemu guest
migration. But the queue size still need to check whether is equal to
PS2_QUEUE_SIZE. If yes, the wptr should set as 0. Or, wptr would larger
than PS2_QUEUE_SIZE and never come back when ps2_queue_noirq is called.
This could lead to OOB access, add check to avoid it.
Gerd Hoffmann [Wed, 13 Jun 2018 12:29:45 +0000 (14:29 +0200)]
hw/display: add ramfb, a simple boot framebuffer living in guest ram
The boot framebuffer is expected to be configured by the firmware, so it
uses fw_cfg as interface. Initialization goes as follows:
(1) Check whenever etc/ramfb is present.
(2) Allocate framebuffer from RAM.
(3) Fill struct RAMFBCfg, write it to etc/ramfb.
Done. You can write stuff to the framebuffer now, and it should appear
automagically on the screen.
Note that this isn't very efficient because it does a full display
update on each refresh. No dirty tracking. Dirty tracking would have
to be active for the whole ram slot, so that wouldn't be very efficient
either. For a boot display which is active for a short time only this
isn't a big deal. As permanent guest display something better should be
used (if possible).
This is the ramfb core code. Some windup is needed for display devices
which want have a ramfb boot display.
s390x/ipl: Try to detect Linux vs non Linux for initial IPL PSW
Right now the IPL device always starts from address 0x10000 (the usual
Linux entry point). To run other guests (e.g. test programs) it is
useful to use the IPL PSW from address 0. We can use the Linux magic
at 0x10008 to decide.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180612125933.262679-1-borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Halil Pasic [Thu, 24 May 2018 17:58:28 +0000 (19:58 +0200)]
vfio-ccw: remove orb.c64 (64 bit data addresses) check
The vfio-ccw module does the check too, and there is actually no
technical obstacle for supporting fmt 1 idaws. Let us be ready for the
beautiful day when fmt 1 idaws become supported by the vfio-ccw kernel
module. QEMU does not have to do a thing for that, except not insisting
on this check.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Jason J. Herne <jjherne@linux.ibm.com> Tested-by: Jason J. Herne <jjherne@linux.ibm.com>
Message-Id: <20180524175828.3143-3-pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Halil Pasic [Thu, 24 May 2018 17:58:27 +0000 (19:58 +0200)]
vfio-ccw: add force unlimited prefetch property
There is at least one guest (OS) such that although it does not rely on
the guarantees provided by ORB 1 word 9 bit (aka unlimited prefetch, aka
P bit) not being set, it fails to tell this to the machine.
Usually this ain't a big deal, as the original purpose of the P bit is to
allow for performance optimizations. vfio-ccw however can not provide the
guarantees required if the bit is not set.
It is not possible to implement support for the P bit not set without
transitioning to lower level protocols for vfio-ccw. So let's give the
user the opportunity to force setting the P bit, if the user knows this
is safe. For self modifying channel programs forcing the P bit is not
safe. If the P bit is forced for a self modifying channel program things
are expected to break in strange ways.
Let's also avoid warning multiple about P bit not set in the ORB in case
P bit is not told to be forced, and designate the affected vfio-ccw
device.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Suggested-by: Dong Jia Shi <bjsdjshi@linux.ibm.com> Acked-by: Jason J. Herne <jjherne@linux.ibm.com> Tested-by: Jason J. Herne <jjherne@linux.ibm.com>
Message-Id: <20180524175828.3143-2-pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Revert "bus: do not unref the added child bus on realize"
This is wrong. object_finalize_child_property()'s unref balances the
ref in object_property_add_child(). qbus_realize's unref balances the
ref that was initially placed by object_new/object_initialize.
Greg Kurz [Fri, 15 Jun 2018 16:58:00 +0000 (18:58 +0200)]
spapr: fix xics_system_init() error path
Commit 3d85885a1b1f3 tried to fix error handling, but it actually
went into the wrong direction by dropping the local Error *.
In the default KVM case, the rationale is to try the in-kernel XICS first,
and if not possible, to fallback to userland XICS. Passing errp everywhere
makes this fallback impossible if errp is &error_fatal (which happens to
be the case). And anyway, if the caller would pass a regular &local_err,
things would be worse: we could possibly pass an already set *errp to
error_setg() and crash, or return an error even in case of success.
So we definitely need a local Error * and only propagate it when we're
done with the fallback logic. This is what this patch does.
Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Mark Cave-Ayland [Fri, 15 Jun 2018 07:33:03 +0000 (08:33 +0100)]
SPARC64: add icount support
This patch adds gen_io_start()/gen_io_end() to various instructions as required
in order to boot my OpenBIOS test images on qemu-system-sparc64 with icount
enabled.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fix the issues by converting the instance_init functions into realize()
functions instead, which are allowed to fail (and not called during
device introspection).
Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>