Kevin O'Connor [Mon, 1 Apr 2024 01:58:12 +0000 (21:58 -0400)]
stdvga: Rename CGA palette functions
Rename stdvga_set_border_color() to stdvga_set_cga_background_color()
and stdvga_set_palette() to stdvga_set_cga_palette(). These functions
implement compatibility for old CGA cards - rename them so they are
not confused with the functions that manipulate the VGA palette.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Kevin O'Connor [Fri, 15 Mar 2024 14:58:57 +0000 (10:58 -0400)]
vgasrc: Use curmode_g instead of vmode_g when mode is the current video mode
Many functions are passed a pointer to the current video mode
vgamode_s struct. Use the name 'curmode_g' for these functions and
use 'vmode_g' for functions that can accept an arbitrary video mode.
Hopefully this will make the goals of the functions more clear.
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Daniel Verkamp [Tue, 12 Mar 2024 03:26:18 +0000 (20:26 -0700)]
vbe: implement function 09h (get/set palette data)
Since the VBE mode attributes indicate that all modes are not VGA
compatible, applications must use VBE function 09h to manipulate the
palette rather than directly accessing the VGA registers.
This implementation uses the standard VGA registers for all hardware,
which may not be appropriate; I only verified qemu -device VGA.
Without this patch, the get/set palette function returns an error code,
so programs that use 8-bit indexed color modes fail. For example, Quake
(DOS) printed "Error: Unable to load VESA palette" and exited when
trying to set a SVGA mode like 640x480, but with the patch it succeeds.
This fixes qemu issue #251 and #1862.
Daniel Verkamp [Tue, 12 Mar 2024 00:56:40 +0000 (17:56 -0700)]
vgasrc: round up save/restore size
When calculating the size of the buffer required for the VGA/VBE state,
round up rather than truncating when dividing the number of bytes to get
the number of 64-byte blocks. Without this modification, the save state
function will write past the end of a buffer of the size requested.
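A minimal sketch of the rounding described above, assuming the size is reported in 64-byte blocks. The helper names are hypothetical and are not taken from the vgasrc code.

```c
#include <stdint.h>

/* Hypothetical helper: number of 64-byte blocks needed to hold `bytes`.
 * Adding 63 before dividing rounds the result up. */
uint16_t state_size_blocks(uint32_t bytes)
{
    return (uint16_t)((bytes + 63) / 64);
}

/* The truncating variant under-reports the size whenever `bytes`
 * is not a multiple of 64. */
uint16_t state_size_blocks_truncating(uint32_t bytes)
{
    return (uint16_t)(bytes / 64);
}
```

With a 65-byte state, the truncating variant asks the caller for one block (64 bytes) while the rounded variant asks for two.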
Daniel Verkamp [Thu, 7 Mar 2024 09:08:27 +0000 (01:08 -0800)]
vbe: Add VBE 2.0+ OemData field to struct vbe_info
Per the VBE 2.0 specification, the VBE controller information is 512
bytes long when the "VBE2" signature is provided, instead of the
original 256 bytes.
src/bootsplash.c uses the original pre-VBE-2.0 256-byte structure while
also filling in the "VBE2" signature, so a video BIOS that makes use of
the VBE2 OemData area could write past the end of the allocated region.
The original bootsplash code did not have this bug; it was introduced
when the bootsplash VBE structures were merged with the VGA ROM struct
definitions.
Fixes: 69e941c159ed ("Merge bootsplash and VGA ROM vbe structure definitions")
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
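For illustration, a sketch of a 512-byte VBE 2.0 controller information block. The struct and field names are assumptions and are not the SeaBIOS definitions; far pointers are shown as plain 32-bit values, and the packed attribute is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only; names are hypothetical. */
struct vbe_info_sketch {
    uint32_t signature;          /* "VBE2" preset by the caller, "VESA" on return */
    uint16_t version;
    uint32_t oem_string;         /* far pointer (segment:offset) */
    uint32_t capabilities;
    uint32_t video_mode;         /* far pointer to the mode list */
    uint16_t total_memory;       /* in 64 KiB units */
    /* Fields added by VBE 2.0 */
    uint16_t oem_revision;
    uint32_t oem_vendor_string;
    uint32_t oem_product_string;
    uint32_t oem_product_rev;
    uint8_t  reserved[222];      /* pads the first half to 256 bytes */
    uint8_t  oem_data[256];      /* area a VBE2-aware video BIOS may write to */
} __attribute__((packed));

/* A buffer sized for the pre-2.0 structure ends where oem_data begins. */
_Static_assert(offsetof(struct vbe_info_sketch, oem_data) == 256,
               "oem_data starts after the original 256 bytes");
_Static_assert(sizeof(struct vbe_info_sketch) == 512,
               "VBE 2.0 info block is 512 bytes");
```

The static assertions make the overflow visible: a caller that allocates only 256 bytes but presets "VBE2" leaves the whole oem_data area outside its buffer.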
Igor Mammedov [Fri, 23 Feb 2024 15:05:22 +0000 (16:05 +0100)]
fix smbios blob length overflow
When the tables are larger than 64K, the size of the copied tables is
truncated by a cast from u32 to u16, and as a result only
a small portion of the tables is copied.
That leads to corrupted tables (one part comes from QEMU and the
remainder is whatever was in the memory block allocated for
the tables).
Fix it by making qtables_len a 32-bit int.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
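The truncation can be reproduced in isolation. This is a sketch with hypothetical function names; only the u32 to u16 narrowing comes from the message above.

```c
#include <stdint.h>

/* Length that actually gets copied when the table length is narrowed
 * to 16 bits: only the low 16 bits survive. */
uint32_t copied_len_u16(uint32_t tables_len)
{
    uint16_t len = (uint16_t)tables_len;
    return len;
}

/* With a 32-bit length variable the full size is preserved. */
uint32_t copied_len_u32(uint32_t tables_len)
{
    uint32_t len = tables_len;
    return len;
}
```

For a table blob of 0x10020 bytes, the 16-bit variant copies only 0x20 bytes.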
Max Tottenham [Thu, 25 Jan 2024 15:00:50 +0000 (10:00 -0500)]
Add LBA 64bit support for reads beyond 2TB.
When booting from a >2TB drive/filesystem, it's possible that the
kernel/bootloader may be updated and written out at an LBA address
beyond what is normally accessible by the READ(10) SCSI command.
If this happens to the kernel, grub will fail to boot it,
as it will call into the BIOS with an LBA address >2TB, and the
BIOS will return an error. Per the SCSI spec, >2TB drives should
return 0xFFFFFFFF in response to READ CAPACITY(10), and a READ CAPACITY(16)
command should then be issued to determine the full size of the drive.
READ(16) commands can then be used to read data at LBA addresses
beyond 2TB (64-bit LBA addresses).
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Message-ID: <20240125150050.3775834-2-mtottenh@akamai.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
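A sketch of how a read command could be chosen by LBA, assuming the standard READ(10) and READ(16) CDB layouts. build_read_cdb() is a hypothetical helper, not the SeaBIOS function, and the 2TB limit assumes 512-byte sectors (2^32 sectors).

```c
#include <stdint.h>
#include <string.h>

#define SCSI_READ_10 0x28
#define SCSI_READ_16 0x88

/* Hypothetical helper: fill a CDB for reading `count` blocks at `lba`.
 * Returns the CDB length (10 or 16). READ(10) only has a 32-bit LBA
 * field, so larger addresses need READ(16). */
int build_read_cdb(uint8_t cdb[16], uint64_t lba, uint16_t count)
{
    memset(cdb, 0, 16);
    if (lba <= 0xFFFFFFFFULL) {
        cdb[0] = SCSI_READ_10;
        cdb[2] = (uint8_t)(lba >> 24);      /* 32-bit big-endian LBA */
        cdb[3] = (uint8_t)(lba >> 16);
        cdb[4] = (uint8_t)(lba >> 8);
        cdb[5] = (uint8_t)lba;
        cdb[7] = (uint8_t)(count >> 8);     /* 16-bit transfer length */
        cdb[8] = (uint8_t)count;
        return 10;
    }
    cdb[0] = SCSI_READ_16;
    for (int i = 0; i < 8; i++)             /* 64-bit big-endian LBA */
        cdb[2 + i] = (uint8_t)(lba >> (56 - 8 * i));
    cdb[12] = (uint8_t)(count >> 8);        /* low half of 32-bit length */
    cdb[13] = (uint8_t)count;
    return 16;
}
```

A real driver would more likely pick the command once per drive, based on whether READ CAPACITY(10) returned 0xFFFFFFFF; choosing per request is only a simplification for this sketch.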
esp-scsi: terminate DMA transfer when ESP data transfer completes
When the ESP data transfer completes indicated by the STAT_TC flag being set,
terminate the DMA transfer by issuing a DMA IDLE command. Otherwise in the case
where the guest sends a reset followed by an ESP command, the DMA signal remains
enabled and so the next SeaBIOS DMA transfer begins immediately when the next
ESP command is received rather than waiting until the data is ready and the DMA
command is issued.
With this fix it is possible to boot a Windows XP ISO to the installer and
complete a full installation within QEMU directly using SeaBIOS.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240101121942.383191-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Current seabios code will only enable and use the 64bit pci io window in
case it runs out of space in the 32bit pci mmio window below 4G.
This patch will also enable the 64bit pci io window when
(a) RAM above 4G is present, and
(b) the physical address space size is known, and
(c) seabios is running on a 64bit capable processor.
This operates with the assumption that guests which are ok with memory
above 4G most likely can handle mmio above 4G too.
In case the 64bit pci io window is enabled also assign more memory to
prefetchable pci bridge windows and the complete 64bit pci io window.
The total mmio window size is 1/8 of the physical address space.
Minimum bridge windows size is 1/256 of the total mmio window size.
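The two ratios above work out as follows. This is a sketch with hypothetical helper names, assuming phys_bits is below 64.

```c
#include <stdint.h>

/* Total 64-bit mmio window: 1/8 of the physical address space. */
uint64_t pci_mmio64_window_size(unsigned phys_bits)
{
    return (1ULL << phys_bits) / 8;
}

/* Minimum bridge window: 1/256 of the total mmio window. */
uint64_t pci_min_bridge_window(unsigned phys_bits)
{
    return pci_mmio64_window_size(phys_bits) / 256;
}
```

With 40 physical address bits (1 TiB) this gives a 128 GiB mmio window and a 512 MiB minimum bridge window.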
Gerd Hoffmann [Wed, 31 Aug 2022 06:27:33 +0000 (08:27 +0200)]
detect physical address space size
Check for pae and long mode using cpuid. If present also read the
physical address bits. Apply some qemu sanity checks (see below).
Record results in PhysBits and LongMode variables. In case we are not
sure what the address space size is leave the PhysBits variable unset.
On qemu we have the problem that for historical reasons x86_64
processors advertise 40 physical address space bits by default, even in
case the host supports less than that so actually using the whole
address space will not work.
Because of that the code applies some extra sanity checks in case we
find 40 (or less) physical address space bits advertised. Only
known-good values (which is 40 for amd processors and 36+39 for intel
processors) will be accepted as valid.
Recommendation is to use 'qemu -cpu ${name},host-phys-bits=on' to
advertise valid physical address space bits to the guest. Some distro
builds enable this by default, and most likely the qemu default will
change in near future too.
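One possible reading of the sanity check, as a sketch. The enum and function name are hypothetical; the rule that values above 40 are trusted and that other vendors are rejected at 40 or below is an interpretation of the text above, not the actual code.

```c
#include <stdbool.h>

enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Returns true if the advertised physical address bits look trustworthy. */
bool physbits_valid(enum cpu_vendor vendor, unsigned physbits)
{
    if (physbits > 40)
        return true;    /* above the historical qemu default: no extra check */
    switch (vendor) {
    case VENDOR_AMD:
        return physbits == 40;
    case VENDOR_INTEL:
        return physbits == 36 || physbits == 39;
    default:
        return false;   /* assumption: unknown vendor, value not known-good */
    }
}
```

When the check fails, the message above says PhysBits is simply left unset.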
In case kvm emulates features of another hypervisor (for example hyperv)
two VMM CPUID blocks will be present, one for the emulated hypervisor
and one for kvm itself.
This patch makes seabios loop over the VMM CPUID blocks to make sure it
will properly detect kvm when multiple blocks are present.
esp-scsi: handle non-DMA SCSI commands with no data phase
The existing esp-scsi state machine checks for the STAT_TC bit to exit state 1
but in the case where there is no data phase, a non-DMA command is executed
which doesn't set STAT_TC. This only works because QEMU currently always sets
STAT_TC just after issuing every SCSI command.
Update the esp-scsi state machine so that in the case where there is no data
phase, we immediately execute CMD_ICCS instead of waiting for STAT_TC to be
set which will never happen with a non-DMA CMD_SELATN command.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20230807065300.366070-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
esp-scsi: check for INTR_BS/INTR_FC instead of STAT_TC for command completion
The ESP SELATN command used to send SCSI commands from the ESP to the SCSI bus
is not a DMA command and therefore does not affect the STAT_TC bit. The only
reason this works at all is due to a bug in QEMU which (currently) always
updates the STAT_TC bit in ESP_RSTAT regardless of the state of the ESP_CMD_DMA
bit.
According to the NCR datasheet [1] the INTR_BS/INTR_FC bits are set when the
SELATN command has completed, so update the existing logic to check for these
bits in ESP_RINTR instead. Note that the read of ESP_RINTR needs to be
restricted to state == 0 as reading ESP_RINTR resets the ESP_RSTAT register
which breaks the STAT_TC check when state == 1.
This commit also includes an extra read of ESP_RINTR to clear all the interrupt
bits before submitting the SELATN command to ensure that we don't accidentally
immediately progress to the data phase handling logic where ESP_RINTR bits have
already been set by a previous ESP command.
The ESP FIFO is used as a buffer for DMA requests and so isn't guaranteed to
be empty in the case of SCSI errors or a mixed DMA/non-DMA request. Flush the
FIFO before sending a SCSI command to guarantee that it is correctly
positioned at the start of the FIFO.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230807065300.366070-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
For platforms with a high number of NUMA nodes, 32 e820 entries are not
enough. The Linux kernel sets the maximum number of e820 entries to a base
value of 128. Set BUILD_MAX_E820 to 128 to stay in sync with this base value.
Signed-off-by: Tony Titus <tonydt@amazon.com>
Message-ID: <20230728044148.58041-1-tonydt@amazon.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
According to AHCI 1.3.1, 5.3.8.1 RegFIS:Entry, if ERR_STAT is set in the
received FIS, the HBA shall jump to state ERR:FatalTaskfile, which will
raise a TFES IRQ.
This means that if ERR_STAT is set in the received FIS, PxIS.TFES will
be set, without either PxIS.DHRS or PxIS.PSS being set.
SeaBIOS function ahci_port_setup() will try to identify an AHCI device
by sending an ATAPI identify device command. However, such a command
will be aborted with ERR_STAT set for a regular (non-ATAPI) device.
ahci_command() already performs the correct error recovery steps when
status is correctly set, so simply modify ahci_command() to read the
correct status when PxIS.TFES is set.
It is safe to read PxTFD when PxIS.TFES is set, even for systems with a
port multiplier, see AHCI 1.3.1, 9.3.7 PxTFD Register Information:
"When a taskfile error occurs (PxIS.TFES is set to '1'), the host may
refer to the values in PxTFD. The values in PxTFD at this time are
guaranteed to correspond to the device that reported the taskfile error
condition."
Without this, each boot will be delayed by 32 seconds, waiting for the
AHCI command to time out.
virtio-blk: Fix integer overflow for large max IO sizes
When the maximum IO size supported by the virtio-blk backend is large
enough (>= 32MiB for 512B sectors), the computed blk_num_max will
overflow. In particular, if it's a multiple of 32MiB, blk_num_max
will end up as zero, causing IO requests to fail.
This is triggered by e.g. the SPDK virtio-blk vhost-user backend.
To fix it, just limit blk_num_max to 65535 before converting to u16.
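A sketch of the overflow and the clamp. The function names and parameters are hypothetical; only the limit of 65535 and the u16 conversion come from the message above.

```c
#include <stdint.h>

/* Unclamped: 32 MiB / 512 B = 65536, which wraps to 0 in a u16. */
uint16_t blk_num_max_unclamped(uint32_t size_max, uint32_t blksize)
{
    return (uint16_t)(size_max / blksize);
}

/* Clamped: limit the block count to 65535 before narrowing. */
uint16_t blk_num_max_clamped(uint32_t size_max, uint32_t blksize)
{
    uint32_t n = size_max / blksize;
    if (n > 65535)
        n = 65535;
    return (uint16_t)n;
}
```

Small maximum IO sizes are unaffected by the clamp: 4096 bytes with 512-byte sectors still yields 8 blocks.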
José Martínez [Tue, 13 Jun 2023 15:01:34 +0000 (11:01 -0400)]
Fix high memory zone initialization in CSM mode
malloc_high() cannot allocate any memory in CSM mode due to an empty
ZoneHigh. SeaBIOS cannot find any disk to boot from because device
initialization fails.
The bug was introduced in 1.16.1 (commit dc88f9b) when the meaning of
BUILD_MAX_HIGHTABLE changed but CSM code was not updated. This patch
reverts to the previous behavior by using BUILD_MIN_HIGHTABLE in CSM
methods.
David Woodhouse [Fri, 20 Jan 2023 11:33:19 +0000 (11:33 +0000)]
xen: require Xen info structure at 0x1000 to detect Xen
When running under Xen, hvmloader places a table at 0x1000 with the e820
information and BIOS tables. If this isn't present, SeaBIOS will
currently panic.
We now have support for running Xen guests natively in QEMU/KVM, which
boots SeaBIOS directly instead of via hvmloader, and does not provide
the same structure.
As it happens, this doesn't matter on first boot, because although we
set PlatformRunningOn to PF_QEMU|PF_XEN, reading it back again still
gives zero. Presumably because in true Xen, this is all already RAM. But
in QEMU with a faithfully-emulated PAM config in the host bridge, it's
still in ROM mode at this point so we don't see what we've just written.
On reboot, however, the region *is* set to RAM mode and we do see the
updated value of PlatformRunningOn, do manage to remember that we've
detected Xen in CPUID, and hit the panic.
It's not trivial to detect QEMU vs. real Xen at the time xen_preinit()
runs, because it's so early. We can't even make a XENVER_extraversion
hypercall to look for hints, because we haven't set up the hypercall
page (and don't have an allocator to give us a page in which to do so).
So just make Xen detection contingent on the info structure being
present. If it wasn't, we were going to panic anyway. That leaves us
taking the standard QEMU init path for Xen guests in native QEMU,
which is just fine.
Untested on actual Xen but ObviouslyCorrect™.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Qi Zhou [Mon, 14 Nov 2022 12:55:44 +0000 (20:55 +0800)]
usb: fix wrong init of keyboard/mouse's if first interface is not boot protocol
There are always some endpoint descriptors after each interface descriptor. We
should only decrement num_iface if the descriptor type is USB_DT_INTERFACE, see
https://www.beyondlogic.org/usbnutshell/usb5.shtml#ConfigurationDescriptors
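A sketch of the descriptor walk, assuming the standard USB descriptor layout (bLength at offset 0, bDescriptorType at offset 1, bInterfaceSubClass at offset 6 of an interface descriptor). find_boot_interface() is a hypothetical helper, not the SeaBIOS code.

```c
#include <stddef.h>
#include <stdint.h>

#define USB_DT_INTERFACE 0x04
#define USB_INTERFACE_SUBCLASS_BOOT 1

/* Returns the byte offset of the first interface descriptor with the boot
 * subclass, looking at no more than num_iface interfaces, or -1 if none. */
int find_boot_interface(const uint8_t *buf, size_t len, int num_iface)
{
    size_t off = 0;
    while (num_iface > 0 && off + 2 <= len) {
        uint8_t blen = buf[off];
        uint8_t type = buf[off + 1];
        if (blen < 2 || off + blen > len)
            break;                          /* malformed descriptor */
        if (type == USB_DT_INTERFACE) {
            if (blen >= 9 && buf[off + 6] == USB_INTERFACE_SUBCLASS_BOOT)
                return (int)off;
            num_iface--;                    /* count interfaces only */
        }
        off += blen;                        /* skip endpoint, HID, etc. */
    }
    return -1;
}
```

If num_iface were decremented for every descriptor, the endpoint and class descriptors following a non-boot first interface would use up the budget before the second interface was reached.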
Xuan Zhuo [Mon, 14 Nov 2022 03:58:18 +0000 (11:58 +0800)]
virtio: finalize features before using device
Per the Virtio 1.0 standard, device initialization must first write
the negotiated feature subset back to the device before
using the device, for example to find vqs.
There are four places using vp_find_vq().
1. virtio-blk.pci: put the code of finalizing features in front of using device
2. virtio-blk.mmio: put the code of finalizing features in front of using device
3. virtio-scsi.pci: is ok
4. virtio-scsi.mmio: add the code of finalizing features before vp_find_vq()
Xuan Zhuo [Mon, 14 Nov 2022 03:58:17 +0000 (11:58 +0800)]
virtio-mmio: read/write the hi 32 features for mmio
Under mmio, when we read the feature from the device, we should read the
high 32-bit part. Similarly, when writing the feature back, we should
also write back the high 32-bit feature.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221114035818.109511-2-xuanzhuo@linux.alibaba.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Igor Mammedov [Fri, 18 Nov 2022 14:27:55 +0000 (15:27 +0100)]
acpi: parse Alias object
Since QEMU commit 47a373faa6 (acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus ennumeration generate AML)
SeaBIOS fails to parse ISA bridge AML with:
parse_termlist: parse error, skip from 92/517
...
ACPI: no PS/2 keyboard present
due to Alias term in DSDT which isn't handled by SeaBIOS properly.
Add dumb Alias parsing which just skips over term,
so the rest of AML could be parsed successfully.
Fixes: a05af290bac5 ("virtio-blk: split large IO according to size_max")
Acked-by: Andy Pei <andy.pei@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Xiaofei Lee <hbuxiaofei@gmail.com>
Gerd Hoffmann [Thu, 30 Jun 2022 15:28:40 +0000 (17:28 +0200)]
virtio-blk: use larger default request size
Bump default from 8 to 64 blocks. Using 8 by default leads
to requests being split on qemu, which slows down boot.
Some (temporary) debug logging added showed that almost all
requests on a standard fedora install are less than 64 blocks,
so that should bring us back to 1.15 performance levels.
Use the variable highram_size instead of the BUILD_MAX_HIGHTABLE #define
for the ZoneHigh size. Initialize the new variable with the old #define,
so behavior does not change.
This makes it easy to adjust the ZoneHigh size at runtime in a followup
patch.
After a reset of a QEMU -machine q35 guest, the PCI Express
Enhanced Configuration Mechanism is disabled and the variable
mmconfig no longer matches the configuration register PCIEXBAR
of the Q35 chipset. Until the variable mmconfig is reset to 0,
all pci_config_*() functions no longer work.
The variable mmconfig is located in one of the read-only C-F
segments. To reset it the pci_config_*() functions are needed,
but they do not work.
Replace all pci_config_*() calls with Standard PCI Configuration
Mechanism pci_ioconfig_*() calls until mmconfig is overwritten
with 0 by a fresh copy of the BIOS.
This fixes
In resume (status=0)
In 32bit resume
Attempting a hard reboot
Unable to unlock ram - bridge not found
Split out the Standard PCI Configuration Access Mechanism
pci_ioconfig_*() functions from the pci_config_*() functions.
The standard PCI CAM functions will be used in the next patch.
Florian Larysch [Sun, 23 Jan 2022 16:43:57 +0000 (17:43 +0100)]
nvme: fix LBA format data structure
The LBA Format Data structure is dword-sized, but struct nvme_lba_format
erroneously contains an additional member, misaligning all LBAF
descriptors after the first and causing them to be misinterpreted.
Remove it.
Signed-off-by: Florian Larysch <fl@n621.de>
Reviewed-by: Alexander Graf <graf@amazon.com>
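A sketch of a dword-sized LBA format descriptor, following the NVMe field order (metadata size, LBA data size, relative performance). The struct and helper names are hypothetical, and the count of 16 descriptors is an assumption for illustration.

```c
#include <stdint.h>

/* Illustrative layout; exactly 4 bytes per descriptor. */
struct nvme_lba_format_sketch {
    uint16_t ms;      /* metadata size in bytes */
    uint8_t  lbads;   /* LBA data size as a power of two */
    uint8_t  rp;      /* relative performance (low 2 bits), rest reserved */
};

/* Any extra member would shift every descriptor after the first one. */
_Static_assert(sizeof(struct nvme_lba_format_sketch) == 4,
               "LBA format descriptor must be one dword");
_Static_assert(sizeof(struct nvme_lba_format_sketch[16]) == 64,
               "16 descriptors must occupy 64 bytes");

/* Block size in bytes for a given format, e.g. lbads == 9 means 512. */
uint32_t nvme_block_size(const struct nvme_lba_format_sketch *fmt)
{
    return 1u << fmt->lbads;
}
```

A compile-time size check like this is one way to catch the kind of stray member this commit removes.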
nvme: avoid use-after-free in nvme_controller_enable()
Commit b68f313c9139 ("nvme: Record maximum allowed request size")
introduced a use of "identify" past it being passed to free(). Latch the
value of interest into a local variable.
Reported-by: Coverity (ID 1497613)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
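The pattern of latching a value before free() can be sketched as follows. The struct, the mdts field name and the function are assumptions for illustration, not the actual nvme_controller_enable() code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the identify data. */
struct identify_sketch {
    uint8_t mdts;
};

/* Copy the value of interest into a local before freeing the buffer,
 * so nothing reads from `identify` after free(). */
uint8_t read_mdts_then_free(struct identify_sketch *identify)
{
    uint8_t mdts = identify->mdts;   /* latch while the buffer is valid */
    free(identify);
    return mdts;                     /* no access to freed memory */
}
```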
Kevin O'Connor [Thu, 20 Jan 2022 00:07:47 +0000 (19:07 -0500)]
sercon: Fix missing GET_LOW() to access rx_bytes
The variable rx_bytes is marked VARLOW, but there was a missing
GET_LOW() to access rx_bytes. Fix by copying rx_bytes to a local
variable and avoid the repetitive segment memory accesses.
Reported-by: Gabe Black <gabe.black@gmail.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Kevin O'Connor [Wed, 19 Jan 2022 18:20:21 +0000 (13:20 -0500)]
nvme: Build the page list in the existing dma buffer
Commit 01f2736cc905d ("nvme: Pass large I/O requests as PRP lists")
introduced multi-page requests using the NVMe PRP mechanism. To store the
list and "first page to write to" hints, it added fields to the NVMe
namespace struct.
Unfortunately, that struct resides in fseg which is read-only at runtime.
While KVM ignores the read-only part and allows writes, real hardware and
TCG adhere to the semantics and ignore writes to the fseg region. The net
effect of that is that reads and writes were always happening on address 0,
unless they went through the bounce buffer logic.
This patch builds the PRP maintenance data in the existing "dma bounce
buffer" and only builds it when needed.
Fixes: 01f2736cc905d ("nvme: Pass large I/O requests as PRP lists")
Reported-by: Matt DeVillier <matt.devillier@gmail.com>
Signed-off-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
Gerd Hoffmann [Thu, 16 Dec 2021 07:20:58 +0000 (08:20 +0100)]
svgamodes: add standard 4k modes
Add all three 4k modes. Computer monitors typically use
the first one (3840x2160).
Add 16 and 32 bpp variants. 24bpp is dead these days, and
software which is so old that still uses those modes most
likely doesn't even know what 4k is.
Igor Mammedov [Mon, 29 Nov 2021 11:48:12 +0000 (06:48 -0500)]
pci: let firmware reserve IO for pcie-pci-bridge
With [1] patch hotplug of rtl8139 succeeds, with caveat that it
fails to initialize IO bar, which is caused by [2] that makes
firmware skip IO reservation for any PCIe device, which isn't
correct in case of pcie-pci-bridge.
Fix it by exposing hotplug type and making IO resource optional
only if PCIe hotplug is in use.
[1]
"pci: reserve resources for pcie-pci-bridge to fix regressed hotplug on q35"
[2] Fixes: 76327b9f32a ("fw/pci: do not automatically allocate IO region for PCIe bridges")
Signed-off-by: Igor Mammedov imammedo@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
CC: mapfelba@redhat.com
CC: kraxel@redhat.com
CC: mst@redhat.com
CC: lvivier@redhat.com
CC: jusual@redhat.com
Igor Mammedov [Mon, 29 Nov 2021 11:48:11 +0000 (06:48 -0500)]
pci: reserve resources for pcie-pci-bridge to fix regressed hotplug on q35
If QEMU is started with unpopulated pcie-pci-bridge with ACPI PCI
hotplug enabled (default since QEMU-6.1), hotplugging a PCI device
into one of the bridge slots fails due to lack of resources.
Once the Linux guest is booted (the test used Fedora 34), hotplug a NIC from
the QEMU monitor:
(qemu) device_add rtl8139,bus=pcie-pci-bridge-0,addr=0x2
guest fails hotplug with:
pci 0000:01:02.0: [10ec:8139] type 00 class 0x020000
pci 0000:01:02.0: reg 0x10: [io 0x0000-0x00ff]
pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x000000ff]
pci 0000:01:02.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
pci 0000:01:02.0: BAR 6: no space for [mem size 0x00040000 pref]
pci 0000:01:02.0: BAR 6: failed to assign [mem size 0x00040000 pref]
pci 0000:01:02.0: BAR 0: no space for [io size 0x0100]
pci 0000:01:02.0: BAR 0: failed to assign [io size 0x0100]
pci 0000:01:02.0: BAR 1: no space for [mem size 0x00000100]
pci 0000:01:02.0: BAR 1: failed to assign [mem size 0x00000100]
8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
PCI Interrupt Link [GSIG] enabled at IRQ 22
8139cp 0000:01:02.0: no MMIO resource
8139cp: probe of 0000:01:02.0 failed with error -5
Reason for this is that commit [1] didn't take into account
pcie-pci-bridge, marking bridge as non hotpluggable instead of
handling it as possibly SHPC capable bridge.
Fix the issue by checking if pcie-pci-bridge is SHPC capable and,
if it is, marking it as hotpluggable.
Fixes regression in QEMU-6.1 and later, since it was switched
to ACPI based PCI hotplug on Q35 by default at that time.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2001732
[1] Fixes: 3aa31d7d637 ("hw/pci: reserve IO and mem for pci express downstream ports with no devices attached")
Signed-off-by: Igor Mammedov imammedo@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
CC: mapfelba@redhat.com
CC: kraxel@redhat.com
CC: mst@redhat.com
CC: lvivier@redhat.com
CC: jusual@redhat.com
Eduardo Habkost [Thu, 10 Dec 2020 19:07:16 +0000 (14:07 -0500)]
smbios: Make smbios_build_tables() ready for 64-bit tables
Make smbios_build_tables() get u64 address and u32 length
arguments, making it usable for SMBIOS 3.0. Adapt
smbios_21_setup_entry_point() to use intermediate variables when
calling smbios_build_tables().
Eduardo Habkost [Thu, 10 Dec 2020 18:10:15 +0000 (13:10 -0500)]
smbios: Make smbios_build_tables() more generic
Instead of taking a SMBIOS 2.1 entry point as argument, make
smbios_build_tables() take pointers to the fields it actually
changes. This will allow us to reuse the function for SMBIOS 3.0
later.
Eduardo Habkost [Thu, 10 Dec 2020 18:05:17 +0000 (13:05 -0500)]
smbios: Extract SMBIOS table building code to separate function
Move the code that builds the SMBIOS tables to a separate
smbios_build_tables() function, to keep it isolated from the code
that initializes the SMBIOS entry point.
The new function will still take a smbios_21_entry_point
argument to make code review easier, but this will be changed by
the next commits.
Eduardo Habkost [Thu, 10 Dec 2020 17:32:37 +0000 (12:32 -0500)]
smbios: Use smbios_next() at smbios_romfile_setup()
Use smbios_next() instead of smbios_21_next(), to make the code
more generic and reusable for SMBIOS 3.0 support.
Note that `qtables_len` is initialized to `ftables->size` instead
of `ep.structure_table_length` now, but both fields are
guaranteed to have exactly the same value.
Eduardo Habkost [Thu, 10 Dec 2020 20:18:28 +0000 (15:18 -0500)]
tpm: Use smbios_get_tables()
Instead of using the SMBios21Addr global variable, use the
smbios_get_tables() helper. This doesn't change any behavior
yet, but it will be useful when we start supporting SMBIOS 3.0
entry points.
Stefan Berger [Mon, 14 Jun 2021 17:35:49 +0000 (13:35 -0400)]
tcgbios: Use the proper sha function for each PCR bank
Instead of just using sha1 for all PCR banks (and truncating
the value or zero-padding it) use the proper hash function for
each one of the banks. For unimplemented hashes, fill the buffer
with 0xff.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Volker Rümelin [Fri, 4 Jun 2021 18:01:20 +0000 (20:01 +0200)]
stacks: call check_irqs() after switch_next()
In function run_thread() the function check_irqs() gets called
after the thread switch for atomic handoff reasons. In yield()
it's the other way round.
If check_irqs() is called after run_thread() and check_irqs()
is called before switch_next() in yield(), it can happen in a
constructed case that a background thread runs twice without
a check_irqs() call in between. Call check_irqs() after
switch_next() in yield() to prevent this.
Volker Rümelin [Fri, 4 Jun 2021 18:01:19 +0000 (20:01 +0200)]
stacks: call check_irqs() in run_thread()
The comment above the yield() function suggests that yield()
allows interrupts for a short time. But yield() only briefly
enables interrupts if seabios was built without CONFIG_THREADS
or if yield() is called from the main thread. In order to
guarantee that the interrupts were enabled once before yield()
returns in a background thread, the main thread must call
check_irqs() before or after every thread switch. The function
run_thread() also switches threads, but the call to check_irqs()
was forgotten. Add the missing check_irqs() call.
This fixes PS/2 keyboard initialization failures.
The code in src/hw/ps2port.c relies on yield() to briefly enable
interrupts. There is a comment above the yield() function in
__ps2_command(), where the author left a remark why the call to
yield() is actually needed.
Here is one of the call sequences leading to a PS/2 keyboard
initialization failure.
ps2_keyboard_setup()
|
ret = i8042_command(I8042_CMD_CTL_TEST, param);
# This command will register an interrupt if the PS/2 device
# controller raises interrupts for replies to a controller
# command.
|
ret = ps2_kbd_command(ATKBD_CMD_RESET_BAT, param);
|
ps2_command(0, command, param);
|
ret = __ps2_command(aux, command, param);
|
// Flush any interrupts already pending.
yield();
# yield() doesn't flush interrupts if the main thread
# hasn't reached wait_threads().
|
ret = ps2_sendbyte(aux, command, 1000);
# Reset the PS/2 keyboard controller and wait for
# PS2_RET_ACK.
|
ret = ps2_recvbyte(aux, 0, 4000);
|
for (;;) {
|
status = inb(PORT_PS2_STATUS);
# I8042_STR_OBF isn't set because the keyboard self
# test reply is still on wire.
|
yield();
# After a few yield()s the keyboard interrupt fires
# and clears the I8042_STR_OBF status bit. If the
# keyboard self test reply arrives before the
# interrupt fires the keyboard reply is lost and
# ps2_recvbyte() returns after the timeout.
}
BUILD_MIN_BIOSTABLE reserves space in the f-segment. Some data
structures -- for example disk drives known to seabios -- must be
stored there, so the space available here limits the number of
devices seabios is able to manage.
This patch sets BUILD_MIN_BIOSTABLE to 8k for bios images being 256k or
larger in size. 32bit code is moved off in that case, so we have more
room in the f-segment then.
Gerd Hoffmann [Wed, 26 May 2021 07:32:10 +0000 (09:32 +0200)]
nvme: improve namespace allocation
Instead of allocating a big array upfront go probe the namespaces and
only allocate an nvme_namespace struct for those namespaces which are
actually active.
Modern binutils unconditionally tracks x86_64 ISA levels in intermediate
files in .note.gnu.property. The custom linker script does not handle the
section and complains about it:
Mike Banon [Thu, 3 Dec 2020 04:06:59 +0000 (07:06 +0300)]
Support booting USB drives with a write protect switch enabled
At least some USB drives with a write protect switch (e.g. Netac U335)
can report "MEDIUM NOT PRESENT" for a while if write protection is
enabled. Instead of stopping the initialization attempts immediately,
stop only after getting this report 3 times, to ensure the
successful initialization of such "broken hardware".
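A sketch of the retry policy, with the device access abstracted behind a callback. All names here are hypothetical, and MAX_ATTEMPTS stands in for whatever overall timeout the real code uses.

```c
#include <stdbool.h>

enum tur_result { TUR_READY, TUR_NOT_READY, TUR_MEDIUM_NOT_PRESENT };
typedef enum tur_result (*tur_fn)(void *ctx);

#define MAX_NOT_PRESENT 3     /* give up after the third such report */
#define MAX_ATTEMPTS    100   /* assumption: overall retry budget */

/* Poll the drive; tolerate up to two "MEDIUM NOT PRESENT" reports. */
bool wait_for_drive(tur_fn test_unit_ready, void *ctx)
{
    int not_present = 0;
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
        switch (test_unit_ready(ctx)) {
        case TUR_READY:
            return true;
        case TUR_MEDIUM_NOT_PRESENT:
            if (++not_present >= MAX_NOT_PRESENT)
                return false;
            break;
        case TUR_NOT_READY:
            break;            /* keep polling */
        }
    }
    return false;
}

/* Stub device: reports MEDIUM NOT PRESENT *ctx times, then READY. */
enum tur_result stub_tur(void *ctx)
{
    int *left = ctx;
    if (*left > 0) {
        (*left)--;
        return TUR_MEDIUM_NOT_PRESENT;
    }
    return TUR_READY;
}
```

With the stub, a drive that reports the condition twice still initializes, while one that reports it three times is given up on.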
David Woodhouse [Thu, 5 Nov 2020 16:09:32 +0000 (16:09 +0000)]
nvme: Clean up nvme_cmd_readwrite()
This ended up with an odd mix of recursion (albeit *mostly*
tail-recursion) and iteration that could have been prettier. In
addition, while recursing it potentially adjusted op->count which is
used by callers to see the amount of I/O actually performed.
Fix it by bringing nvme_build_prpl() into the normal loop using 'i'
as the offset in the op.
Fixes: 94f0510dc ("nvme: Split requests by maximum allowed size")
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Alexander Graf [Wed, 30 Sep 2020 21:10:56 +0000 (23:10 +0200)]
nvme: Split requests by maximum allowed size
Some NVMe controllers only support small maximum request sizes, such as
the AWS EBS NVMe implementation which only supports NVMe requests of up
to 32 pages (256kb) at once.
BIOS callers can exceed those request sizes by defining sector counts
above this threshold. Currently we fall back to the bounce buffer
implementation for those. This is slow.
This patch introduces splitting logic to the NVMe I/O request code so
that every NVMe I/O request gets handled in a chunk size that is
consumable by the NVMe adapter, while maintaining the fast path PRPL
logic we just introduced.
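The splitting idea can be sketched as a loop that never issues more than max_blocks per request. The names and the callback interface are hypothetical; the real code works on the disk operation struct and the PRP list logic.

```c
#include <stdint.h>

typedef int (*io_fn)(uint64_t lba, uint16_t count, void *ctx);

/* Issue `count` blocks starting at `lba` in chunks of at most max_blocks.
 * Returns the number of blocks transferred, or -1 on error. */
int split_request(uint64_t lba, uint16_t count, uint16_t max_blocks,
                  io_fn do_io, void *ctx)
{
    uint16_t done = 0;
    if (max_blocks == 0)
        return -1;
    while (done < count) {
        uint16_t chunk = (uint16_t)(count - done);
        if (chunk > max_blocks)
            chunk = max_blocks;
        if (do_io(lba + done, chunk, ctx) != 0)
            return -1;
        done = (uint16_t)(done + chunk);
    }
    return done;
}

/* Example callback that only counts how many requests were issued. */
int count_requests(uint64_t lba, uint16_t count, void *ctx)
{
    (void)lba;
    (void)count;
    ++*(int *)ctx;
    return 0;
}
```

A 100-block request with a 32-block limit turns into four requests (32, 32, 32 and 4 blocks).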
Alexander Graf [Wed, 30 Sep 2020 21:10:55 +0000 (23:10 +0200)]
nvme: Pass large I/O requests as PRP lists
Today, we split every I/O request into at most 4kb chunks and wait for these
requests to finish. We encountered issues where the backing storage is network
based, so every I/O request needs to go over the network with associated
latency cost. A few ms of latency when loading 100MB initrd in 4kb chunks
does add up.
NVMe implements a feature to allow I/O requests spanning multiple pages,
called PRP lists. This patch takes larger I/O operations and checks if
they can be directly passed to the NVMe backing device as PRP list.
At least for grub, read operations can always be mapped directly into
PRP list items.
This reduces the number of I/O operations required during a typical boot
path by roughly a factor of 5.
Alexander Graf [Wed, 30 Sep 2020 21:10:54 +0000 (23:10 +0200)]
nvme: Allow to set PRP2
When creating a PRP based I/O request, we pass in the pointer to operate
on. Going forward, we will want to be able to pass additional pointers
though for mappings above 4k.
This patch adds a parameter to nvme_get_next_sqe() to pass in the PRP2
value of an NVMe I/O request, paving the way for a future patch to
implement PRP lists.
Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Alexander Graf [Wed, 30 Sep 2020 21:10:53 +0000 (23:10 +0200)]
nvme: Record maximum allowed request size
NVMe has a limit on how many sectors it can handle at most within a single
request. Remember that number, so that in a follow-up patch, we can verify
that we don't exceed it.