Implements the small x86 specific UEFI entry point which will call
the early self relocator, adjust the initially unaligned stack and
finally jump to the architecture generic EFI stub.
Sergiu Moga [Mon, 27 Mar 2023 09:26:10 +0000 (12:26 +0300)]
support/scripts: Add `python3` script to patch fake PE header
This script allows patching of the architecture specific fake PE
headers. It only fills in the fields that UEFI firmware looks for
when validating and loading the image. Specifically, it does the
following:
- Write MS-DOS signature in the first bytes of the binary
- Write at the standard MS-DOS file offset `0x3c` the file offset of
the beginning of the fake PE header
- Append the original ELF file that also contains the PE header
- Fill in the following fields of the Optional Header: SizeOfCode,
AddressOfEntryPoint, BaseOfCode, SizeOfImage
- Fill in the dummy PE sections, as PE, unlike ELF, is loaded by
sections:
- dummy .reloc section pointing to itself with all fields
zeroed out except the VirtualAddress and PointerToRawData
fields which point to the section itself, to fool UEFI into
thinking this is a valid relocation.
- All PT_LOAD ELF Program Headers will be encapsulated into
PE sections with all permissions enabled (RWX)
For these sections, only the following fields are required to be filled
in: VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData.
Thus, the script fills in the bare-minimum fields, according to EDKII,
the most complete and official UEFI implementation, that are required
by a UEFI application's PE header to be considered valid and loadable.
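The first two steps listed above can be sketched as follows. This is only a minimal illustration, not the actual script: the helper name `build_dos_stub` and the example offset `0x1040` are hypothetical, and only the `MZ` signature and the 32-bit little-endian value at offset `0x3c` follow the standard MS-DOS header layout.

```python
import struct

DOS_MAGIC = b"MZ"        # MS-DOS signature expected in the first bytes
E_LFANEW_OFFSET = 0x3C   # standard location of the PE header file offset
DOS_HEADER_SIZE = 0x40   # size of the classic MS-DOS header


def build_dos_stub(pe_header_offset: int) -> bytes:
    """Return a 64-byte MS-DOS header pointing at the PE header (hypothetical helper)."""
    hdr = bytearray(DOS_HEADER_SIZE)
    hdr[0:2] = DOS_MAGIC
    # The PE header offset is stored as a 32-bit little-endian integer.
    struct.pack_into("<I", hdr, E_LFANEW_OFFSET, pe_header_offset)
    return bytes(hdr)


# Example offset only; a real script would compute it from the ELF layout.
stub = build_dos_stub(0x1040)
```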
Sergiu Moga [Tue, 28 Mar 2023 14:20:01 +0000 (17:20 +0300)]
plat/kvm/efi.c: Add support for Devicetree Blob file
This enables the EFI stub to load a Devicetree Blob file from the
same filesystem that the Unikraft image was loaded from (the EFI
System Partition) and register it as a `Memory Region Descriptor`.
The name of the `dtb` file is given through the
`CONFIG_UK_EFI_STUB_DTB_FNAME` configuration entry and it tells
the loader the name of the file to load from the `\EFI\BOOT` directory
of the EFI System Partition.
Sergiu Moga [Tue, 28 Mar 2023 14:20:01 +0000 (17:20 +0300)]
plat/kvm/efi.c: Add support for initial RAM disk file
This enables the EFI stub to load an initial RAM disk file from the
same filesystem that the Unikraft image was loaded from (the EFI
System Partition) and register it as a `Memory Region Descriptor`.
The name of the `initrd` file is given through the
`CONFIG_UK_EFI_STUB_INITRD_FNAME` configuration entry and it tells
the loader the name of the file to load from the `\EFI\BOOT` directory
of the EFI System Partition.
Sergiu Moga [Mon, 27 Mar 2023 10:08:34 +0000 (13:08 +0300)]
plat/kvm/efi.c: Add command-line arguments support
Implement support to pass command-line arguments through Unikraft's
`struct ukplat_bootinfo`.
This can be done in two ways:
1. Through the UEFI Shell when launching the image or through `qemu`'s
`-append` option.
2. Through the filesystem of the same partition (the EFI System
Partition) that the image was launched from. The loader will look
for a file with the name configured through the
`UK_EFI_STUB_CMDLINE_FNAME` configuration entry in the `\EFI\BOOT` directory.
The first way, if applicable, takes priority over the second.
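The priority rule can be sketched as below. The function and parameter names are hypothetical and only model the selection logic, not the EFI calls the stub performs to obtain either string.

```python
from typing import Optional


def pick_cmdline(load_options: Optional[str], file_contents: Optional[str]) -> str:
    """Prefer arguments passed at launch over those read from the file (hypothetical helper)."""
    if load_options:
        return load_options   # UEFI Shell or `-append` arguments win
    if file_contents:
        return file_contents  # fall back to the file in `\EFI\BOOT`
    return ""                 # no command line supplied at all
```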
Sergiu Moga [Tue, 21 Mar 2023 13:31:02 +0000 (15:31 +0200)]
plat/kvm: Implement architecture generic EFI stub
Add an architecture generic EFI stub that sets up a
`struct ukplat_bootinfo`'s memory region descriptors, `bootloader`
and `bootprotocol` fields. Furthermore, the memory region descriptors
corresponding to the UEFI `Runtime Services` are gathered separately,
through UEFI's `Memory Attribute Table`, since they have to be
treated differently, in order for the `Runtime Services` to be
used properly after `exit_boot_services`.
At the end, the stub calls `uk_efi_jmp_to_kern`, which each architecture
is supposed to implement independently to perform the remaining setup
for its platform.
Sergiu Moga [Mon, 27 Mar 2023 09:44:46 +0000 (12:44 +0300)]
plat/kvm: Add `EFI_STUB` configuration entry
Add a configuration option to build the Unikernel as a valid,
loadable UEFI application. Make it depend on `ACPI` and, obviously,
on not having `PLAT_LINUXU` enabled.
Furthermore, remove default selection of the Multiboot boot protocol
if QEMU VMM is selected, now that a QEMU image can also be an EFI
image.
Add a description and proper dependencies for the
`KVM_BOOT_PROTO_LXBOOT` configuration entry. Since `Firecracker`
only supports the `Linux` 64-bit boot protocol and we do not yet
support booting through it on `QEMU`, make the dependencies
reflect that.
Add a description and proper dependencies for the
`KVM_BOOT_PROTO_MULTIBOOT` configuration entry. Since `Firecracker`
only supports the `Linux` 64-bit boot protocol and `Multiboot` is
x86 specific (not taking into consideration the `Multiboot` ported
to `ARM` for `Xen`), make the configuration entry reflect that.
Simon Kuenzer [Thu, 27 Jul 2023 20:27:42 +0000 (22:27 +0200)]
support/qemu-guest: Darwin support
This commit introduces native support for QEMU installations on
Darwin (MacOS). Apple's hypervisor framework is used for guest
acceleration instead of KVM on Linux hosts.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Simon Kuenzer [Thu, 27 Jul 2023 10:56:27 +0000 (12:56 +0200)]
support/qemu-guest: Remove SGA bios parameter and warning
Starting with QEMU version 8.0, the `-device sga` parameter is removed from
the command line because it is no longer needed. These versions include a
SeaBIOS version that contains native support for serial consoles. If an
older version of QEMU is used with `qemu-guest`, the BIOS output may no
longer be visible but the guest would still be able to boot.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Simon Kuenzer [Thu, 19 Jan 2023 00:58:27 +0000 (01:58 +0100)]
support/qemu-guest: Enable X2APIC for TCG mode
Enables X2APIC when TCG mode is selected. As soon as QEMU is available,
this should enable running a recent x86_64 version of Unikraft on devices
without x86 hardware virtualization support (e.g., Arm).
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Simon Kuenzer [Thu, 27 Jul 2023 20:44:50 +0000 (22:44 +0200)]
build: Print actual build directory name with `properclean`
This commit corrects the output of the `properclean` make target. Instead
of just outputting `RM build/`, the currently configured build directory is
printed.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
This commit introduces a replacement for `build/config-submenu.sh`. The
main reason for the rewrite was the incompatibilities on Darwin. The new
script mainly uses `bash`-internal functions to avoid incompatibilities
with third-party commands called by the script.
Checkpatch-Ignore: LONG_LINE_STRING
Checkpatch-Ignore: LONG_LINE
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Simon Kuenzer [Thu, 27 Jul 2023 07:46:00 +0000 (09:46 +0200)]
build: Robust version checking helpers with `printf`
This commit updates the compiler version checking helpers so that they also
work under environments like Darwin. It turns out that the behavior of
`echo` can be different in different environments, while `printf` seems to
be well defined.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Simon Kuenzer [Wed, 11 Jan 2023 00:51:41 +0000 (01:51 +0100)]
build: Use gsed on Darwin (MacOS)
The build system depends on the GNU version of `sed`. Since this is
typically installed as `gsed` on Darwin environments, we call `gsed`
in such a case.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Simon Kuenzer [Wed, 11 Jan 2023 00:47:53 +0000 (01:47 +0100)]
build: Detect host environment
This commit introduces the build variable $(HOSTOSENV), which contains the
host environment (e.g., Linux, Darwin) so that the build system is able to
adapt its behavior to the details of the host environment.
Checkpatch-Ignore: FSF_MAILING_ADDRESS
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
This commit creates the build folder substructure earlier, unless a
target is called that does not require any KConfig involvement. This
allows us to move generated Config.uk snippets under `build/kconfig` in
order to clean up the root of the build folder.
Additionally, this commit corrects
commit b8872a6b572c ("build: first level subdirectories only with `mk_sub_build_dir`")
where the `uk` subdirectory was no longer created under
`$(BUILD_ROOT)/include`. The commit also introduces consistent use
of the `$(MKDIR)` command alias.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Alexander Jung <alex@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1034
Sergiu Moga [Mon, 15 May 2023 06:08:16 +0000 (09:08 +0300)]
plat/kvm/x86: Make sure we allocate the `SIPI Vector` as reserved
Now that we can dynamically decide the `SIPI Vector`, make sure
we do this allocation when all the memory regions are properly
setup in the `Multiboot1` memory region setup method.
Sergiu Moga [Mon, 15 May 2023 06:05:29 +0000 (09:05 +0300)]
plat/kvm/x86: Make sure we have the legacy x86 high memory reserved
Since there is a chance that the previous boot phase did not mark
this area as reserved through a memory region descriptor, make sure
we do it ourselves.
Since this memory region is very low, we do this insertion when this
list is empty to avoid unnecessary iterations.
Sergiu Moga [Mon, 15 May 2023 06:02:07 +0000 (09:02 +0300)]
plat/kvm/x86: Keep `Multiboot1` memory regions below `1 MiB`
Keep the `e820` entries below `1 MiB` that we were passed through
`Multiboot1` and rely on the unmappings memory region descriptor to
tell us later what to unmap.
plat/linuxu: Drop no longer used `liblinuxuplat_opts` structure
Now that we rely on `ukplat_memregion` generic API's, we have no
more need for linuxu configuration specific structures when it comes
to memory regions.
plat: Integrate `ukplat_memregion` API's into `ARM64` builds
Rely on the generic `ukplat_memregion` API's for memory region
management and drop `_libkvmplat_cfg` usage in favor of the
`bootinfo` getters.
Now that `ukplat_memregion`'s are usable on `ARM64`, proper
paging support can be used instead of the overly-complicated
memory re-adjustment that was previously done in `_init_dtb_mem`.
Instead, we rely on `ukplat_memregion_alloc` for proper memory
region management.
Another notable change is that `arm64_bpt_l0_pt0` now contains
only one, unified, entry of `pte_fill`.
This reduced page permission granularity allows us further flexibility
in where the Kernel is placed so that we are truly position independent.
Admittedly, having all that memory mapped with all permissions does not
look secure. However, this matters less when the two possible use-cases
are taken into consideration:
- Paging API is disabled and we will have wrong permissions everywhere
anyway
- Paging API is enabled and this area will be unmapped anyway and new
proper permissions will be used instead
Therefore, until the time comes when we manage to dynamically get rid
of the static boot page tables, we will align ARM with x86 and have the
static boot page tables all mapped with all permissions w.r.t.
Kernel placement.
Furthermore, now that ARM64 can also use the heap base
`CONFIG_LIBUKBOOT_HEAP_BASE` configuration option, drop the x86
dependency.
Sergiu Moga [Sun, 21 May 2023 12:52:01 +0000 (15:52 +0300)]
plat/common/paging.c: For `ARM64`, don't map outside unmap region
Since on `ARM64` we map all of the peripherals' I/O space statically,
make sure that we do not map the regions in the list that are outside
our globally declared unmap memory region descriptor.
Sergiu Moga [Mon, 15 May 2023 05:58:44 +0000 (08:58 +0300)]
plat/kvm/arm: Hardcode a memory region descriptor for unmappings
Add a memory region descriptor to indicate the range we are meant to
unmap when paging is initialized and we are getting rid of the
initial static boot page tables.
Sergiu Moga [Sat, 20 May 2023 14:28:36 +0000 (17:28 +0300)]
plat/common/paging.c: Do not ignore `EEXIST` errors while mapping
Commit f0ae44de725f ("plat/kvm/x86: Workaround re-mapping low-mem areas")
introduced this check to ignore `EEXIST` errors while mapping the
memory regions marked as `UKPLAT_MEMRF_MAP` due to the fact that,
before the previous commit ("plat: Generalize memory initialization"),
Unikraft was ignoring all memory regions in the first 1 MiB and was
manually hardcoding the mappings in the static boot page tables.
This was done because the `multiboot` boot protocol would not report
the legacy mapped `BIOS ROM` area or the `VGA Framebuffer` in its
e820 map's reserved regions.
We no longer need this check and, if anything, removing it helps
us ensure that the setup of the boot environment's mappings is done
as we expect it to be.
Sergiu Moga [Mon, 15 May 2023 05:44:45 +0000 (08:44 +0300)]
plat: Generalize memory initialization
Re-define `mem_init` into `ukplat_mem_init` that does the same thing,
but does not assume what regions to keep or unmap. Instead, it relies
on the platform/architecture defined unmap memory region to do this.
plat/common/paging.c: Flush TLB only if we are modifying current PT
Flushing the TLB introduces an unnecessary performance penalty if we
are modifying a page table that is different from our current one.
Therefore, ensure that the TLB is flushed only if the page table we
are currently modifying is the same as the one we currently have
active.
plat: Move paging initialization to a platform common location
Since the `paging_init` function of the x86 KVM subsystem is written
in a generic manner, it can be reused by other architectures or
platforms. Thus, move it to a platform generic location and rename
it with a more intuitive name to align it with the other function
declarations of `paging.c`.
Furthermore, only call `ukplat_pt_set_active` at the very end, to avoid
unnecessary `TLB` flushes.
Sergiu Moga [Mon, 15 May 2023 05:40:52 +0000 (08:40 +0300)]
plat/kvm/x86: Hardcode a memory region descriptor for unmappings
Add a memory region descriptor to indicate the range we are meant to
unmap when paging is initialized and we are getting rid of the
initial static boot page tables.
Sergiu Moga [Mon, 15 May 2023 05:32:57 +0000 (08:32 +0300)]
plat/common: Implement method to dynamically assign `SIPI Vector`
Add an inline function that simply allocates a reserved memory region
meant for the `SIPI Vector`. If this function fails, we would not have
been able to do this properly the old way either (hardcoded
`SIPI Vector` at `0x8000`).
This should only be called after the memory region list has been
fully built and coalesced.
plat/common: Make `ukplat_memregion_alloc` aware of memory holes
In case we are somehow able to allocate memory from an in-image
memory hole, make sure we do not give it the `UKPLAT_MEMRF_MAP`
flag, since it is already mapped, which causes the paging
initialization phase to exit with an `-EEXIST` error code.
plat/common/arm: Make `lcpu_arch_jump_to` not SMP dependent
Since `lcpu_arch_jump_to` can be used with and without
`CONFIG_HAVE_SMP`, move it to a place where the preprocessing
phase does not remove the function from the final image.
plat/kvm: Move `ukplat_memregion` API's to platform common location
Since these API's can be used on any platform and any architecture
subsystem that integrates `struct ukplat_bootinfo` structure, move
their definitions to a platform common place in the tree.
plat/linuxu: Move `initrd` and `heap` initialization to `setup.c`
Initialize the `heap` and `initrd` as soon as possible and move the
respective functions to `setup.c`, as that is the only place where
they are to be used.
plat/kvm/x86: Coalesce the memory region descriptor list
Ensure that there are no overlapping memory regions in the memory
region descriptor list by coalescing them after finishing the
insertion phase of the memory regions reported through the multiboot
protocol.
plat/common: Add a memory region descriptor coalescing method
Implement a function that, given a list of memory region descriptors,
coalesces them based on a priority. The lowest priority is that of
the free memory regions, while the highest is that of the reserved
memory regions. This priority is required whenever it comes to
splitting memory regions that overlap. Thus, a memory region whose
priority is higher, gets to keep its overlapping fragment, while the
other is either split or reduced.
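The priority rule can be sketched as below. This is a simplified model and not the actual implementation: regions are plain `(start, end, priority)` tuples with an exclusive end, only two illustrative priorities are defined, and adjacent fragments of equal priority are merged, which is an assumption of this sketch.

```python
from typing import List, Tuple

Region = Tuple[int, int, int]  # (start, end, priority), end is exclusive

FREE, RESERVED = 0, 1  # illustrative priorities: free is lowest, reserved is highest


def coalesce(regions: List[Region]) -> List[Region]:
    """Resolve overlaps so that the higher-priority region keeps the shared fragment."""
    # Every region boundary delimits an elementary, non-overlapping interval.
    points = sorted({p for start, end, _ in regions for p in (start, end)})
    out: List[Region] = []
    for lo, hi in zip(points, points[1:]):
        covering = [prio for start, end, prio in regions if start <= lo and hi <= end]
        if not covering:
            continue  # hole that no region describes
        prio = max(covering)  # the highest priority wins the fragment
        if out and out[-1][1] == lo and out[-1][2] == prio:
            out[-1] = (out[-1][0], hi, prio)  # extend the previous fragment
        else:
            out.append((lo, hi, prio))
    return out


# A reserved region in the middle of a free one splits the free region in two.
merged = coalesce([(0x0000, 0x9000, FREE), (0x4000, 0x6000, RESERVED)])
```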
plat/kvm/x86: Mark the thrashed command-line as a kernel resource
Since this buffer is only meant for `ukplat_entry_argp`, which will
thrash it as a result of parsing it to obtain `argc`/`argv`, mark it
as a kernel related memory region instead (`UKPLAT_MEMRT_KERNEL`).
Since `bootmemory_palloc` can be used by any platform and
architecture and can come in handy when wanting to allocate a memory
region descriptor on the spot, that is not part of the original
memory map, move it to a more general location.
Furthermore, rename the function to `ukplat_memregion_alloc` to make
its usage-case more intuitive.
plat/kvm/x86: Fix `bootmemory_palloc` keeping empty memory region
Whenever a memory region allocation request done through
`bootmemory_palloc` would require a length equal to one of the
available memory regions, a new, equivalent, memory region would
be created and the original one would be left empty, thus breaking
the memory allocator.
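A minimal sketch of the fixed behaviour, assuming a free list modelled as `(base, length)` tuples. The name `alloc_from_free` is hypothetical and the sketch ignores alignment and region flags.

```python
from typing import List, Optional, Tuple


def alloc_from_free(free: List[Tuple[int, int]], size: int) -> Optional[int]:
    """Carve `size` bytes out of the first free region that is large enough."""
    for i, (base, length) in enumerate(free):
        if length < size:
            continue
        if length == size:
            # Exact fit: drop the region instead of leaving an empty one behind.
            del free[i]
        else:
            # Otherwise shrink the region from its start.
            free[i] = (base + size, length - size)
        return base
    return None  # no region can satisfy the request


free_list = [(0x1000, 0x1000), (0x4000, 0x2000)]
addr = alloc_from_free(free_list, 0x1000)
```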
Sergiu Moga [Tue, 21 Mar 2023 10:49:59 +0000 (12:49 +0200)]
plat/common: Make the amount of Memory Region Descriptors configurable
Since `bootinfo` statically allocates space in the final image to store
Memory Region Descriptors at runtime and their actual amount is highly
variable depending on the platform, allow the user to configure a
precise maximum amount of such descriptors.
This allows flexibility in the size of the final image. One can check
at runtime whether `-ENOMEM` is returned by the memory region insertion
function or not and adjust `CONFIG_UKPLAT_MEMREGION_MAX_COUNT` accordingly.
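The effect of the limit can be sketched as follows. `MEMREGION_MAX_COUNT` stands in for `CONFIG_UKPLAT_MEMREGION_MAX_COUNT`, and `insert_region` is a hypothetical stand-in for the real insertion function, whose exact signature is not shown here.

```python
import errno

MEMREGION_MAX_COUNT = 4  # stand-in for CONFIG_UKPLAT_MEMREGION_MAX_COUNT


def insert_region(regions: list, region: tuple) -> int:
    """Insert a (base, length) region, or fail once the static capacity is used up."""
    if len(regions) >= MEMREGION_MAX_COUNT:
        return -errno.ENOMEM  # the caller may need a larger configured maximum
    regions.append(region)
    regions.sort()  # keep the list ordered by base address
    return 0
```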
Add into the build process the sources required for integrating
the `struct ukplat_bootinfo` structure into `xen` images, as well
as the call to `build_bootinfo` method.
plat/linuxu: Integrate `bootinfo` into `linuxu` builds
Add into the build process the sources required for integrating
the `struct ukplat_bootinfo` structure into `linuxu` images, as well
as the call to `build_bootinfo` method.
Implement the equivalent page fault intermediary handler of the x86
one, by using the `AArch64` specific equivalent bitfields to check
for the type of the fault.
Sergiu Moga [Thu, 10 Aug 2023 15:34:40 +0000 (18:34 +0300)]
plat/linuxu: Add dependency on `PIE` not being enabled
Building `linuxu` with the `PIE` build options results in errors
being issued as the way we currently implement PIE is not suited
for this platform yet.
Therefore, make sure that it is impossible to build `linuxu` as
long as `PIE` is enabled.
Sergiu Moga [Wed, 22 Mar 2023 18:57:22 +0000 (20:57 +0200)]
plat/kvm/x86: Make SMP init code resolve its own `start16` relocations
Before waking up the secondary cores, the SMP initialization code copies
the 16-bit and 32-bit bootstrapping code, now position independent
through `uk_reloc`, to a physical page in the lower 1 MiB of physical
memory. Since, at this point of execution, the immediate values used by
this bootstrapping code take the form of `UK_RELOC_PLACEHOLDER`, they
will need to be properly resolved after being moved into lower memory.
Therefore, add a new locally defined `struct uk_reloc` array holding the
hardcoded corresponding entries of these relocations, adapted to
reference the desired relocation address. Use this array after memory
copying the bootstrapping code to lower memory to resolve its
corresponding `start16` relocations.
Encode these entries with the help of `start16` related macro's and
since these definitions are starting to visually occupy a lot of space,
move everything to `start16_helpers.h`, a separate header file.
Sergiu Moga [Wed, 22 Mar 2023 18:45:52 +0000 (20:45 +0200)]
plat/kvm/x86: Make the Unikernel position independent
Replace all `mov`'s and data declarations (`.long`, `.quad` etc.) that
use absolute symbol references with the equivalent `ur_*` macro's.
One exception to this type of relocation is going to be the 16-bit
code, as there is no point in relocating it before SMP initialization
without even knowing the physical address in the first 1MiB where the
secondary cores are to start execution. So, implement dedicated `ur_*`
macro's whose only use-case is the 16-bit code to cope with its
existence. Furthermore, add an exception to these `start16` relocation
symbols in the script.
Since, on this platform, we make use of static page tables, these were
also made relocatable through the `ur_pte` macro.
plat/kvm/x86: Move `x86_start16_addr` out of `.data.boot`
Since `.data.boot` is meant mainly for boot specific data such as
a Multiboot header, move this unrelated `x86_start16_addr` out of
this section. The reason we do this is because this might cause
conflict with actual boot related data (e.g. `x86_start16_addr` be
placed at the beginning of the binary instead of the actual boot
related header data).
Similarly to the 64-bit C code `do_uk_reloc` self relocator,
`do_uk_reloc32` is a macro that takes as argument a value equal or
different from 0 to specify whether a stack is supplied or not. If the
stack is not supplied, the macro can generate its own minimal scratch
stack. Furthermore, the base virtual address is expected in %esi:%edi
(lower 32 bits in %edi and higher 32 bits in %esi), as well as the
physical base address in the %edx register. Since we are in Protected
Mode, we can assume a 32-bit register is enough to hold this address.
Due to the high number of references to absolute symbol values in the
16-bit and 32-bit bootstrap code, a new macro, `ur_mov`, was introduced.
The new macro creates a new type of `_uk_reloc_` symbol, with an `imm`
suffix, to show that it is placed right after a `mov` instruction whose
immediate value must be relocated/patched. Thus, update `mkukreloc.py`
accordingly.
Furthermore, although useless, for completeness's sake add a 64-bit
variant. This variant does a `movabs` with the placeholder to
guarantee that the size of the immediate is 8 bytes.
Sergiu Moga [Tue, 28 Feb 2023 20:02:04 +0000 (22:02 +0200)]
plat/kvm/arm: Make the Unikernel position independent
Replace the static page table entries with relocatable ones through
the `ur_pte` macro. Thus, we make sure that relocations take place
before enabling the MMU by making a call to `do_uk_reloc` self relocator
method right before jumping to `start_mmu`.
Furthermore, make `Config.uk` select `LIBUKRELOC`.
Sergiu Moga [Tue, 28 Feb 2023 19:54:24 +0000 (21:54 +0200)]
lib/ukreloc: Add `ur_pte` macro to create relocatable PTE's
In order to cope with our static page tables used by the bootstrap code
we need to also be able to relocate our page table entries if we want
to be position independent on platforms that make use of such page
tables.
This macro makes use of the already existing `ur_data` macro to make the
script create the usual `struct uk_reloc`. However a small modification
to the script makes the script do an additional lookup for corresponding
`pte_attr` symbols that `ur_pte` creates, so that it knows to also add
the page table entry attributes to the final value that the self
relocator uses to properly resolve a relocation.
Sergiu Moga [Tue, 28 Feb 2023 19:03:20 +0000 (21:03 +0200)]
plat/xen/x86: Make the Unikernel position independent
In order to be position independent, `struct uk_reloc` and the self
relocator based off it are used. Thus, the bootstrapping code calls
`do_uk_reloc` as early as possible, by using the page-sized default
stack that the Hypervisor passes onto us, because the actual stack
that we use, is also used by the Hypervisor and any early world switch
may cause the guest to crash.
Furthermore, the obvious absolute symbol references have been replaced
with their instruction pointer relative, position independent,
equivalents. The `r9` register has been used as a scratch register in
this case since, after a brief code analysis, it's been found to not be
used anywhere else during bootstrap.
Sergiu Moga [Wed, 22 Mar 2023 18:51:57 +0000 (20:51 +0200)]
lib/ukreloc: Implement a `struct uk_reloc` based self relocator
Add `do_uk_reloc`, a method that parses the `.uk_reloc` section and
applies the relocations accordingly. As arguments, this method receives
a base physical address `r_paddr` for relocations that employ the
`UKRELOC_FLAGS_PHYS_REL` flag and a base virtual address `r_vaddr`
for those that do not want to be relocated against a physical address,
but rather by a given virtual address (e.g. can be used with KASLR).
In ARM's case, the `-fPIC`/`-fPIE`, with the help of `adrp`,
makes the code reference the page-aligned address of the section
and the offset this symbol has in this section where the relocations
are to be applied to and thus the desired relocated value is
obtained. Thus, unlike `x86` that simply does a `%rip` relative
access, ARM ends up dereferencing that in-section offset when trying
to get the runtime value of `__BASE_ADDR` which creates a chicken-egg
problem: we apply the relocations based on the current base address,
but in order to find the current base address we need to relocate
the value that is placed at the address of where the value of the
base address is supposed to be.
We solve this with the help of the `get_rt_addr()` function, which
forces an `adrp`, `add :lo12:` assembly sequence, which leads to
achieving something similar to x86's `%rip` relative addressing.
A case that may explain the need for `r_paddr` and `r_vaddr` would be
a situation where the previous program loader would place us at a random
physical address (r_paddr) and we then want to apply our own virtual
mappings for KASLR (r_vaddr), over the already applied virtual mappings,
if any.
Furthermore, right before applying relocations, the initial memory
region descriptors, added through `mkbootinfo.py`, are also relocated.
This is done by subtracting the initial link time base address, that
has been statically resolved by the linker in `lt_baddr`, and adding
the runtime base address.
Note: `lt_baddr` will generate a relocation so it will be overwritten
by the relocation loop. We want to keep it in case someone may want
to run the relocator multiple times for the same image. So, instead
of slowing down the relocator by checking each relocation against
`lt_baddr`'s address so that we do not overwrite it, simply back
it up in a temporary variable and restore it after relocations are
done.
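The relocation loop and the `lt_baddr` backup can be modelled as below. This is a rough Python sketch of the idea, not the C implementation. It assumes a little-endian image, an illustrative bit value for `UKRELOC_FLAGS_PHYS_REL`, a sentinel recognised by a zero size, and a hypothetical `lt_baddr_off` parameter giving the location of `lt_baddr` in the image.

```python
from dataclasses import dataclass
from typing import Iterable

UKRELOC_FLAGS_PHYS_REL = 1 << 0  # assumed bit value, for illustration only


@dataclass
class UkReloc:
    r_mem_off: int  # offset in the image where the value is patched
    r_addr: int     # offset of the target from the link-time base address
    r_sz: int       # size of the patched value in bytes
    flags: int


def do_uk_reloc(image: bytearray, relocs: Iterable[UkReloc],
                r_paddr: int, r_vaddr: int, lt_baddr_off: int) -> None:
    # Back up the link-time base address: the loop below overwrites it.
    saved = bytes(image[lt_baddr_off:lt_baddr_off + 8])
    for ur in relocs:
        if ur.r_sz == 0:
            break  # the zeroed-out sentinel entry terminates the list
        base = r_paddr if ur.flags & UKRELOC_FLAGS_PHYS_REL else r_vaddr
        value = (base + ur.r_addr) & ((1 << (8 * ur.r_sz)) - 1)
        image[ur.r_mem_off:ur.r_mem_off + ur.r_sz] = value.to_bytes(ur.r_sz, "little")
    # Restore it so that the relocator can be run again on the same image.
    image[lt_baddr_off:lt_baddr_off + 8] = saved


image = bytearray(24)
image[0:8] = (0x100000).to_bytes(8, "little")  # lt_baddr lives at offset 0 here
relocs = [
    UkReloc(0, 0, 8, 0),                          # relocation generated for lt_baddr
    UkReloc(8, 0x78, 8, UKRELOC_FLAGS_PHYS_REL),  # patched against r_paddr
    UkReloc(16, 0x78, 4, 0),                      # patched against r_vaddr
    UkReloc(0, 0, 0, 0),                          # sentinel
]
do_uk_reloc(image, relocs, r_paddr=0x40000000, r_vaddr=0x200000, lt_baddr_off=0)
```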
Sergiu Moga [Wed, 22 Mar 2023 18:52:47 +0000 (20:52 +0200)]
lib/ukreloc: Add inline function to apply a `uk_reloc` entry
Implement a simple inline function to properly apply, depending on the
size of the relocation, a `uk_reloc` entry.
`apply_uk_reloc` may cause `UBSAN` to issue false positives, in
the case of the `x86` architecture which leads to the Unikernel crashing
since the serial console is not initialized yet, as it is mostly intended
to be called within the early self relocator. So make sure that
`UBSAN` does not touch `apply_uk_reloc` because it makes sense, for
`x86`, to make unaligned accesses (especially in Kernel Space) in order
to resolve non-compliant relocations.
Sergiu Moga [Sun, 19 Mar 2023 17:17:08 +0000 (19:17 +0200)]
lib/ukreloc: Add ld script for sections required for a relocatable image
Add a separate Linker Script containing all the required sections
to achieve positional independence and the ability to self relocate.
The `.uk_reloc`, `.dynsym` and `.rela.dyn` sections will be part of the
main data segment and the last two will be stripped in the end, without
causing a hole in memory, due to the fact that `.uk_reloc` already
contains the forced `struct uk_reloc` relocations and, after being
updated, will also contain the `struct uk_reloc` equivalents of
`.rela.dyn`'s entries, which will be of the same size, so the `.bss`
section will stay in place.
The `.dynamic` and `.dynstr` sections become irrelevant in the final image
so we will strip them as well. In order not to cause a memory hole, place
them after `.comment` section.
Sergiu Moga [Wed, 22 Mar 2023 18:53:55 +0000 (20:53 +0200)]
lib: Introduce `libukreloc`
In order to start supporting positional independence we need to be able
to self relocate. Thus, define a new custom relocation related structure
`struct uk_reloc` that is meant to hold the offset in memory where the
relocation is to be applied, the offset of the value from the original
symbol value computed by the linker through the linker script's base
reference address, the size of the relocation and a flags field
respectively (for now we only have the `UKRELOC_FLAGS_PHYS_REL` flag
meant to indicate whether the relocation is to be applied based on a
physical address or not, which comes in handy in bootstrap code).
The final binary blob that the `mkukreloc.py` script builds is made of
a signature (UKRELOC_SIGNATURE), `struct uk_reloc` entries appended
to each other and ends with a zeroed out `struct uk_reloc` to act as a
sentinel. This binary blob will be placed in the binary through the
`.uk_reloc` ELF section, updated through `objcopy`. Therefore, in order
for this forced section update to not mess up linker script exported
symbols, this section must be placed towards the end of the binary.
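The blob layout described above can be sketched as follows. The signature value and its 4-byte width are hypothetical placeholders, and the entry format assumes the packed little-endian layout of two 64-bit and two 32-bit fields used by `struct uk_reloc`.

```python
import struct

UKRELOC_SIGNATURE = 0x0BADC0DE  # hypothetical value; the real constant is not given here
UK_RELOC_FMT = "<QQII"          # r_mem_off, r_addr, r_sz, flags (packed, little-endian)


def build_blob(entries) -> bytes:
    """Pack a signature, the entries and a zeroed-out sentinel entry."""
    blob = struct.pack("<I", UKRELOC_SIGNATURE)
    for r_mem_off, r_addr, r_sz, flags in entries:
        blob += struct.pack(UK_RELOC_FMT, r_mem_off, r_addr, r_sz, flags)
    # A zeroed-out entry of the same size acts as the sentinel.
    blob += bytes(struct.calcsize(UK_RELOC_FMT))
    return blob


blob = build_blob([(0x106, 0x78, 8, 1)])
```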
The binary blob is obtained by parsing all of the `.rela.dyn` entries,
converting them into `struct uk_reloc` entries and by searching
through the debug image's symbols for `_uk_reloc_` symbols created with
the help of the architecture independent helper assembly macro:
`ur_data`. This macro has been added to aid in replacing the references
to absolute symbol values with position independent equivalents:
instead of referencing the symbol itself, it places a placeholder value
(UKRELOC_PLACEHOLDER) and generates a symbol to be parsed by the script
to build a uk_reloc entry. Furthermore, it increases the size of the
`.uk_reloc` section by one entry through the `ur_sec_updt` macro. Note
that `ur_sec_updt` also inserts a dummy relocation entry in `.uk_reloc`
in order to force the linker to not optimize away the symbols. A unique
`uk_reloc` symbol will be generated through the `ur_sym` macro so that
an `ur_*` macro can be used on the same symbol more than once.
Consider the following usage example:
`ur_data quad, symbol, 8, _phys`
This will result in the `nm` tool, whose output the script will parse,
generating an entry such as:
`0000000000100106 T symbol_uk_reloc_data8_phys`
For `mkukreloc.py` this means, with a linker script reference address
of 0x100000:
```
struct uk_reloc {
__u64 r_mem_off = 0x106; // 0x100106 - 0x100000
__u64 r_addr;
__u32 r_sz = 8; // symbol_uk_reloc_data[8]_phys
__u32 flags = UKRELOC_FLAGS_PHYS_REL; // symbol_uk_reloc_data8[_phys]
} __packed;
```
To find the actual value of `symbol`, the script relies on its name being
the prefix of the generated symbol and looks for an entry such as:
`0000000000100078 T symbol` // from [symbol]_uk_reloc_data8_phys
Thus, `0x78` becomes the value of `r_addr`, and the completed
`struct uk_reloc` is appended to the binary blob with the others.
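The lookup walked through in this example can be sketched as below. It
is a simplified illustration rather than the real `mkukreloc.py` logic:
the regular expression, the helper names `parse_nm` and `build_entries`
and the flag value are assumptions, and any uniqueness suffix added by
`ur_sym` is ignored.

```python
import re

BASE_ADDR = 0x100000             # linker script reference address
UKRELOC_FLAGS_PHYS_REL = 1 << 0  # assumed flag value

NM_OUTPUT = """\
0000000000100078 T symbol
0000000000100106 T symbol_uk_reloc_data8_phys
"""

# <symbol>_uk_reloc_data<size>[_phys]
RELOC_SYM = re.compile(
    r"^(?P<name>.+)_uk_reloc_data(?P<size>\d+)(?P<phys>_phys)?$"
)


def parse_nm(text):
    """Map symbol names to addresses from `nm`-style output lines."""
    syms = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _sym_type, name = parts
            syms[name] = int(addr, 16)
    return syms


def build_entries(syms, base=BASE_ADDR):
    """Turn every *_uk_reloc_* symbol into a uk_reloc-like dict."""
    entries = []
    for name, addr in syms.items():
        match = RELOC_SYM.match(name)
        if match is None:
            continue
        target = syms[match.group("name")]  # e.g. the plain `symbol` entry
        entries.append({
            "r_mem_off": addr - base,
            "r_addr": target - base,
            "r_sz": int(match.group("size")),
            "flags": UKRELOC_FLAGS_PHYS_REL if match.group("phys") else 0,
        })
    return entries


entries = build_entries(parse_nm(NM_OUTPUT))
```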
`mkukreloc.py` will be invoked through the `build_uk_reloc` Makefile definition
and the "section.*lma.*adjusted to.*" type of `objcopy` warnings will be
ignored since they are intended and harmless. The script will also check
that the distance between `.uk_reloc` and `.bss` is enough to not cause
an adjustment or spill. In case it is not enough, then the script will
raise an Exception.
All of this functionality shall be contained within the `libukreloc`
library.
Sergiu Moga [Tue, 28 Feb 2023 19:38:34 +0000 (21:38 +0200)]
build: Add configuration option to build the Unikernel as a static PIE
Add compiler and linker options to properly build the Unikernel as a
static PIE. For now, only KVM and Xen are supported.
Strip `*dyn*` related ELF sections from the final binary. This will cause
tools such as `readelf` or `objdump` to issue some warnings or errors
when used on the final image because of the stripped sections, but they
will still work and, as a trade-off, the image will be as small as
possible.
Sergiu Moga [Fri, 24 Feb 2023 17:59:50 +0000 (19:59 +0200)]
plat/kvm: Drop `elf32-i386` output target linker flag
Remove the `elf32-i386` linker flag that forces the final Unikernel
binary output as a 32-bit kernel.
Since `qemu` currently boots us through `multiboot`, a boot protocol
that does not support 64-bit kernels, add a new support script
to be run on the final image. This `python` script will convert the
Unikernel's `ELF64` headers to their `ELF32` equivalents, thus fooling
the loader into thinking it's a 32-bit kernel. Since most of the
addresses stored in the header fields are assumed to fit into at most
4 bytes, the script simply copies those values into their corresponding
`ELF32` header fields without truncating.
The actual `ELF32` header will be prepended to the final binary together
with the equivalent `ELF32` Program Headers and the Section Headers will
be appended to the end of the binary so that we can keep the `Multiboot`
header in the first 8192 bytes, as the specification requires. Although
the specification does not require the Program Headers to be in the first
8192 bytes, GRUB appears to expect them there as well.
The script will be called using a newly added `Makefile.rules` file meant
to contain definitions common to all platforms.
Since, at this time, the only supported boot protocol on x86 KVM QEMU is
`multiboot`, enable this script for x86 KVM QEMU Multiboot builds.
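The header conversion described in this commit can be sketched with
Python's `struct` module. This is an illustration under assumptions,
not the actual support script: little-endian headers are assumed,
setting `e_machine` to `EM_386` is an assumption about what the loader
expects, the new `e_phoff`/`e_shoff` values are passed in by the caller
because the layout changes, and the function names are hypothetical.

```python
import struct

EHDR64 = "<16sHHIQQQIHHHHHH"  # Elf64_Ehdr
EHDR32 = "<16sHHIIIIIHHHHHH"  # Elf32_Ehdr
# Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align
PHDR64 = "<IIQQQQQQ"
# Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align
PHDR32 = "<IIIIIIII"
ELF32_SHDR_SIZE = 40          # sizeof(Elf32_Shdr)

EI_CLASS, ELFCLASS32 = 4, 1
EM_386 = 3


def _check_fits_u32(*values):
    """The conversion copies values as-is, so they must fit in 4 bytes."""
    for value in values:
        if value > 0xFFFFFFFF:
            raise ValueError(f"{value:#x} does not fit in 32 bits")


def ehdr64_to_32(raw, phoff, shoff):
    """Build an Elf32_Ehdr from an Elf64_Ehdr and the new table offsets."""
    (ident, e_type, _machine, version, entry, _phoff, _shoff, flags,
     _ehsize, _phentsize, phnum, _shentsize, shnum,
     shstrndx) = struct.unpack(EHDR64, raw[:struct.calcsize(EHDR64)])
    _check_fits_u32(entry, phoff, shoff)
    ident = bytearray(ident)
    ident[EI_CLASS] = ELFCLASS32
    return struct.pack(EHDR32, bytes(ident), e_type, EM_386, version,
                       entry, phoff, shoff, flags,
                       struct.calcsize(EHDR32), struct.calcsize(PHDR32),
                       phnum, ELF32_SHDR_SIZE, shnum, shstrndx)


def phdr64_to_32(raw):
    """Build an Elf32_Phdr; note that p_flags moves to a later position."""
    (p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz,
     p_align) = struct.unpack(PHDR64, raw[:struct.calcsize(PHDR64)])
    _check_fits_u32(p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align)
    return struct.pack(PHDR32, p_type, p_offset, p_vaddr, p_paddr,
                       p_filesz, p_memsz, p_flags, p_align)
```

The sketch raises an error when a value does not fit, instead of
silently truncating it, which makes the "fits into 4 bytes" assumption
explicit.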
Sergiu Moga [Mon, 15 May 2023 06:44:37 +0000 (09:44 +0300)]
plat/kvm: Compute `ukplat_bootinfo` structure on the final binary
Since it is the final binary that is loaded and executed, make sure
that `build_bootinfo` is called on the final stripped binary, instead
of the debug image, whose segments' sizes may vary after stripping.
support/scripts/mkbootinfo.py: Ensure Program Headers are sorted
Due to the placement of the `tls` and `tls_load` Program Headers, the
last two memory regions inserted in the bootinfo structure will always
end up unordered. Thus, make sure the Program Headers are sorted by
address before building the memory regions and inserting them.
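A minimal sketch of the ordering step follows. The `ProgramHeader`
tuple, the `build_memory_regions` helper and the example addresses are
hypothetical and only illustrate sorting by address before the regions
are emitted.

```python
from collections import namedtuple

# Hypothetical, simplified view of a Program Header.
ProgramHeader = namedtuple("ProgramHeader", ["name", "paddr", "memsz"])


def build_memory_regions(phdrs):
    """Return (base, length) regions ordered by base address."""
    ordered = sorted(phdrs, key=lambda ph: ph.paddr)
    return [(ph.paddr, ph.memsz) for ph in ordered]


# Illustrative input in which the last two headers are out of order.
phdrs = [
    ProgramHeader("text", 0x100000, 0x3000),
    ProgramHeader("data", 0x106000, 0x1000),
    ProgramHeader("tls", 0x105000, 0x100),
    ProgramHeader("tls_load", 0x104000, 0x1000),
]
regions = build_memory_regions(phdrs)
```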
When the buddy allocator chooses a page order to allocate, it can (for
pathologically large values of num_pages) start from a value larger than
FREELIST_SIZE, leading to out-of-bounds access of the freelist array.
This change hardens the out-of-memory check to prevent this.
Signed-off-by: Andrei Tatar <andrei@unikraft.io>
Reviewed-by: Eduard-Florin Mihailescu <mihailescu.eduard@gmail.com>
Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #932
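The hardened out-of-memory check from the buddy allocator change above
can be illustrated with the sketch below. It is not the allocator's
code: the `FREELIST_SIZE` value, the boolean free list and the helper
names are assumptions chosen for the example.

```python
FREELIST_SIZE = 16  # assumed number of free lists, for illustration


def order_for(num_pages):
    """Smallest order such that 2**order >= num_pages."""
    return max(num_pages - 1, 0).bit_length()


def find_free_order(num_pages, free_list):
    """Return the first order with a free chunk, or None if out of memory."""
    order = order_for(num_pages)
    # Reject orders beyond the array before indexing it. In C, skipping
    # this check would read past the end of the free list array.
    if order >= FREELIST_SIZE:
        return None
    for i in range(order, FREELIST_SIZE):
        if free_list[i]:
            return i
    return None


free_list = [False] * FREELIST_SIZE
free_list[5] = True
```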
This adds a Kconfig option to enable sanity checking of buddy allocator
free lists at runtime. Check points are placed at the beginning and end
of functions that operate on the free lists. When the option is disabled
the checks default to zero-overhead no-ops.
In addition, this adds an assert that returned allocated pages are
correctly aligned to their buddy order.
Signed-off-by: Andrei Tatar <andrei@unikraft.io>
Reviewed-by: Eduard-Florin Mihailescu <mihailescu.eduard@gmail.com>
Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #932
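The alignment assertion added by the change above amounts to the
following check, shown here as a sketch that assumes 4 KiB pages; the
helper name is hypothetical.

```python
PAGE_SHIFT = 12  # assumes 4 KiB pages


def is_order_aligned(addr, order):
    """True if addr is aligned to the size of a 2**order page chunk."""
    chunk_size = 1 << (PAGE_SHIFT + order)
    return (addr & (chunk_size - 1)) == 0
```

For example, an order 2 chunk spans four pages, so its address must be
a multiple of 16 KiB.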
Simon Kuenzer [Tue, 8 Aug 2023 12:08:23 +0000 (14:08 +0200)]
Makefile: Include Makefile.build from main Makefile
This commit ensures that sub-'Makefile.build' files are included from
the main Makefile. This is done for consistency reasons.
This basically adopts
commit db20b80c47b2 ("Makefile: Allow external Makefile.build")
from GitHub PR #1005.
Signed-off-by: Simon Kuenzer <simon@unikraft.io>
Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #1028
Marco Schlumpp [Tue, 16 May 2023 14:45:49 +0000 (16:45 +0200)]
plat/kvm: Add ectx assertion in interrupt handler
This checks that the extended registers were not modified within the
interrupt service routine. The check is not free and can therefore be
toggled via a Kconfig option.
Signed-off-by: Marco Schlumpp <marco@unikraft.io>
Reviewed-by: Razvan Deaconescu <razvand@unikraft.io>
Approved-by: Simon Kuenzer <simon@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #897
Marco Schlumpp [Tue, 16 May 2023 14:43:21 +0000 (16:43 +0200)]
arch/x86: Add function to check for unmodified ectx
This can be used in situations where the executed code must not touch
the extended registers. For example, interrupt handlers in Unikraft do
not save these and therefore must not modify them either. Doing so
usually breaks the interrupted code.
Signed-off-by: Marco Schlumpp <marco@unikraft.io>
Reviewed-by: Razvan Deaconescu <razvand@unikraft.io>
Approved-by: Simon Kuenzer <simon@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #897
Now that any kind of volume can be mounted dynamically using a more
generic format on the command line, there is an unnecessary
overlap between `vfs.{rootfs, rootdef, rootops,rootflags}` and
defining `rootfs` related arguments through `vfs.fstab` volumes.
Therefore, deprecate `vfs.{rootfs, rootdef, rootops,rootflags}` in
favor of `vfs.fstab`.
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Radu Nichita <radunichita99@gmail.com>
Reviewed-by: Razvan Virtan <virtanrazvan@gmail.com>
Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Reviewed-by: Simon Kuenzer <simon@unikraft.io>
Approved-by: Simon Kuenzer <simon@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #979