Sergiu Moga [Mon, 3 Jun 2024 10:57:23 +0000 (13:57 +0300)]
plat/common/x86: Sanitize the ECTX slot on syscall entry
Commit c716bcca4822 ("{lib,arch,plat}: Redo syscall ctx's and swapgs logic"),
following a rework of architecture specific contexts and syscall entries,
by mistake removed the ECTX sanitization at the beginning of system calls.
This can result in #GP on x86 if the XSAVE header happens to be dirty.
Thus, bring this sanitization back.
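As a rough illustration of what sanitizing an ECTX slot can involve: on x86 the XSAVE area is a 512-byte legacy region followed by a 64-byte XSAVE header, and XRSTOR raises #GP when reserved header bytes are non-zero. The sketch below zeroes that header in a freshly reserved buffer. The name `ectx_sanitize_sketch` and the two constants are hypothetical and not taken from the Unikraft sources; the actual fix may do more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard XSAVE area layout: a 512-byte legacy (FXSAVE) region followed
 * by a 64-byte XSAVE header. The macro names are hypothetical.
 */
#define XSAVE_LEGACY_SIZE 512
#define XSAVE_HEADER_SIZE 64

/* Zero the XSAVE header of an extended-context buffer so that stale bytes
 * left in the slot cannot be misinterpreted when the context is restored.
 */
static void ectx_sanitize_sketch(uint8_t *ectx, size_t len)
{
	assert(len >= XSAVE_LEGACY_SIZE + XSAVE_HEADER_SIZE);
	memset(ectx + XSAVE_LEGACY_SIZE, 0, XSAVE_HEADER_SIZE);
}
```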
Michalis Pappas [Sat, 1 Jun 2024 14:50:53 +0000 (16:50 +0200)]
arch/arm64: Add checks for min clang version
Add conditionals for clang to fix the build when arch features are
enabled. Set min clang version to 14 on all features as that is the
first clang version that supports branch-protection on arm64, and
for the rest of the features the only version tested.
Signed-off-by: Michalis Pappas <michalis@unikraft.io> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Reviewed-by: Maria Sfiraiala <maria.sfiraiala@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1439
Michalis Pappas [Tue, 26 Dec 2023 16:01:14 +0000 (17:01 +0100)]
drivers/uktty/ns16550: Add early_init
Add early device init. When paging is enabled, this adds an mrd for
the ns16550 MMIO region to bootinfo so that the region is not unmapped
during paged memory init.
Michalis Pappas [Sun, 31 Mar 2024 15:42:14 +0000 (17:42 +0200)]
drivers/uktty/ns16550: Move driver initialization to uk_inittab
Move console initialization with the rest of devices at
UK_INIT_CLASS_SYS. Since the pf bus does not support priority
levels, and the console should start before the pf bus to
allow drivers to print their status, register directly with
init instead of the pf bus.
Michalis Pappas [Sun, 31 Mar 2024 15:31:01 +0000 (17:31 +0200)]
drivers/uktty/ns16550: Map device region at runtime
Map the ns16550 region at runtime. This is now required as paged memory
init unmaps any memory not registered by early devices; thus, if early
UART is not enabled, the device region is not mapped.
Michalis Pappas [Sat, 30 Mar 2024 13:34:43 +0000 (14:34 +0100)]
drivers/uktty/pl011: Add early init
Add early device init. When paging is enabled, this adds an mrd for
the pl011 MMIO region to bootinfo so that the region is not unmapped
during paged memory init.
Michalis Pappas [Sat, 30 Mar 2024 13:06:09 +0000 (14:06 +0100)]
drivers/uktty/pl011: Move driver initialization to inittab
Move console initialization with the rest of devices at
UK_INIT_CLASS_SYS. Since the pf bus does not support priority
levels, and the console should start before the pf bus to
allow drivers to print their status, register directly with
init instead of the pf bus.
Michalis Pappas [Sun, 31 Mar 2024 12:54:36 +0000 (14:54 +0200)]
drivers/uktty/pl011: Map region on runtime
Map the pl011 region at runtime. This is now required as paged memory
init unmaps any memory not registered by early devices; thus, if early
UART is not enabled, the device region is not mapped.
Michalis Pappas [Fri, 8 Mar 2024 05:22:12 +0000 (06:22 +0100)]
plat/kvm/arm: Add early init boot stage
Add boot stage for early initialization. This is invoked at the end
of early boot code, before passing control to the platform.
Early devices can use the APIs provided by the boot protocol to obtain
any information required, such as device regions and the kernel
command line.
Drivers that register with early_init() should append mrds of their
MMIO regions to bootinfo so that these regions are not unmapped
during paged memory init. These mrds must use the newly introduced
UKPLAT_MEMRT_DEVICE type.
Notice that early drivers should not call ukplat_bootinfo_coalesce(),
as mrd coalescing is performed once at the end of early_init().
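The flow described above (early drivers append device regions, coalescing happens once afterwards) can be sketched roughly as follows. All types and function names here are simplified, hypothetical stand-ins; they are not the real bootinfo structures, UKPLAT_MEMRT_DEVICE, or ukplat_bootinfo_coalesce().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for memory region descriptors. */
enum mr_type { MR_FREE, MR_DEVICE };

struct mr {
	uint64_t base;
	uint64_t len;
	enum mr_type type;
};

#define MR_MAX 16

struct mr_list {
	struct mr v[MR_MAX];
	size_t count;
};

/* Insert a region, keeping the list sorted by base address.
 * Returns 0 on success, -1 if the list is full.
 */
static int mr_insert(struct mr_list *l, struct mr r)
{
	size_t i;

	assert(l != NULL);
	if (l->count == MR_MAX)
		return -1;
	for (i = l->count; i > 0 && l->v[i - 1].base > r.base; i--)
		l->v[i] = l->v[i - 1];
	l->v[i] = r;
	l->count++;
	return 0;
}

/* Merge neighbours that touch and share a type. Meant to run once, after
 * every early driver has added its regions.
 */
static void mr_coalesce(struct mr_list *l)
{
	size_t in, out = 0;

	assert(l != NULL);
	for (in = 1; in < l->count; in++) {
		struct mr *prev = &l->v[out];
		struct mr *cur = &l->v[in];

		if (prev->type == cur->type &&
		    prev->base + prev->len == cur->base)
			prev->len += cur->len;
		else
			l->v[++out] = *cur;
	}
	if (l->count)
		l->count = out + 1;
}
```

Running the merge a single time at the end keeps the early drivers simple: each one only appends its own region and never has to reason about the overall list.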
Michalis Pappas [Wed, 15 May 2024 13:30:15 +0000 (15:30 +0200)]
plat/kvm/x86: Coalesce bootinfo at EFI post
With bootinfo coalesce having moved out of EFI common code,
and with kvm/x86 lacking an early init bootstage, do the coalescing
as the last part at EFI post before jumping to kernel.
Remove coalescing from EFI bootinfo setup to allow coalescing to
happen in a boot-protocol-agnostic way at early_init(), after the
initialization of early devices.
Michalis Pappas [Tue, 14 May 2024 16:01:45 +0000 (18:01 +0200)]
plat/common: Move coalesce out of bootinfo fdt setup
Move coalescing out of bootinfo fdt setup to allow coalescing to happen
in a boot-protocol-agnostic way at early_init(), after the initialization
of early devices.
Break down bootinfo_fdt_setup() into pre- and post-coalesce functions,
as the latter calls ukplat_memory_alloc(), which operates on ordered
regions.
Michalis Pappas [Mon, 25 Dec 2023 12:06:41 +0000 (13:06 +0100)]
plat/common: Add UKPLAT_MEMRT_DEVICE type
Regions of this type are added by device drivers that implement
an early init. Specifically, upon completion of the earlyinit boot
stage, device regions are expected to be mapped with appropriate
protections, and additionally be added to bootinfo using the
UKPLAT_MEMRT_DEVICE type.
drivers/ukbus/platform: Update uk_bus_pf_devmap() to operate per-page
Update uk_bus_pf_devmap() to handle a multi-page region page-by-page,
to avoid an error caused by a partially mapped device region, which
would cause ukplat_page_map() to return EEXIST and in turn cause
the subsequent ukplat_page_set_attr() to fail.
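A minimal sketch of the per-page idea, using a toy page table in place of the real paging API. The helpers `page_map_one()` and `page_set_attr_one()` are hypothetical, and treating an already mapped page as non-fatal is an assumption about how the per-page loop behaves, not a copy of the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_COUNT 8

/* Toy page table: one flag per page, a stand-in for the real paging API. */
static bool mapped[PAGE_COUNT];
static bool device_attr[PAGE_COUNT];

/* Hypothetical single-page primitives. */
static int page_map_one(size_t pg)
{
	assert(pg < PAGE_COUNT);
	if (mapped[pg])
		return -EEXIST;
	mapped[pg] = true;
	return 0;
}

static int page_set_attr_one(size_t pg)
{
	assert(pg < PAGE_COUNT);
	if (!mapped[pg])
		return -ENOENT;
	device_attr[pg] = true;
	return 0;
}

/* Map [first, first + n) page by page. A page that is already mapped is
 * not an error: its attributes are updated like those of the other pages.
 */
static int devmap_per_page(size_t first, size_t n)
{
	size_t pg;
	int rc;

	if (first > PAGE_COUNT || n > PAGE_COUNT - first)
		return -EINVAL;

	for (pg = first; pg < first + n; pg++) {
		rc = page_map_one(pg);
		if (rc && rc != -EEXIST)
			return rc;
		rc = page_set_attr_one(pg);
		if (rc)
			return rc;
	}
	return 0;
}
```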
plat/common: Move vaddr check to callers of pgarch_page_mapx()
Move vaddr check from pgarch_page_mapx() to its callers, as that
function is also used to map the direct-mapped region, the vaddr
of which is past (__VADDR_MAX - len).
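The split of responsibilities can be pictured as below. This is only a sketch: `VADDR_MAX_SKETCH` is a hypothetical limit standing in for __VADDR_MAX, and the three functions are invented stand-ins for pgarch_page_mapx() and its callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit standing in for __VADDR_MAX. */
#define VADDR_MAX_SKETCH 0x0000ffffffffffffULL

/* Overflow-safe check that vaddr is not past (limit - len). */
static bool vaddr_range_ok(uint64_t vaddr, uint64_t len)
{
	return len <= VADDR_MAX_SKETCH && vaddr <= VADDR_MAX_SKETCH - len;
}

/* Stand-in for the low-level mapper: it no longer checks the range. */
static int mapx_sketch(uint64_t vaddr, uint64_t len)
{
	assert(len > 0); /* the sketch expects a non-empty range */
	(void)vaddr;
	return 0;
}

/* A regular caller validates the range before mapping... */
static int map_checked(uint64_t vaddr, uint64_t len)
{
	if (!vaddr_range_ok(vaddr, len))
		return -1;
	return mapx_sketch(vaddr, len);
}

/* ...while the direct-map setup calls the mapper without that check. */
static int map_directmap(uint64_t vaddr, uint64_t len)
{
	return mapx_sketch(vaddr, len);
}
```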
Rework the initialization of paged memory to provide a more flexible
implementation that is capable of handling regions beyond the limits
defined in the boot pagetables. The motivation for this change is to
allow mapping device regions that are unknown at compile-time, such
as Unprotected IPA Alias regions of Arm CCA Realms, the address of
which depends on the executing platform.
Under the new scheme bootinfo is reduced to only contain mrds that
correspond to valid memory regions. This deprecates the unmap_mrd
region and the UKPLAT_MEMRF_MAP / UKPLAT_MEMRF_UNMAP mrd flags.
Moreover, the boot pagetables are no longer updated during paged
memory init, but instead are replaced with a new pagetable that is
initialized with the regions defined in bootinfo. Besides the
additional flexibility, this implementation has the potential of
some performance improvement as it removes expensive TLB flush
operations associated with unmap.
arch/arm64: Add definitions for block-size mappings
VMSAv8-64 does not provide a naming scheme for the block
size mapped by PT block descriptors at various translation
levels. Moreover, the block size varies depending on the
size of the translation granule.
To provide granularity agnostic definitions, use the
x86_64 terminology of Large / Huge pages.
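To illustrate how the mapped size depends on the granule, the sketch below computes the size covered by one descriptor at a given translation level, assuming 8-byte descriptors and four levels (0 to 3). The function name is hypothetical. Whether a block descriptor is actually permitted at a given level depends on the granule and is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes mapped by one descriptor at a given translation level
 * (0..3) for a translation granule of 4 KiB, 16 KiB or 64 KiB.
 * Each table holds granule / 8 descriptors, so every level above the
 * last one multiplies the mapped size by that factor.
 */
static uint64_t desc_map_size(uint64_t granule, unsigned int level)
{
	uint64_t entries = granule / 8; /* 8-byte descriptors */
	uint64_t size = granule;
	unsigned int l;

	assert(level <= 3);
	for (l = 3; l > level; l--)
		size *= entries;
	return size;
}
```

With a 4 KiB granule this gives 2 MiB at level 2 and 1 GiB at level 1, which matches the x86_64 large and huge page sizes; with 16 KiB and 64 KiB granules the level 2 size becomes 32 MiB and 512 MiB respectively.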
plat/common/arm64: Set DIRECTMAP_AREA_END to the end of low VA range
The direct-mapped area maps the first 512GiB of the address space
to an architecture-defined region. In arm64 that uses the highest
512GiB of the low VA range. Update DIRECTMAP_AREA_END to correctly
specify the end of the low VA range.
Andrei Stan [Wed, 29 May 2024 20:56:33 +0000 (23:56 +0300)]
plat/xen: Remove redundant memory region
The reserved virtual address space for mappings was added in the global
list of memory regions. The constraint for tracked regions to be
page aligned caused the max physical address to get rounded to 0x0.
This removes the region from the list, thus side stepping the requirement.
Signed-off-by: Andrei Stan <andreistan2003@gmail.com> Reviewed-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Razvan Virtan <virtanrazvan@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1434
Andrei Tatar [Fri, 31 May 2024 13:45:19 +0000 (15:45 +0200)]
lib/posix-poll: Fix finalizer duplication in epoll
Commit 7b2e38171 (lib/posix-poll: Autoremove closed files from epoll)
introduced a logic error that would register a file finalizer on every
call to EPOLL_CTL_MOD. This led to use-after-free errors when a file
was removed from epoll.
This change is a quick fix to this error. A more elegant refactoring of
epoll code should be considered when vfscore shim is no longer required.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1438
Andrei Tatar [Wed, 27 Mar 2024 14:06:40 +0000 (15:06 +0100)]
lib/ukfile: Remove padding from struct uk_statx
This change removes the padding fields from uk_statx, minimizing memory
waste when allocating inside the kernel.
Syscalls never reveal these kernel structs, and userspace allocates a
struct statx dictated by its libc.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1387
This assertion checks for mrd->pg_count parameter that was introduced
in the following commit: ad52a90f (uk/plat/memory: Introduce `pg_off` and `pg_count` memregion
fields, 2023-10-28)
Sergiu Moga [Sun, 26 May 2024 10:28:45 +0000 (13:28 +0300)]
{include, plat/common}: Add asm guards in `essentials.h`
Some macros of `essentials.h` can be used in assembly sources as well,
while others not so much. Allow one to safely include and use the former
by adding assembly guards in the header for the macros in question.
Sergiu Moga [Sun, 3 Mar 2024 15:00:20 +0000 (17:00 +0200)]
{lib,arch,plat}: Redo syscall ctx's and `swapgs` logic
To make git bisecting and rebasing significantly easier and avoid
builds breaking across commits, this whole set of changes shall be
introduced under one single all encompassing commit.
Following the introduction of the concept of auxiliary stack pointers,
swapgs, `struct uk_syscall_ctx` and `struct ukarch_sysregs`, a number
of things have emerged:
- the aforementioned structs are very generic so they should be moved
under libcontext (arch/)
- swapgs introduces a significant inconsistency between ARM64 and x86_64
as we never know during an exception the state of
MSR_GS_BASE/MSR_KERNEL_GS_BASE
- auxiliary stack pointers have increased flexibility as every thread
and LCPU can have one and have private data stored in there that may
be accessed anytime, dependency free
Thus, this commit does the following:
1. Move/rename aforementioned structures to libcontext and document them
- lib/syscall_shim/arch/x86_64/sysregs.c -> arch/x86/sysctx.c
- lib/syscall_shim/arch/x86_64/include/arch/sysregs.h -> arch/x86/x86_64/include/uk/asm/sysctx.h
- s/struct ukarch_sysregs/struct ukarch_sysctx/ (and all related defs)
- struct uk_syscall_ctx from lib/syscall_shim/include/uk/syscall.h to
include/uk/arch/ctx.h as struct ukarch_execenv
- s/struct uk_syscall_ctx/struct ukarch_execenv/ (and all related defs)
- actually comment these functions
- re-adjust all places that make use of such definitions
2. Get rid of the `swapgs`, architecture specific holdback by exploiting
the flexibility of auxiliary stacks through the introduction of a new
always existing control block at their top end:
- introduce `struct ukarch_auxspcb` under libcontext
- add Unikraft system context as field to it so that we always have and
know Unikraft TLS (and LCPU in case of x86_64) in a dependency free
and assumption free manner
- add a current frame pointer field: since the auxspcb will be part of
the auxiliary stack, we need to know the safe place where we can start
using the auxiliary stack area as a stack (this is also helpful in cases
where we need to nest on the auxstack)
- for the aforementioned fields/structs, init/getter/setter functions have
been added and documented
- now the `swapgs` pair will only be done very early during system call
entry (and only there, not on clone child exit anymore either) just
enough so that we, first things first, switch to auxstack and push auxsp
so that on entry to C handler we will know that we must do a call to
`ukarch_sysctx_load` on the Unikraft sysctx we can get from the pushed
auxsp (another benefit of this is we get rid of MSR read/writes)
IMPORTANT NOTE: Additionally, some minor fixes have been made:
- Do not switch stack pointer to execenv pointer (previously
known as uk_syscall_ctx) during execenv loading as this implies that
functions such as `ukarch_ectx_load` or `ukarch_sysctx_load` would reuse
the space after the execenv as stack. While this is safe if the
execenv was passed through the stack, it is definitely not safe if it was
passed through something like a heap buffer that may be bounded to the
execenv size by the caller. Instead, use one of the callee-saved
registers
- Set IRQ flag of the pushed flags of the caller during system call
early assembly entry (both native and binary for both architectures)
so that we don't have to explicitly set it during something like clone
child creation. This also reflects the reality better as no syscall
caller will have IRQ's disabled.
- Do not use spsr_el1, esr_el1 and elr_el1 during native system call
assembly prologue (UK_SYSCALL_EXECENV_PROLOGUE_DEFINE) on Arm, as they
are invalid because there is no actual SVC/exception happening. Instead,
try to emulate it by manually building sane values for them on the
created execenv to replicate an actual SVC while benefitting from not
dealing with the performance impacting flow of actually taking a SVC.
Andrei Tatar [Wed, 29 May 2024 12:13:23 +0000 (14:13 +0200)]
lib/ukfile: Ensure finalizers run after destructor
This change ensures that file finalizers are executed after the main
file destructor when the last strong reference to a file is released.
Finalizers may themselves release weak references, which in turn may
trigger the file destructor. Previously this could lead to destructors
being called in the wrong order.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Reviewed-by: Eduard Vintilă <eduard.vintila47@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1419
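The ordering described in the change above can be sketched with a toy file object. Everything here (`struct file_sketch`, `file_add_finalizer()`, `file_release()`) is hypothetical and far simpler than ukfile: weak references and memory lifetime are not modeled, and the sketch only shows the destructor running before the finalizers once the last strong reference is dropped.

```c
#include <assert.h>
#include <stddef.h>

#define FIN_MAX 4

struct file_sketch;
typedef void (*file_cb)(struct file_sketch *f);

/* Hypothetical file object with a strong refcount and finalizers. */
struct file_sketch {
	int strong;
	file_cb dtor;
	file_cb fin[FIN_MAX];
	size_t nfin;
};

/* Register a finalizer; returns 0 on success, -1 if no slot is free. */
static int file_add_finalizer(struct file_sketch *f, file_cb cb)
{
	if (f->nfin == FIN_MAX)
		return -1;
	f->fin[f->nfin++] = cb;
	return 0;
}

/* Drop one strong reference. On the last one, run the destructor first
 * and only then the finalizers, in registration order.
 */
static void file_release(struct file_sketch *f)
{
	size_t i;

	assert(f->strong > 0);
	if (--f->strong > 0)
		return;
	if (f->dtor)
		f->dtor(f);
	for (i = 0; i < f->nfin; i++)
		f->fin[i](f);
}

/* Small event log used to observe the call order. */
static char events[8];
static size_t nevents;

static void log_dtor(struct file_sketch *f)
{
	(void)f;
	events[nevents++] = 'D';
}

static void log_fin(struct file_sketch *f)
{
	(void)f;
	events[nevents++] = 'F';
}
```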
Andrei Tatar [Wed, 17 Apr 2024 17:12:02 +0000 (19:12 +0200)]
lib/posix-fdio: Add bincompat support for RWF_*
This change adds values for RWF_* flags in posix-fdio, allowing it to
interpret their meaning even without support from our (no)libc.
This enhances binary compatibility.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Mihnea Firoiu <mihneafiroiu0@gmail.com> Reviewed-by: Robert Zamfir <georobi.016@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1394
Andrei Tatar [Tue, 7 May 2024 16:33:53 +0000 (18:33 +0200)]
lib/posix-fdio: Allow owner/group == -1 for fchown
This change adds support in fchown for the owner or group to be passed
as -1, in which case that particular field is left unchanged.
This mimics the behavior of Linux.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1416
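For the fchown change above, the -1 convention can be sketched as follows. `struct owner_sketch` and `chown_apply()` are hypothetical names used only for illustration; the real code in posix-fdio operates on its own file state.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical in-kernel ownership record. */
struct owner_sketch {
	uid_t uid;
	gid_t gid;
};

/* Apply fchown-style arguments: an ID equal to -1 (all bits set once
 * converted to the ID type) leaves the corresponding field untouched.
 */
static void chown_apply(struct owner_sketch *o, uid_t owner, gid_t group)
{
	assert(o != NULL);
	if (owner != (uid_t)-1)
		o->uid = owner;
	if (group != (gid_t)-1)
		o->gid = group;
}
```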
Andrei Tatar [Thu, 1 Feb 2024 17:05:43 +0000 (18:05 +0100)]
lib/posix-tty: Add core tty ioctls to serial files
This change adds support for essential tty-specific ioctl commands to
the serial file implementation of `ctl`. These operations are either
no-ops or return a sensible description of the properties of the serial
file.
Checkpatch-Ignore: ENOSYS Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Marco Schlumpp <marco@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1226
Andrei Tatar [Tue, 16 Jan 2024 14:44:19 +0000 (15:44 +0100)]
lib/posix-tty: Add stat support to tty files
This change adds support to tty files for the stat family of syscalls.
Returned values are a subset of what Linux provides, missing extended
attributes as well as timestamps. Where applicable, values match those
returned by Linux.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Marco Schlumpp <marco@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1226
Andrei Tatar [Thu, 21 Dec 2023 20:32:43 +0000 (22:32 +0200)]
lib/*: Move stdio out of vfscore into posix-tty
This change moves stdio initialization from vfscore into posix-tty,
replacing the legacy stdin/out/err files with newvfs versions.
In addition, this move allows differing file types, either pseudofiles
or serial console, to be assigned independently to stdin and stdout/err.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Marco Schlumpp <marco@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1226
Andrei Tatar [Thu, 21 Dec 2023 20:20:20 +0000 (22:20 +0200)]
lib/posix-tty: Introduce posix-tty library
This change introduces the posix-tty library, tasked with implementing
newvfs files for use as standard in/out/err.
The initial implementation provides drivers for pseudo-files (null,
void, and zero) as well as platform-specific serial console, akin to the
stdio submodule of legacy vfscore.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Marco Schlumpp <marco@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1226
Sergiu Moga [Sun, 18 Feb 2024 09:22:10 +0000 (11:22 +0200)]
plat/kvm/x86: Add early COM1 init/print for CPU init errors
Usually early boot failures tend to be very confusing since there
is no message printed. To ease figuring out what went wrong, implement
a very basic early initialization macro for the COM1 port as well as
a corresponding printing macro that can be used before having a stack.
As a first use case of these newly added macros, print an error message
when failing early CPU features initialization, right before halting the
system.
Andrei Tatar [Thu, 22 Feb 2024 18:08:56 +0000 (19:08 +0100)]
lib/posix-socket: Expose internal socket syscalls
This change exposes Unikraft-internal syscalls that create sockets.
Both versions returning raw uk_files as well as opened file descriptors
are provided.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Mihnea Firoiu <mihneafiroiu0@gmail.com> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1337
Andrei Tatar [Wed, 22 May 2024 12:09:40 +0000 (14:09 +0200)]
lib/posix-unixsocket: Add address sendmsg support
This change adds support for specifying a destination address in a
`sendmsg` call to a connection-free unix socket. The address is looked
up the same as would be done for `connect`.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1421
Andrei Tatar [Wed, 22 May 2024 12:05:25 +0000 (14:05 +0200)]
lib/posix-unixsocket: Fix mismatched locks
This change fixes a lock/unlock pair with mismatched files in `sendmsg`,
probably introduced by a typo, preventing both crashes and inconsistent
lock state.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Delia Pavel <delia_maria.pavel@stud.acs.upb.ro> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1421
Andrei Tatar [Thu, 22 Feb 2024 17:12:54 +0000 (18:12 +0100)]
lib/posix-timerfd: Replace time syscalls
This change replaces the use of userspace time syscalls in posix-timerfd
with calls to Unikraft-internal syscalls, eliminating an undeclared
dependency on syscall-shim.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Mihnea Firoiu <mihneafiroiu0@gmail.com> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1336
Andrei Tatar [Thu, 22 Feb 2024 17:07:21 +0000 (18:07 +0100)]
lib/posix-time: Add internal syscall interface
This change adds Unikraft-internal syscalls (uk_sys_*) to posix-time,
allowing the use of time functions without either a libc or
syscall-shim selected.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Mihnea Firoiu <mihneafiroiu0@gmail.com> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1336
As agreed among the Unikraft maintainers, this commit removes the Linux
userspace platform target (including the tap netdev driver), that was
originally intended for debugging purposes. As there are ongoing efforts
in the Unikraft community to drastically improve the debugging experience
on all hypervisor platforms, there is no good reason to keep the
maintenance effort for the linuxu platform.
This platform already had a large backlog of features.
Signed-off-by: Simon Kuenzer <simon@unikraft.io> Reviewed-by: Marco Schlumpp <marco@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1422
Andrei Tatar [Wed, 17 Apr 2024 13:04:49 +0000 (15:04 +0200)]
lib/posix-fdio: Support VA args for all fcntl cmds
This change adds support to the `fcntl` libc wrapper for fetching the
optional argument for all known fcntl cmd values.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Robert Zamfir <georobi.016@gmail.com> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Simon Kuenzer <simon@unikraft.io>
GitHub-Closes: #1392
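For the fcntl change above, a variadic wrapper of this kind typically reads the optional argument with the type that the command expects. The sketch below covers only a handful of standard commands; `fcntl_sketch()`, `fcntl_backend()` and `last_arg` are hypothetical and do not reflect the actual posix-fdio wrapper or the full list of commands it handles.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical backend taking the optional argument as a plain integer. */
static intptr_t last_arg;

static int fcntl_backend(int fd, int cmd, intptr_t arg)
{
	assert(fd >= 0);
	(void)cmd;
	last_arg = arg;
	return 0;
}

/* Variadic wrapper: read the optional argument with the type that the
 * command expects, or pass 0 for commands that take none.
 */
static int fcntl_sketch(int fd, int cmd, ...)
{
	intptr_t arg = 0;
	va_list ap;

	va_start(ap, cmd);
	switch (cmd) {
	case F_DUPFD:
	case F_SETFD:
	case F_SETFL:
		/* integer argument */
		arg = va_arg(ap, int);
		break;
	case F_GETLK:
	case F_SETLK:
	case F_SETLKW:
		/* pointer to struct flock */
		arg = (intptr_t)va_arg(ap, struct flock *);
		break;
	default:
		/* e.g. F_GETFD, F_GETFL: no argument to fetch */
		break;
	}
	va_end(ap);

	return fcntl_backend(fd, cmd, arg);
}
```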
Andrei Tatar [Wed, 17 Apr 2024 12:44:38 +0000 (14:44 +0200)]
lib/posix-fdio: Clean up pread/pwrite aliasing
This change reworks the libc function aliasing for pread(64) and
pwrite(64), simplifying it.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Robert Zamfir <georobi.016@gmail.com> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Simon Kuenzer <simon@unikraft.io>
GitHub-Closes: #1392
Andrei Tatar [Wed, 17 Apr 2024 12:20:10 +0000 (14:20 +0200)]
lib/posix-fdio: Move over libc funcs from vfscore
This change moves the implementations of non-trivial libc wrapper
functions for file-related syscalls from vfscore into posix-fdio, where
these syscalls are actually implemented.
This was an oversight of the original posix-fdio work.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Robert Zamfir <georobi.016@gmail.com> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Simon Kuenzer <simon@unikraft.io>
GitHub-Closes: #1392
Andrei Tatar [Mon, 22 Jan 2024 14:37:30 +0000 (15:37 +0100)]
lib/posix-unixsocket: Add warning for 0-len dgrams
This change adds a warning to the send operation of unixsockets when a
datagram of zero length is attempted to be sent.
Currently unixsockets do not support 0-length datagrams and will
otherwise silently drop these packets. This is due to internal
implementation details that can be addressed when (and if) 0-length
unixsocket datagrams are relied on by workloads.
The warning then serves as a compatibility reminder in misbehaving apps.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1272
Andrei Tatar [Thu, 22 Feb 2024 12:55:39 +0000 (13:55 +0100)]
lib/posix-pipe: Implement O_DIRECT (packet) pipes
This change replaces the implementation of pipe buffers with one that
supports both stream- and packet-mode communication, selected by using
the O_DIRECT flag.
Previously, this had been stubbed as an internal property of pipes, a
design that breaks the separation between files and open file
descriptions. This is now corrected, updating the API and its sole
consumer, posix-unixsocket.
Andrei Tatar [Thu, 7 Mar 2024 20:28:19 +0000 (21:28 +0100)]
include/uk: Add raw key RB trees in tree.h
This change adds the possibility to generate a RB tree that performs
lookups using raw keys instead of full-fledged tree nodes for
comparisons. This is achieved by providing both cmp and key functions.
- key(node) -> key_type
- cmp(key_type, key_type) -> int
The API is left compatible with the old approach using an implicit
identity key function, and where key_type is the same as node.
Checkpatch ignores to maintain consistent style within the file.
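The key/cmp split described above can be illustrated without the tree.h macros. The sketch below uses a plain unbalanced binary search tree, not a red-black tree, and all names (`struct ent`, `ent_key()`, `ent_cmp()`, `ent_insert()`, `ent_find()`) are hypothetical; it only shows how a lookup can compare a raw key against key(node) instead of building a dummy node.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node type; `name` is the raw key stored inside each node. */
struct ent {
	const char *name;
	int value;
	struct ent *left, *right;
};

/* key(node) -> key_type */
static const char *ent_key(const struct ent *e)
{
	return e->name;
}

/* cmp(key_type, key_type) -> int */
static int ent_cmp(const char *a, const char *b)
{
	return strcmp(a, b);
}

/* Insert into a plain (unbalanced) binary search tree ordered by key. */
static void ent_insert(struct ent **root, struct ent *e)
{
	assert(root != NULL && e != NULL);
	e->left = e->right = NULL;
	while (*root) {
		if (ent_cmp(ent_key(e), ent_key(*root)) < 0)
			root = &(*root)->left;
		else
			root = &(*root)->right;
	}
	*root = e;
}

/* Look up by raw key: no dummy node has to be built for the comparison. */
static struct ent *ent_find(struct ent *root, const char *key)
{
	while (root) {
		int c = ent_cmp(key, ent_key(root));

		if (c == 0)
			return root;
		root = c < 0 ? root->left : root->right;
	}
	return NULL;
}
```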
Marco Schlumpp [Mon, 24 Apr 2023 15:26:10 +0000 (17:26 +0200)]
include/uk: Add splay/RB tree implementation from FreeBSD
These can be used to implement ordered collections of structures.
Taken from FreeBSD 13.3.0 with the following modifications:
- Unikraft header guard format (__UK_TREE_H__)
- replaced FreeBSD types with Unikraft-internal
- prefixed all macros with UK_
Checkpatch ignores to leave code as close to upstream as possible.
Andrei Tatar [Wed, 7 Feb 2024 15:28:56 +0000 (16:28 +0100)]
lib/posix-unixsocket: Add basic *sockopt support
This change adds getsockopt/setsockopt support for basic socket options
from the SOL_SOCKET family. There are two main types of options added:
- Read-only opts about socket state (e.g. SO_ACCEPTCONN)
- No-op opts for benign unsupported features
Checkpatch-Ignore: ENOSYS Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Eduard Vintilă <eduard.vintila47@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1314
Andrei Tatar [Thu, 8 Feb 2024 16:44:13 +0000 (17:44 +0100)]
lib/posix-poll: Add option to yield on wait
This change adds a Kconfig option, LIBPOSIX_POLL_YIELD, that when set
ensures that execution is yielded at the beginning of every call to
epoll_wait (as well as select and poll).
This can aid compatibility with apps that assume a starvation-free
scheduler.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Mihnea Firoiu <mihneafiroiu0@gmail.com> Reviewed-by: Radu Nichita <radunichita99@gmail.com> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1319
Andrei Tatar [Wed, 17 Apr 2024 17:25:23 +0000 (19:25 +0200)]
lib/ukfile: Add utility inlines for iovec I/O
This change adds a utility header providing convenience inlines for
doing I/O on buffers described by struct iovec, namely:
- zero out
- scatter data from buffer to iov
- gather data from iov into buffer
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Eduard Vintilă <eduard.vintila47@gmail.com> Approved-by: Marco Schlumpp <marco@unikraft.io>
GitHub-Closes: #1396
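For the iovec helpers above, the three operations can be sketched as below, using the standard `struct iovec` from <sys/uio.h>. The function names and signatures are hypothetical and may differ from the inlines actually provided by ukfile.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Zero every buffer described by the iovec array. */
static void iov_zero_sketch(const struct iovec *iov, int iovcnt)
{
	int i;

	assert(iov != NULL || iovcnt == 0);
	for (i = 0; i < iovcnt; i++)
		memset(iov[i].iov_base, 0, iov[i].iov_len);
}

/* Scatter up to `len` bytes from a flat buffer into the iovec array.
 * Returns the number of bytes copied.
 */
static size_t iov_scatter_sketch(const struct iovec *iov, int iovcnt,
				 const void *buf, size_t len)
{
	const char *src = buf;
	size_t done = 0;
	int i;

	for (i = 0; i < iovcnt && done < len; i++) {
		size_t n = iov[i].iov_len;

		if (n > len - done)
			n = len - done;
		memcpy(iov[i].iov_base, src + done, n);
		done += n;
	}
	return done;
}

/* Gather up to `len` bytes from the iovec array into a flat buffer.
 * Returns the number of bytes copied.
 */
static size_t iov_gather_sketch(void *buf, size_t len,
				const struct iovec *iov, int iovcnt)
{
	char *dst = buf;
	size_t done = 0;
	int i;

	for (i = 0; i < iovcnt && done < len; i++) {
		size_t n = iov[i].iov_len;

		if (n > len - done)
			n = len - done;
		memcpy(dst + done, iov[i].iov_base, n);
		done += n;
	}
	return done;
}
```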
Andrei Tatar [Fri, 23 Feb 2024 13:55:38 +0000 (14:55 +0100)]
lib/ukfile: Add opt-in support for file finalizers
This change adds optional support for file finalizers -- custom
functions registered to run when the last strong reference to a file is
released. These can be useful for e.g., automatically removing a closed
file from a polling pool.
Since this feature adds some overhead and may not be always required, it
is gated behind the LIBUKFILE_FINALIZERS config option. With this option
in its default disabled state, behavior and mem usage is as before.
This commit changes the driver API of ukfile, specifically its refcount
initializers. Affected consumers of ukfile have also been patched.
Checkpatch-Ignore: MACRO_ARG_REUSE
Checkpatch-Ignore: TRAILING_STATEMENTS Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Maria Pana <maria.pana4@gmail.com> Reviewed-by: Stefan Jumarea <stefanjumarea02@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1341
Andrei Tatar [Thu, 14 Mar 2024 15:44:48 +0000 (16:44 +0100)]
lib/ukcpio: Overwrite destination if exists
This change adds logic to CPIO extraction that attempts to remove or
rename an existing destination path. Specifically:
- regular files and symlinks will be unlinked
- empty directories will be removed
- non-empty directories will be renamed to NAME.0
When extracting a directory on top of an existing directory, the latter
is not removed or renamed, and only has mode bits adjusted.
We choose renaming over recursive deletion because:
- replacing directories with other files should ideally be a rare event
- recursive deletion, while storage efficient, is nontrivial and costly
to perform; renaming OTOH is fast but wasteful
- we value boot latency in Unikraft, thus picking rename
- this tradeoff should be revisited if/when either (1) we have efficient
recursive directory removal or (2) we value storage footprint over
boot latency
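The overwrite policy listed above can be summarized as a small decision function. This is only a sketch of the rules as stated in this message; the enum and function names are hypothetical and the real extraction code performs the filesystem operations directly.

```c
#include <assert.h>
#include <stdbool.h>

enum node_kind { NODE_NONE, NODE_FILE, NODE_SYMLINK, NODE_DIR };

enum cpio_action {
	ACT_CREATE,     /* nothing in the way */
	ACT_UNLINK,     /* unlink existing file or symlink first */
	ACT_RMDIR,      /* remove existing empty directory first */
	ACT_RENAME,     /* rename existing non-empty directory to NAME.0 */
	ACT_CHMOD_ONLY  /* directory over directory: adjust mode bits only */
};

/* Decide how to handle an existing destination before extracting an
 * entry of kind `incoming` on top of it.
 */
static enum cpio_action cpio_decide(enum node_kind existing,
				    bool existing_empty,
				    enum node_kind incoming)
{
	assert(incoming != NODE_NONE);

	if (existing == NODE_NONE)
		return ACT_CREATE;
	if (existing == NODE_DIR && incoming == NODE_DIR)
		return ACT_CHMOD_ONLY;
	if (existing == NODE_DIR)
		return existing_empty ? ACT_RMDIR : ACT_RENAME;
	return ACT_UNLINK; /* regular file or symlink */
}
```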
Andrei Tatar [Thu, 14 Mar 2024 15:38:47 +0000 (16:38 +0100)]
lib/ukcpio: Remove special handling of "."
This change removes the special handling of "." on cpio extraction, as
it introduced an unnecessary strcmp on every path, prevented mode bits
from being applied on the destination root, and produced a warning at
runtime.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> Reviewed-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Eduard Vintilă <eduard.vintila47@gmail.com>
GitHub-Closes: #1362
Andrei Tatar [Wed, 17 Apr 2024 16:13:23 +0000 (18:13 +0200)]
lib/{posix-*,ukfile}: Add ukatomic dependency
This change adds a Kconfig dependency to ukatomic on several libraries
that were written before the ukatomic split-off. This makes their use
of atomic operations explicit.
Signed-off-by: Andrei Tatar <andrei@unikraft.io> Reviewed-by: Robert Zamfir <georobi.016@gmail.com> Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1393