Palmer Dabbelt [Wed, 20 Mar 2024 00:09:05 +0000 (17:09 -0700)]
Merge patch series "riscv: Introduce compat-mode helpers & improve arch_get_mmap_end()"
Leonardo Bras <leobras@redhat.com> says:
I just saw the opportunity of optimizing the helper is_compat_task() by
introducing a compile-time test, and it made it possible to remove some
#ifdef's without any loss of performance.
I also saw the possibility of removing the direct check of task flags from
general code, and concentrated it in asm/compat.h by creating a few more
helpers, which in the end helped optimize code.
arch_get_mmap_end() just got a simple improvement and some extra docs.
* b4-shazam-merge:
riscv: Introduce set_compat_task() in asm/compat.h
riscv: Introduce is_compat_thread() into compat.h
riscv: add compile-time test into is_compat_task()
riscv: Replace direct thread flag check with is_compat_task()
riscv: Improve arch_get_mmap_end() macro
Sunil V L [Thu, 18 Jan 2024 06:29:30 +0000 (11:59 +0530)]
ACPI: Enable ACPI_PROCESSOR for RISC-V
The ACPI processor driver is not currently enabled for RISC-V.
This is required to enable CPU related functionalities like
LPI and CPPC. Hence, enable ACPI_PROCESSOR for RISC-V.
Sunil V L [Thu, 18 Jan 2024 06:29:28 +0000 (11:59 +0530)]
cpuidle: RISC-V: Move few functions to arch/riscv
To support ACPI Low Power Idle (LPI), a few functions are required which
are currently static functions in the DT-based cpuidle driver. Hence,
move them under arch/riscv so that the ACPI driver can also use them.
Since they are no longer static functions, add a "riscv_" prefix to the
function names.
Leonardo Bras [Wed, 3 Jan 2024 16:00:21 +0000 (13:00 -0300)]
riscv: add compile-time test into is_compat_task()
Currently several places will test for CONFIG_COMPAT before testing
is_compat_task(), probably in order to avoid a run-time test into the task
structure.
Since is_compat_task() is an inlined function, it would be helpful to add a
compile-time test of CONFIG_COMPAT, making sure it always returns zero when
the option is not enabled during the kernel build.
With this, the compiler can determine at build time that
is_compat_task() will always return 0, and optimize out some of the extra
code introduced by the option.
This will also allow removing a lot of the #ifdefs that were introduced,
and make the code cleaner.
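The effect described above can be illustrated with a small userspace
sketch. It is not the kernel code: DEMO_CONFIG_COMPAT, struct demo_task,
DEMO_TIF_32BIT and the two address-space sizes are hypothetical stand-ins
chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_COMPAT; 0 means "not built in". */
#ifndef DEMO_CONFIG_COMPAT
#define DEMO_CONFIG_COMPAT 0
#endif

/* Hypothetical per-task flag word; bit 0 marks a compat (32-bit) task. */
#define DEMO_TIF_32BIT 0x1UL

struct demo_task {
	unsigned long flags;
};

static inline bool demo_is_compat_task(const struct demo_task *t)
{
	assert(t != 0);
	/*
	 * Constant condition: with DEMO_CONFIG_COMPAT == 0 the compiler can
	 * fold this function to "return false" and drop the flag load.
	 */
	if (!DEMO_CONFIG_COMPAT)
		return false;
	return (t->flags & DEMO_TIF_32BIT) != 0;
}

/* The caller needs no #ifdef: the compat branch is dead code when disabled. */
static inline unsigned long long demo_task_size(const struct demo_task *t)
{
	return demo_is_compat_task(t) ? 0x80000000ULL : 0x4000000000ULL;
}
```

Building the same file with -DDEMO_CONFIG_COMPAT=1 turns the run-time flag
test back on without touching any caller.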
Leonardo Bras [Wed, 3 Jan 2024 16:00:19 +0000 (13:00 -0300)]
riscv: Improve arch_get_mmap_end() macro
This macro caused me some confusion, which took some reviewers' time to
clear up, so I propose adding a short comment in the code to avoid
confusion in the future.
Also, add some improvements to the macro, such as removing the
assumption of VA_USER_SV57 being the largest address space.
Palmer Dabbelt [Fri, 15 Mar 2024 17:17:34 +0000 (10:17 -0700)]
Merge patch series "riscv: mm: Extend mappable memory up to hint address"
Charlie Jenkins <charlie@rivosinc.com> says:
On riscv, mmap currently returns an address from the largest address
space that can fit entirely inside of the hint address. This makes it
such that the hint address is almost never returned. This patch raises
the mappable area up to and including the hint address. This allows mmap
to often return the hint address, which allows a performance improvement
over searching for a valid address as well as making the behavior more
similar to other architectures.
Note that a previous patch introduced stronger semantics compared to
other architectures for riscv mmap. On riscv, mmap will not use bits in
the upper bits of the virtual address depending on the hint address. On
other architectures, a random address is returned in the address space
requested. On all architectures the hint address will be returned if it
is available. This allows riscv applications to configure how many bits
in the virtual address should be left empty. This has two benefits:
applications can request address spaces that are smaller than the
default, and they do not need to know the page table layout of riscv.
* b4-shazam-merge:
docs: riscv: Define behavior of mmap
selftests: riscv: Generalize mm selftests
riscv: mm: Use hint address in mmap if available
Palmer Dabbelt [Wed, 13 Mar 2024 14:30:33 +0000 (07:30 -0700)]
Merge patch series "riscv: Use Kconfig to set unaligned access speed"
Charlie Jenkins <charlie@rivosinc.com> says:
If the hardware unaligned access speed is known at compile time, it is
possible to avoid running the unaligned access speed probe to speed up
boot time.
* b4-shazam-merge:
riscv: Set unaligned access speed at compile time
riscv: Decouple emulated unaligned accesses from access speed
riscv: Only check online cpus for emulated accesses
riscv: lib: Introduce has_fast_unaligned_access()
Palmer Dabbelt [Tue, 12 Mar 2024 14:13:21 +0000 (07:13 -0700)]
Merge patch series "Support Andes PMU extension"
Yu Chien Peter Lin <peterlin@andestech.com> says:
This patch series introduces the Andes PMU extension, which serves the
same purpose as Sscofpmf and Smcntrpmf. Its non-standard local interrupt
is assigned to bit 18 in the custom S-mode local interrupt enable and
pending registers (slie/slip), while the interrupt cause is (256 + 18).
* b4-shazam-merge:
riscv: andes: Support specifying symbolic firmware and hardware raw events
riscv: dts: renesas: Add Andes PMU extension for r9a07g043f
dt-bindings: riscv: Add Andes PMU extension description
perf: RISC-V: Introduce Andes PMU to support perf event sampling
perf: RISC-V: Eliminate redundant interrupt enable/disable operations
riscv: dts: renesas: r9a07g043f: Update compatible string to use Andes INTC
dt-bindings: riscv: Add Andes interrupt controller compatible string
riscv: errata: Rename defines for Andes
Palmer Dabbelt [Fri, 15 Mar 2024 16:27:20 +0000 (09:27 -0700)]
Merge patch "riscv: Fix compilation error with FAST_GUP and rv32"
I'm picking this up on top of the broken patch for the merge window, as
the offending patch breaks the rv32 build and was itself a fix so isn't
on for-next.
* b4-shazam-merge:
riscv: Fix compilation error with FAST_GUP and rv32
riscv: Fix pte_leaf_size() for NAPOT
Revert "riscv: mm: support Svnapot in huge vmap"
Charlie Jenkins [Wed, 31 Jan 2024 01:07:01 +0000 (17:07 -0800)]
selftests: riscv: Generalize mm selftests
The behavior of mmap on riscv is defined to not provide an address that
uses more bits than the hint address, if provided. Make the tests
reflect that.
Charlie Jenkins [Wed, 31 Jan 2024 01:07:00 +0000 (17:07 -0800)]
riscv: mm: Use hint address in mmap if available
On riscv it is guaranteed that the address returned by mmap is less than
the hint address. Allow mmap to return an address all the way up to
addr, if provided, rather than just up to the lower address space.
This provides a performance benefit as well, allowing mmap to exit after
checking that the address is in range rather than searching for a valid
address.
It is possible to provide an address that uses at most the same number
of bits, however it is significantly more computationally expensive to
provide that number rather than setting the max to be the hint address.
The clz/clzw instructions in Zbb, which count leading zeros and thereby
locate the highest set bit, could be used to implement this efficiently,
but it would still be slower than the current implementation. In the
worst case, half of the address space would not be able to be allocated
when a hint address is provided.
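The trade-off described above can be sketched in userspace C. This is a
simplified illustration, not the kernel macro: DEMO_DEFAULT_MMAP_END, the
clamping to that default and both function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default end of the user mapping area (47 usable bits). */
#define DEMO_DEFAULT_MMAP_END (UINT64_C(1) << 47)

/*
 * Upper bound for the mmap search: the default limit when no hint is
 * given, otherwise the end of the hinted range, so that the hint address
 * itself can be returned.
 */
static uint64_t demo_mmap_end(uint64_t hint, uint64_t len)
{
	uint64_t end = hint + len;

	if (hint == 0)
		return DEMO_DEFAULT_MMAP_END;
	/* Assumed guard: clamp on wraparound or when past the default. */
	if (end < hint || end > DEMO_DEFAULT_MMAP_END)
		return DEMO_DEFAULT_MMAP_END;
	return end;
}

/*
 * The costlier alternative from the text: allow every address that uses
 * at most as many bits as the hint.
 */
static uint64_t demo_same_bits_end(uint64_t hint)
{
	unsigned int bits = 0;

	while (bits < 64 && (hint >> bits) != 0)
		bits++;
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits);
}
```

For a hint of exactly 1 << 37, the first function stops just above the
hint while the second would allow addresses up to 1 << 38, which is the
"half of the address space" worst case mentioned above.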
Charlie Jenkins [Fri, 8 Mar 2024 18:25:58 +0000 (10:25 -0800)]
riscv: Set unaligned access speed at compile time
Introduce Kconfig options to set the kernel unaligned access support.
These options provide a non-portable alternative to the runtime
unaligned access probe.
To support this, the unaligned access probing code is moved into its
own file and gated behind a new RISCV_PROBE_UNALIGNED_ACCESS_SUPPORT
option.
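A minimal sketch of the idea, assuming two hypothetical build-time knobs
(DEMO_PROBE_UNALIGNED and DEMO_EFFICIENT_UNALIGNED) in place of the real
Kconfig symbols: when the answer is fixed at build time, the probe code is
not even compiled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical knobs: probe at run time (1) or trust the build (0). */
#ifndef DEMO_PROBE_UNALIGNED
#define DEMO_PROBE_UNALIGNED 0
#endif
#ifndef DEMO_EFFICIENT_UNALIGNED
#define DEMO_EFFICIENT_UNALIGNED 1
#endif

/* Counts how often the (placeholder) probe ran. */
int demo_probe_calls;

#if DEMO_PROBE_UNALIGNED
/* Placeholder for a real timing measurement; always reports "slow". */
static bool demo_run_probe(void)
{
	demo_probe_calls++;
	return false;
}
#endif

static bool demo_has_fast_unaligned_access(void)
{
#if DEMO_PROBE_UNALIGNED
	static int cached = -1;

	if (cached < 0)
		cached = demo_run_probe();
	return cached != 0;
#else
	/* Compile-time constant: no probe and no boot-time cost. */
	return DEMO_EFFICIENT_UNALIGNED != 0;
#endif
}
```

The price is portability: a binary built with a fixed answer is only
correct on hardware that matches it.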
Charlie Jenkins [Fri, 8 Mar 2024 18:25:57 +0000 (10:25 -0800)]
riscv: Decouple emulated unaligned accesses from access speed
Detecting if a system traps into the kernel on an unaligned access
can be performed separately from checking the speed of unaligned
accesses. This decoupling will make it possible to selectively enable
or disable each of these checks.
riscv: dts: renesas: Add Andes PMU extension for r9a07g043f
xandespmu stands for Andes Performance Monitor Unit extension.
Based on the added Andes PMU ISA string, the SBI PMU driver
will make use of the non-standard irq source.
perf: RISC-V: Introduce Andes PMU to support perf event sampling
Assign riscv_pmu_irq_num the value of (256 + 18) for the custom PMU
and add SSCOUNTOVF and SIP alternatives to ALT_SBI_PMU_OVERFLOW()
and ALT_SBI_PMU_OVF_CLEAR_PENDING() macros, respectively.
To make use of Andes PMU extension, "xandespmu" needs to be appended
to the riscv,isa-extensions for each cpu node in device-tree, and
make sure CONFIG_ANDES_CUSTOM_PMU is enabled.
Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com> Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com> Co-developed-by: Locus Wei-Han Chen <locus84@andestech.com> Signed-off-by: Locus Wei-Han Chen <locus84@andestech.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240222083946.3977135-8-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The interrupt enable/disable operations are already performed by the
IRQ chip functions riscv_intc_irq_unmask()/riscv_intc_irq_mask() during
enable_percpu_irq()/disable_percpu_irq(), so they only need to be done
once.
riscv: dts: renesas: r9a07g043f: Update compatible string to use Andes INTC
The Andes hart-level interrupt controller (Andes INTC) allows AX45MP
cores to handle custom local interrupts, such as the performance
counter overflow interrupt.
dt-bindings: riscv: Add Andes interrupt controller compatible string
Add "andestech,cpu-intc" compatible string to indicate that
Andes specific local interrupt is supported on the core,
e.g. AX45MP cores have 3 types of non-standard local interrupt
which can be handled in supervisor mode:
- Slave port ECC error interrupt
- Bus write transaction error interrupt
- Performance monitor overflow interrupt
These interrupts are enabled/disabled via a custom register
SLIE instead of the standard interrupt enable register SIE.
Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240222083946.3977135-5-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Tue, 12 Mar 2024 14:10:08 +0000 (07:10 -0700)]
Merge tag 'irq-for-riscv-02-23-24' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-next
INTC changes to consume for RISCV
* tag 'irq-for-riscv-02-23-24' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/riscv-intc: Introduce Andes hart-level interrupt controller
irqchip/riscv-intc: Allow large non-standard interrupt number
We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if
some part of a NAPOT mapping is unmapped, the remaining mapping is not
updated accordingly. For example:
0xffff8f8000ef0000-0xffff8f8000ef1000 0x00000001033c0000 4K PTE N .. .. D A G . . W R V
Meaning the first entry which was not unmapped still has the N bit set,
which, if accessed first and cached in the TLB, could allow access to the
unmapped range.
That's because the logic to break the NAPOT mapping does not exist and
likely won't. Indeed, to break a NAPOT mapping, we first have to clear
the whole mapping, flush the TLB and then set the new mapping ("break-
before-make" equivalent). That works fine in userspace since we can handle
any pagefault occurring on the remaining mapping but we can't handle a kernel
pagefault on such mapping.
So fix this by reverting the commit that introduced the vmap/vmalloc
support.
Eric Biggers [Sat, 27 Jan 2024 09:00:54 +0000 (01:00 -0800)]
RISC-V: fix check for zvkb with tip-of-tree clang
LLVM commit 8e01042da9d3 ("[RISCV] Add missing dependency check for Zvkb
(#79467)") broke the check used by the TOOLCHAIN_HAS_VECTOR_CRYPTO
kconfig symbol because it made zvkb start depending on v or zve*. Fix
this by specifying both v and zvkb when checking for support for zvkb.
irqchip/riscv-intc: Introduce Andes hart-level interrupt controller
Add support for the Andes hart-level interrupt controller. This
controller provides interrupt mask/unmask functions to access the
custom register (SLIE) where the non-standard S-mode local interrupt
enable bits are located. The base of custom interrupt number is set
to 256.
To share riscv_intc_domain_map() with the generic RISC-V INTC and
ACPI, add a chip parameter to riscv_intc_init_common(), so it can be
passed to irq_domain_set_info() as private data.
Andes hart-level interrupt controller requires the "andestech,cpu-intc"
compatible string to be present in interrupt-controller of cpu node to
enable the use of custom local interrupt source.
e.g.,
irqchip/riscv-intc: Allow large non-standard interrupt number
Currently, the implementation of the RISC-V INTC driver uses the
interrupt cause as the hardware interrupt number, with a maximum of
64 interrupts. However, the platform can expand the interrupt number
further for custom local interrupts.
To fully utilize the available local interrupt sources, switch
to using irq_domain_create_tree(), which creates the radix tree
map, and add global variables (riscv_intc_nr_irqs, riscv_intc_custom_base
and riscv_intc_custom_nr_irqs) to determine the valid range of local
interrupt numbers (hwirq).
Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Randolph <randolph@andestech.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240222083946.3977135-3-peterlin@andestech.com
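The range check implied by the three variables can be sketched as
follows. The variable names mirror the ones in the message with a demo_
prefix; the standard count (64) and the custom base (256) come from the
text, while the custom count of 64 and both helper functions are
assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int demo_nr_irqs = 64;         /* standard causes */
static unsigned int demo_custom_base = 256;    /* first custom hwirq */
static unsigned int demo_custom_nr_irqs = 64;  /* assumed custom count */

/* A hwirq is valid if it is a standard cause or lies in the custom window. */
static bool demo_hwirq_valid(unsigned int hwirq)
{
	if (hwirq < demo_nr_irqs)
		return true;
	return hwirq >= demo_custom_base &&
	       hwirq < demo_custom_base + demo_custom_nr_irqs;
}

/* Enable-bit mask inside the custom register for a custom hwirq. */
static unsigned long long demo_custom_mask(unsigned int hwirq)
{
	assert(hwirq >= demo_custom_base &&
	       hwirq < demo_custom_base + demo_custom_nr_irqs);
	return 1ULL << (hwirq - demo_custom_base);
}
```

With these values, the Andes PMU overflow interrupt (256 + 18) is valid
and maps to bit 18 of the custom enable register, while a hwirq such as
100 falls in the gap and is rejected.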
RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH
Commit e4bb020f3dbb ("riscv: detect assembler support for .option arch")
added two tests, one for a valid value to '.option arch' that should
succeed and one for an invalid value that is expected to fail to make
sure that support for '.option arch' is properly detected because Clang
does not error when '.option arch' is not supported:
Unfortunately, the invalid test started being accepted by Clang after
the linked llvm-project change, which causes CONFIG_AS_HAS_OPTION_ARCH
and configurations that depend on it to be silently disabled, even
though those versions do support '.option arch'.
The invalid test can be avoided altogether by using
'-Wa,--fatal-warnings', which will turn all assembler warnings into
errors, like '-Werror' does for the compiler:
The as-instr macros have been updated to make use of this flag, so
remove the invalid test, which allows CONFIG_AS_HAS_OPTION_ARCH to work
for all compiler versions.
Cc: stable@vger.kernel.org Fixes: e4bb020f3dbb ("riscv: detect assembler support for .option arch") Link: https://github.com/llvm/llvm-project/commit/3ac9fe69f70a2b3541266daedbaaa7dc9c007a2a Reported-by: Eric Biggers <ebiggers@kernel.org> Closes: https://lore.kernel.org/r/20240121011341.GA97368@sol.localdomain/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Eric Biggers <ebiggers@google.com> Tested-by: Andy Chiu <andybnac@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240125-fix-riscv-option-arch-llvm-18-v1-2-390ac9cc3cd0@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
kbuild: Add -Wa,--fatal-warnings to as-instr invocation
Certain assembler instruction tests may only induce warnings from the
assembler on an unsupported instruction or option, which causes as-instr
to succeed when it was expected to fail. Some tests work around this
limitation by additionally testing that invalid input fails as expected.
However, this is fragile if the assembler is changed to accept the
invalid input, as it will cause the instruction/option to be treated as
unavailable even when it is supported.
Use '-Wa,--fatal-warnings' in the as-instr macro to turn these warnings
into hard errors, which avoids this fragility and makes tests more
robust and well formed.
Cc: stable@vger.kernel.org Suggested-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Eric Biggers <ebiggers@google.com> Tested-by: Andy Chiu <andybnac@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240125-fix-riscv-option-arch-llvm-18-v1-1-390ac9cc3cd0@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Drew Fustini [Wed, 6 Dec 2023 08:09:21 +0000 (00:09 -0800)]
riscv: defconfig: Enable mmc and dma drivers for T-Head TH1520
Enable the mmc controller driver and dma controller driver needed for
T-Head TH1520 based boards, like the LicheePi 4A and BeagleV-Ahead, to
boot from eMMC storage.
Palmer Dabbelt [Thu, 15 Feb 2024 16:04:23 +0000 (08:04 -0800)]
Merge patch series "membarrier: riscv: Core serializing command"
RISC-V was lacking a membarrier implementation for the store/fetch
ordering, which is a bit tricky because of the deferred icache flushing
we use in RISC-V.
* b4-shazam-merge:
membarrier: riscv: Provide core serializing command
locking: Introduce prepare_sync_core_cmd()
membarrier: Create Documentation/scheduler/membarrier.rst
membarrier: riscv: Add full memory barrier in switch_mm()
Andrea Parri [Wed, 31 Jan 2024 14:49:36 +0000 (15:49 +0100)]
membarrier: riscv: Provide core serializing command
RISC-V uses xRET instructions on return from interrupt and to go back
to user-space; the xRET instruction is not core serializing.
Use FENCE.I for providing core serialization as follows:
- by calling sync_core_before_usermode() on return from interrupt (cf.
ipi_sync_core()),
- via switch_mm() and sync_core_before_usermode() (respectively, for
uthread->uthread and kthread->uthread transitions) before returning
to user-space.
On RISC-V, the serialization in switch_mm() is activated by resetting
the icache_stale_mask of the mm at prepare_sync_core_cmd().
To gather the architecture requirements of the "private/global
expedited" membarrier commands. The file will be expanded to
integrate further information about the membarrier syscall (as
needed/desired in the future). While at it, amend some related
inline comments in the membarrier codebase.
Andrea Parri [Wed, 31 Jan 2024 14:49:33 +0000 (15:49 +0100)]
membarrier: riscv: Add full memory barrier in switch_mm()
The membarrier system call requires a full memory barrier after storing
to rq->curr, before going back to user-space. The barrier is only
needed when switching between processes: the barrier is implied by
mmdrop() when switching from kernel to userspace, and it's not needed
when switching from userspace to kernel.
Rely on the feature/mechanism ARCH_HAS_MEMBARRIER_CALLBACKS and on the
primitive membarrier_arch_switch_mm(), already adopted by the PowerPC
architecture, to insert the required barrier.
Xiao Wang [Sun, 12 Nov 2023 09:44:21 +0000 (17:44 +0800)]
riscv: Avoid code duplication with generic bitops implementation
There's code duplication between the fallback implementation of the
bitops __ffs/__fls/ffs/fls API and the generic C implementation in
include/asm-generic/bitops/. To avoid this duplication, this patch renames
the generic C implementations by adding a "generic_" prefix to them, so
that these generic APIs can be used as the fallback.
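The pattern can be sketched in userspace C: a portable "generic" routine
that is always available, plus a wrapper that prefers a faster path when
one exists. The names demo_generic_ffs, demo_ffs and DEMO_FORCE_GENERIC
are hypothetical, and the compiler builtin merely stands in for a hardware
instruction; this is not the kernel's implementation.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Portable fallback: 0-based index of the least significant set bit,
 * found by binary search. The result is undefined for word == 0.
 */
static unsigned int demo_generic_ffs(unsigned long word)
{
	unsigned int num = 0;
	unsigned int shift;

	assert(word != 0);
	for (shift = DEMO_BITS_PER_LONG / 2; shift != 0; shift /= 2) {
		unsigned long low_mask = (1UL << shift) - 1;

		/* Low half empty: skip it and keep searching the high half. */
		if ((word & low_mask) == 0) {
			num += shift;
			word >>= shift;
		}
	}
	return num;
}

/* Wrapper: use the fast path when available, else the shared fallback. */
static unsigned int demo_ffs(unsigned long word)
{
#if defined(__GNUC__) && !defined(DEMO_FORCE_GENERIC)
	return (unsigned int)__builtin_ctzl(word);
#else
	return demo_generic_ffs(word);
#endif
}
```

Because the fallback has its own distinct name, the wrapper can call it
directly instead of carrying a second copy of the same loop.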
Add support for kernel stack offset randomization while handling
syscalls; by default the offset is limited by KSTACK_OFFSET_MAX()
(i.e. 10 bits).
In order to avoid triggering stack canaries (due to __builtin_alloca) and
slowing down the entry path, use the __no_stack_protector attribute to
disable the stack protector for do_trap_ecall_u() at the function level.
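As a rough illustration of what a 10-bit limit means, the sketch below
masks an entropy value down to 10 bits and applies it to a stack pointer
value. The macro and function names are hypothetical, and the 16-byte
alignment step is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 10 bits: offsets range from 0 to 1023 bytes. */
#define DEMO_KSTACK_OFFSET_MAX(x) ((x) & 0x3FFu)

static uint32_t demo_stack_offset(uint32_t entropy)
{
	return DEMO_KSTACK_OFFSET_MAX(entropy);
}

/* Move the stack pointer down by the offset, then align it (assumed 16). */
static uint64_t demo_randomized_sp(uint64_t sp, uint32_t entropy)
{
	uint64_t moved = sp - demo_stack_offset(entropy);

	return moved & ~UINT64_C(0xF);
}
```

Whatever the entropy source produces, the stack can shift by at most 1023
bytes per syscall under this limit.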
Adding kprobes on some assembly functions (mainly exception handling)
will result in crashes (either recursive trap or panic). To avoid such
errors, add an ASM_NOKPROBE() macro which allows adding specific symbols
to the __kprobe_blacklist section, and use it to blacklist the following
symbols that proved to be problematic:
- handle_exception()
- ret_from_exception()
- handle_kernel_stack_overflow()
Palmer Dabbelt [Wed, 24 Jan 2024 23:57:00 +0000 (15:57 -0800)]
Merge patch series "riscv: support fast gup"
Jisheng Zhang <jszhang@kernel.org> says:
This series adds fast gup support to riscv.
The first patch fixes a bug in __p*d_free_tlb(). Per the riscv
privileged spec, if non-leaf PTEs, i.e. pmd, pud or p4d, are modified,
an sfence.vma is required.
The 2nd patch is a preparation patch.
The last two patches do the real work:
In order to implement fast gup we need to ensure that the page
table walker is protected from page table pages being freed from
under it.
The riscv situation is more complicated than on other architectures:
some riscv platforms (for example, those which support AIA) may use IPIs
to perform TLB shootdown, and riscv_ipi_for_rfence is usually true on
these platforms; other riscv platforms may rely on the SBI to perform
TLB shootdown, and riscv_ipi_for_rfence is usually false on these
platforms. To keep software pagetable walkers safe in this case
we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the
comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in
include/asm-generic/tlb.h for more details.
This patch enables MMU_GATHER_RCU_TABLE_FREE, then uses:
- tlb_remove_page_ptdesc() for those platforms which use IPI to perform
  TLB shootdown;
- tlb_remove_ptdesc() for those platforms which use SBI to perform TLB
  shootdown.
Both cases mean that disabling interrupts will block the free and
protect the fast gup page walker.
So after the 3rd patch, everything is well prepared, let's select
HAVE_FAST_GUP if MMU.
* b4-shazam-merge:
riscv: enable HAVE_FAST_GUP if MMU
riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU
riscv: tlb: convert __p*d_free_tlb() to inline functions
riscv: tlb: fix __p*d_free_tlb()
Jisheng Zhang [Tue, 19 Dec 2023 17:50:45 +0000 (01:50 +0800)]
riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU
In order to implement fast gup we need to ensure that the page
table walker is protected from page table pages being freed from
under it.
The riscv situation is more complicated than on other architectures:
some riscv platforms (for example, those which support AIA) may use IPIs
to perform TLB shootdown, and riscv_ipi_for_rfence is usually true on
these platforms; other riscv platforms may rely on the SBI to perform
TLB shootdown, and riscv_ipi_for_rfence is usually false on these
platforms. To keep software pagetable walkers safe in this case
we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the
comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in
include/asm-generic/tlb.h for more details.
This patch enables MMU_GATHER_RCU_TABLE_FREE, then uses:
- tlb_remove_page_ptdesc() for those platforms which use IPI to perform
  TLB shootdown;
- tlb_remove_ptdesc() for those platforms which use SBI to perform TLB
  shootdown.
Both cases mean that disabling interrupts will block the free and
protect the fast gup page walker.
Jisheng Zhang [Tue, 19 Dec 2023 17:50:43 +0000 (01:50 +0800)]
riscv: tlb: fix __p*d_free_tlb()
If non-leaf PTEs, i.e. pmd, pud or p4d, are modified, an sfence.vma is
required for safety: an implementation may cache non-leaf translations
in the TLB. I haven't encountered such hardware so far, but it is
possible in theory.
Palmer Dabbelt [Wed, 24 Jan 2024 15:07:45 +0000 (07:07 -0800)]
Merge patch series "riscv: Increase mmap_rnd_bits_max on Sv48/57"
Sami Tolvanen <samitolvanen@google.com> says:
We noticed that 64-bit RISC-V kernels limit mmap_rnd_bits to 24
even if the hardware supports a larger virtual address space size
[1]. These two patches allow mmap_rnd_bits_max to be changed during
init, and bump up the maximum randomness if we end up setting up
4/5-level paging at boot.
Sami Tolvanen [Fri, 29 Sep 2023 21:11:58 +0000 (21:11 +0000)]
riscv: mm: Update mmap_rnd_bits_max
ARCH_MMAP_RND_BITS_MAX is based on Sv39, which leaves a few
potential bits of mmap randomness on the table if we end up enabling
4/5-level paging. Update mmap_rnd_bits_max to take the final address
space size into account. This increases mmap_rnd_bits_max from 24 to
33 with Sv48/57.
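The two figures quoted above (24 and 33) are consistent with a simple
"address bits minus page shift minus 3" calculation, capped at the Sv48
size. The sketch below reproduces the numbers under that assumption; the
formula, the cap and the names are inferred from the quoted values, not
taken from the patch.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12  /* 4 KiB pages */

/* Assumed formula: usable VA bits (capped at 48) - page shift - 3. */
static int demo_mmap_rnd_bits_max(int va_bits)
{
	int mmap_va_bits = va_bits > 48 ? 48 : va_bits;

	assert(va_bits >= 39);
	return mmap_va_bits - DEMO_PAGE_SHIFT - 3;
}
```

Under this assumption Sv39 gives 39 - 12 - 3 = 24 bits, and both Sv48 and
Sv57 give 48 - 12 - 3 = 33 bits.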
Sami Tolvanen [Fri, 29 Sep 2023 21:11:57 +0000 (21:11 +0000)]
mm: Change mmap_rnd_bits_max to __ro_after_init
Allow mmap_rnd_bits_max to be updated on architectures that
determine virtual address space size at runtime instead of relying
on Kconfig options by changing it from const to __ro_after_init.
Conor Dooley [Wed, 2 Aug 2023 11:12:53 +0000 (12:12 +0100)]
Revert "RISC-V: mark hibernation as nonportable"
Revert commit ed309ce52218 ("RISC-V: mark hibernation as nonportable")
as it appears the broken versions of OpenSBI have not made it to
production on any systems that support hibernation.
Vincent Chen [Tue, 5 Sep 2023 07:09:45 +0000 (15:09 +0800)]
clocksource: extend the max_delta_ns of timer-riscv and timer-clint to ULONG_MAX
When registering the riscv-timer or clint-timer as a clock_event device,
the driver needs to specify the value of max_delta_ticks. This value
directly influences the max_delta_ns, which represents the maximum time
interval for configuring subsequent clock events. Currently, both
riscv-timer and clint-timer are set with a max_delta_ticks value of
0x7fff_ffff. When the timer operates at a high frequency, this value
limits the system to sleeping only for a short time. For the 1GHz case,
the sleep cannot exceed about two seconds. To address this limitation,
follow other timer implementations and extend it to
2^(bit-width of the timer) - 1. Because $mtimecmp is 64 bits wide, this
value becomes ULONG_MAX (0xffff_ffff_ffff_ffff).
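The arithmetic behind the two-second limit can be checked with a short
sketch. The function name, the saturation at UINT64_MAX and the 10 GHz
bound on the frequency are assumptions for the example; the kernel's own
conversion code is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC UINT64_C(1000000000)

/* Convert a tick budget at freq_hz into nanoseconds, saturating. */
static uint64_t demo_max_delta_ns(uint64_t max_delta_ticks, uint64_t freq_hz)
{
	uint64_t secs, rem_ns;

	/* Conservative bound so the remainder product fits in 64 bits. */
	assert(freq_hz != 0 && freq_hz <= UINT64_C(10000000000));

	secs = max_delta_ticks / freq_hz;
	rem_ns = (max_delta_ticks % freq_hz) * DEMO_NSEC_PER_SEC / freq_hz;

	/* Saturate instead of wrapping when the result exceeds 64 bits. */
	if (secs > (UINT64_MAX - rem_ns) / DEMO_NSEC_PER_SEC)
		return UINT64_MAX;
	return secs * DEMO_NSEC_PER_SEC + rem_ns;
}
```

With 0x7fff_ffff ticks at 1GHz this yields 2147483647 ns, i.e. roughly
2.1 seconds, while a 64-bit tick budget is no longer the limiting factor
at any realistic frequency.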
Palmer Dabbelt [Tue, 23 Jan 2024 01:55:40 +0000 (17:55 -0800)]
Merge patch series "RISC-V crypto with reworked asm files"
Eric Biggers <ebiggers@kernel.org> says:
This patchset, which applies to v6.8-rc1, adds cryptographic algorithm
implementations accelerated using the RISC-V vector crypto extensions
(https://github.com/riscv/riscv-crypto/releases/download/v1.0.0/riscv-crypto-spec-vector.pdf)
and RISC-V vector extension
(https://github.com/riscv/riscv-v-spec/releases/download/v1.0/riscv-v-spec-1.0.pdf).
The following algorithms are included: AES in ECB, CBC, CTR, and XTS modes;
ChaCha20; GHASH; SHA-2; SM3; and SM4.
In general, the assembly code requires a 64-bit RISC-V CPU with VLEN >= 128,
little endian byte order, and vector unaligned access support. The ECB, CTR,
XTS, and ChaCha20 code is designed to naturally scale up to larger VLEN values.
Building the assembly code requires tip-of-tree binutils (future 2.42) or
tip-of-tree clang (future 18.x). All algorithms pass testing in QEMU, using
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y. Much of the assembly code is derived from
OpenSSL code that was added by https://github.com/openssl/openssl/pull/21923.
It's been cleaned up for integration with the kernel, e.g. reducing code
duplication, eliminating use of .inst and perlasm, and fixing a few bugs.
This patchset incorporates the work of multiple people, including Jerry Shih,
Heiko Stuebner, Christoph Müllner, Phoebe Chen, Charalampos Mitrodimas, and
myself. This patchset went through several versions from Heiko (last version
https://lore.kernel.org/linux-crypto/20230711153743.1970625-1-heiko@sntech.de),
then several versions from Jerry (last version:
https://lore.kernel.org/linux-crypto/20231231152743.6304-1-jerry.shih@sifive.com),
then finally several versions from me. Thanks to everyone who has contributed
to this patchset or its prerequisites.
Jerry Shih [Mon, 22 Jan 2024 00:19:21 +0000 (16:19 -0800)]
crypto: riscv - add vector crypto accelerated SM4
Add an implementation of SM4 using the Zvksed extension. The assembly
code is derived from OpenSSL code (openssl/openssl#21923) that was
dual-licensed so that it could be reused in the kernel. Nevertheless,
the assembly has been significantly reworked for integration with the
kernel, for example by using a regular .S file instead of the so-called
perlasm, using the assembler instead of bare '.inst', and greatly
reducing code duplication.
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-11-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Jerry Shih [Mon, 22 Jan 2024 00:19:20 +0000 (16:19 -0800)]
crypto: riscv - add vector crypto accelerated SM3
Add an implementation of SM3 using the Zvksh extension. The assembly
code is derived from OpenSSL code (openssl/openssl#21923) that was
dual-licensed so that it could be reused in the kernel. Nevertheless,
the assembly has been significantly reworked for integration with the
kernel, for example by using a regular .S file instead of the so-called
perlasm, using the assembler instead of bare '.inst', and greatly
reducing code duplication.
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-10-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add an implementation of SHA-512 and SHA-384 using the Zvknhb extension.
The assembly code is derived from OpenSSL code (openssl/openssl#21923)
that was dual-licensed so that it could be reused in the kernel.
Nevertheless, the assembly has been significantly reworked for
integration with the kernel, for example by using a regular .S file
instead of the so-called perlasm, using the assembler instead of bare
'.inst', and greatly reducing code duplication.
Add an implementation of SHA-256 and SHA-224 using the Zvknha or Zvknhb
extension. The assembly code is derived from OpenSSL code
(openssl/openssl#21923) that was dual-licensed so that it could be
reused in the kernel. Nevertheless, the assembly has been significantly
reworked for integration with the kernel, for example by using a regular
.S file instead of the so-called perlasm, using the assembler instead of
bare '.inst', and greatly reducing code duplication.
Add an implementation of GHASH using the zvkg extension. The assembly
code is derived from OpenSSL code (openssl/openssl#21923) that was
dual-licensed so that it could be reused in the kernel. Nevertheless,
the assembly has been significantly reworked for integration with the
kernel, for example by using a regular .S file instead of the so-called
perlasm, using the assembler instead of bare '.inst', reducing code
duplication, and eliminating unnecessary endianness conversions.
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-7-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add an implementation of ChaCha20 using the Zvkb extension. The
assembly code is derived from OpenSSL code (openssl/openssl#21923) that
was dual-licensed so that it could be reused in the kernel.
Nevertheless, the assembly has been significantly reworked for
integration with the kernel, for example by using a regular .S file
instead of the so-called perlasm, using the assembler instead of bare
'.inst', and reducing code duplication.
Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-6-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add implementations of AES-ECB, AES-CBC, AES-CTR, and AES-XTS, as well
as bare (single-block) AES, using the RISC-V vector crypto extensions.
The assembly code is derived from OpenSSL code (openssl/openssl#21923)
that was dual-licensed so that it could be reused in the kernel.
Nevertheless, the assembly has been significantly reworked for
integration with the kernel, for example by using regular .S files
instead of the so-called perlasm, using the assembler instead of bare
'.inst', greatly reducing code duplication, supporting AES-192, and
making the code use the same AES key structure as the C code.
Co-developed-by: Phoebe Chen <phoebe.chen@sifive.com> Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-5-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Heiko Stuebner [Mon, 22 Jan 2024 00:19:12 +0000 (16:19 -0800)]
RISC-V: add helper function to read the vector VLEN
VLEN describes the length of each vector register, and some instructions
need a specific minimum VLEN to work correctly.
The vector code already includes a variable riscv_v_vsize that contains
the value of "32 vector registers with vlenb length" that gets filled
during boot. vlenb is the value contained in the CSR_VLENB register and
the value represents "VLEN / 8".
So add riscv_vector_vlen() to return the actual VLEN value for in-kernel
users when they need to check the available VLEN.
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-2-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
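The relationship between riscv_v_vsize, vlenb, and VLEN described in the
commit above can be sketched in plain C. This is only an illustration of
the arithmetic, not the kernel's code: the example_* names are hypothetical
stand-ins, and the sketch assumes the size passed in is exactly 32 times
vlenb, as the commit message states for riscv_v_vsize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for riscv_v_vsize: 32 vector registers of vlenb bytes each. */
static size_t example_v_vsize_from_vlenb(size_t vlenb)
{
	return 32 * vlenb;
}

/*
 * Hypothetical helper mirroring the idea behind riscv_vector_vlen():
 * dividing by 32 recovers vlenb (bytes per register), and multiplying
 * by 8 converts bytes to bits, giving VLEN.
 */
static size_t example_vector_vlen(size_t v_vsize)
{
	/* Assumption: the size is a whole number of 32-register groups. */
	assert(v_vsize % 32 == 0);
	return v_vsize / 32 * 8;
}
```

For instance, a vlenb of 16 bytes gives a total size of 512 bytes, which
the helper maps back to a VLEN of 128 bits.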
Wende Tan [Tue, 17 Oct 2023 22:21:04 +0000 (15:21 -0700)]
RISC-V: build: Allow LTO to be selected
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD from generating
proper machine code for RISC-V when writing `nop`s.
To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.
Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.
[1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] https://github.com/ClangBuiltLinux/linux/issues/1942
[3] https://github.com/llvm/llvm-project/issues/59350
[4] https://github.com/llvm/llvm-project/issues/65090
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs updates from Kent Overstreet:
"Some fixes, some refactoring, some minor features:
- Assorted prep work for disk space accounting rewrite
- BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
makes our trigger context more explicit
- A few fixes to avoid excessive transaction restarts on
multithreaded workloads: fstests (in addition to ktest tests) are
now checking slowpath counters, and that's shaking out a few bugs
- Assorted tracepoint improvements
- Starting to break up bcachefs_format.h and move on disk types so
they're with the code they belong to; this will make room to start
documenting the on disk format better.
- A few minor fixes"
* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
bcachefs: Improve inode_to_text()
bcachefs: logged_ops_format.h
bcachefs: reflink_format.h
bcachefs; extents_format.h
bcachefs: ec_format.h
bcachefs: subvolume_format.h
bcachefs: snapshot_format.h
bcachefs: alloc_background_format.h
bcachefs: xattr_format.h
bcachefs: dirent_format.h
bcachefs: inode_format.h
bcachefs; quota_format.h
bcachefs: sb-counters_format.h
bcachefs: counters.c -> sb-counters.c
bcachefs: comment bch_subvolume
bcachefs: bch_snapshot::btime
bcachefs: add missing __GFP_NOWARN
bcachefs: opts->compression can now also be applied in the background
bcachefs: Prep work for variable size btree node buffers
bcachefs: grab s_umount only if snapshotting
...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for time and clocksources:
- A fix for the idle and iowait time accounting vs CPU hotplug.
The time is reset on CPU hotplug which makes the accumulated
systemwide time jump backwards.
- Assorted fixes and improvements for clocksource/event drivers"
* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
clocksource/drivers/ep93xx: Fix error handling during probe
clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
clocksource/timer-riscv: Add riscv_clock_shutdown callback
dt-bindings: timer: Add StarFive JH8100 clint
dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
bcachefs: opts->compression can now also be applied in the background
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
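The fallback described in the commit above can be sketched as a small
selection function. This is a minimal illustration, not the bcachefs
implementation: the enum, the struct, and the function name are all
hypothetical, and the sketch assumes that an unset option is represented
by a zero value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option values; EX_COMPRESSION_NONE (0) is assumed to mean "not set". */
enum example_compression {
	EX_COMPRESSION_NONE = 0,
	EX_COMPRESSION_LZ4,
	EX_COMPRESSION_ZSTD,
};

/* Hypothetical stand-in for the relevant part of the I/O options. */
struct example_io_opts {
	enum example_compression compression;
	enum example_compression background_compression;
};

/*
 * Pick the method for the background path: use background_compression
 * if it is set, otherwise fall back to the compression option.
 */
static enum example_compression
example_background_method(const struct example_io_opts *opts)
{
	assert(opts != NULL);
	return opts->background_compression != EX_COMPRESSION_NONE
		? opts->background_compression
		: opts->compression;
}
```

With only compression set, the background path picks up that method, which
is why changing the compression option causes existing data to be
recompressed in the background.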
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
bcachefs: Prep work for variable size btree node buffers
bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.
And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analogous to XFS's
AGIs.
Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
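The memory overhead mentioned in the commit above can be estimated with
simple arithmetic. The sketch below assumes one pinned root buffer per
btree and takes "256k" to mean 256 KiB; the function name is hypothetical
and is not part of bcachefs.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical estimate of memory pinned by btree roots:
 * one buffer of node_buf_bytes for each of nr_btrees btrees.
 */
static unsigned long long
example_pinned_root_bytes(unsigned long long nr_btrees,
			  unsigned long long node_buf_bytes)
{
	/* Guard against overflow in the multiplication. */
	assert(nr_btrees == 0 || node_buf_bytes <= ULLONG_MAX / nr_btrees);
	return nr_btrees * node_buf_bytes;
}
```

Under these assumptions, 18 btrees with 256 KiB roots pin 4718592 bytes,
or 4.5 MiB, even when the roots are mostly empty.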
In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment,
"why do we need this lock?", about the lock; it comes from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion").
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb(), which requires s_umount to be held while it writes
back all dirty nodes before the snapshot work is done.
Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().
Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
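The locking change described in the commit above can be sketched with a
toy model. Nothing here is kernel code: the example_rwsem type only counts
readers, and every example_* function is a hypothetical stand-in (for
down_read(), up_read(), sync_inodes_sb(), and the ioctl handler) used to
show the control flow under the assumption that only the snapshot path
needs the sync.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a read/write semaphore: it only counts readers. */
struct example_rwsem {
	int readers;          /* read locks currently held */
	int total_read_locks; /* read locks ever taken */
};

static void example_down_read(struct example_rwsem *sem)
{
	sem->readers++;
	sem->total_read_locks++;
}

static void example_up_read(struct example_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Stand-in for sync_inodes_sb(): insists that the lock is held. */
static void example_sync_inodes(const struct example_rwsem *s_umount)
{
	assert(s_umount->readers > 0);
}

/* Take s_umount for reading only when snapshotting, and drop it right after the sync. */
static void example_subvolume_create(struct example_rwsem *s_umount, bool snapshot)
{
	if (snapshot) {
		example_down_read(s_umount);
		example_sync_inodes(s_umount);
		example_up_read(s_umount);
	}
	/* ... the subvolume or snapshot would be created here, without s_umount held ... */
}
```

In this model, creating a plain subvolume never touches the lock, while
creating a snapshot holds it only for the duration of the sync.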
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
bcachefs: remove redundant variable tmp
The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.
Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>