Use REFCOUNT_COUNT() to obtain refcount where appropriate.
Refcount waiting will set some flag bits in the refcount value.
Make sure these bits get cleared by using the REFCOUNT_COUNT()
macro to obtain the actual refcount.
Get the readline header from the installed header instead of the from the source
location.
With newer import of libedit, the path to be able to access readline/readline.h
will also include header which name will conflict with some expected by ntp in
another path and end up breaking the build.
SIOCSIFNAME: Do nothing if we're not actually changing
Instead of throwing EEXIST, just succeed if the name isn't actually
changing. We don't need to trigger departure or any of that because there's
no change from consumers' perspective.
As I like to forget: static kenv var formatting is actually such that an
empty environment would be double null bytes. We should make sure that a
non-zero buffer has at least enough for this, though most of the current
usage is with a 4k buffer.
kenv: assert that an empty static buffer passed in is "empty"
Garbage in the passed-in buffer can cause problems if any attempts to read
the kenv are inadvertently made between init_static_kenv and the first
kern_setenv -- assuming there is one.
This is cheap and easy, so do it. This also helps rule out some class of
bugs as one tries to debug; tunables fetch from the static environment up
until SI_SUB_KMEM + 1, and many of these buffers are global ~4k buffers that
rely on BSS clearing while others just grab a page of free memory and use it
(e.g. xen).
ig4(4): Fix SDA HOLD time set too low on Skylake controllers
Execution of "Soft reset" command (IG4_REG_RESETS_SKL) at controller init
stage sets SDA_HOLD register value to 0x0001 which is often too low for
normal operation.
Set SDA_HOLD back to 28 after reset to restore controller functionality.
PR: 240339
Reported by: imp, GregV, et al.
MFC after: 3 days
cem [Wed, 11 Sep 2019 21:24:14 +0000 (21:24 +0000)]
buf: Add B_INVALONERR flag to discard data
Setting the B_INVALONERR flag before a synchronous write causes the buf
cache to forcibly invalidate contents if the write fails (BIO_ERROR).
This is intended to be used to allow layers above the buffer cache to make
more informed decisions about when discarding dirty buffers without
successful write is acceptable.
As a proof of concept, use in msdosfs to handle failures to mark the on-disk
'dirty' bit during rw mount or ro->rw update.
Extending this to other filesystems is left as future work.
getsockopt.2: clarify that SO_TIMESTAMP is not 100% reliable
When SO_TIMESTAMP is set, the kernel will attempt to attach a timestamp as
ancillary data to each IP datagram that is received on the socket. However,
it may fail, for example due to insufficient memory. In that case the
packet will still be received but not timestamp will be attached.
Reviewed by: kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21607
fusefs: Fix iosize for FUSE_WRITE in 7.8 compat mode
When communicating with a FUSE server that implements version 7.8 (or older)
of the FUSE protocol, the FUSE_WRITE request structure is 16 bytes shorter
than normal. The protocol version check wasn't applied universally, leading
to an extra 16 bytes being sent to such servers. The extra bytes were
allocated and bzero()d, so there was no information disclosure.
Reviewed by: emaste
MFC after: 3 days
MFC-With: r350665
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21557
ping: Verify whether a datagram timestamp was actually received.
ping(8) uses SO_TIMESTAMP, which attaches a timestamp to each IP datagram at
the time it's received by the kernel. Except that occasionally it doesn't.
Add a check to see whether such a timestamp was actually set before trying
to read it. This fixes segfaults that can happen when the kernel doesn't
attach a timestamp.
The bug has always existed, but prior to r351461 it manifested as an
implausible round-trip-time, not a segfault.
Reported by: pho
MFC after: 3 days
MFC-With: 351461
Avoid unneeded call to arc4random() in syncache_add()
Don't call arc4random() unconditionally to initialize sc_iss, and
then when syncookies are enabled, just overwrite it with the
return value from from syncookie_generate(). Instead, only call
arc4random() to initialize sc_iss when syncookies are not
enabled.
Note that on a system under a syn flood attack, arc4random()
becomes quite expensive, and the chacha_poly crypto that it calls
is one of the more expensive things happening on the
system. Removing this unneeded arc4random() call reduces CPU from
about 40% to about 35% in my test scenario (Broadwell Xeon, 6Mpps
syn flood attack).
Avoid the use of the non-portable -D argument to ls.
This was used to store the mtime of the source file in a commment in a
generated header file. This is of little-to-no diagnostic value and
the result doesn't even end up in the source tree.
riscv: Small fix to CPU compatibility identification
fdt_is_compatible_strict() inspects the first compatible property.
We need to inspect the following properties for 'riscv'.
ofw_bus_node_is_compatible() does a recursive search.
This patch fixes "Can't find CPU" error message when bootverbose = true.
With the recent commit of ktls, we no longer have a
sb_tls_flags, its just the sb_flags. Also the ratelimit
code, now that the defintion is in sockbuf.h, does not
need the ktls.h file (or its predecessor).
- make abday, day, abmon, mon, am_pm output quoting match linux
- workaround localeconv() issue for mon_grouping and grouping (PR172215)
- for other values not available in default locale, output -1 instead of
127 (CHAR_MAX) as returned by localeconv()
With these changes, output of `locale` and `locale -k` for all keywords
specified by POSIX exactly matches the linux one.
fw_stub.awk: use @generated tag in generated files
Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in enerated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.
Assume all the short args have optional args so allocate space for the
':'. It's slightly wasteful, but much easier (and the savings in bytes
at runtime would be tiny, but the code to do it larger).
This command simply returns 0 at the moment and explicitly takes no
arguments. This should be used by utilities wanting to see if bectl can
operate on the system they're running, or with a specific root (`bectl -r`).
It may grow more checks than "will libbe successfully init" in the future,
but for now this is enough as that checks for the dataset mounted at "/" and
that it looks capable of being a BE root (e.g. it's not a top-level dataset)
bectl commands can now specify if they want to be silent, and this will turn
off libbe_print_on_error so they can control the output as needed. This is
already used in `bectl check`, and may be turned on in the future for some
other commands where libbe errors are better suppressed as the failure mode
may be obvious.
Readd _el_fn_sh_complete for backward compatibility
This function is not needed anymore, it allows old sh binary to continue
to run and avoid breaking backward compatibility.
Note that is now just calls the regular _el_fn_complete which does a proper
job at quoting.
ian [Tue, 10 Sep 2019 22:08:34 +0000 (22:08 +0000)]
In am335x_dmtpps, use a spin mutex to interlock between PPS capture and PPS
ioctl(2) handling. This allows doing the pps_event() work in the polling
routine, instead of using a taskqueue task to do that work.
Also, add PNPINFO, and switch to using make_dev_s() to create the cdev.
Using a spin mutex and calling pps_event() from the polling function works
around the situation which requires more than 2 sets of timecounter
timehands in a single-core system to get reliable PPS capture. That problem
would happen when a single-core system is idle in cpu_idle() then gets woken
up with an event timer event which was scheduled to handle a hardclock tick.
That processing path would end up calling tc_windup 3 or 4 times between
when the tc polling function was called and when the taskqueue task would
eventually run, and with only two sets of timehands, the th_generation count
would always be too old to allow the captured PPS data to be used.
lualoader: Revert to ASCII menu frame for serial console
The box drawing characters we use aren't necessarily safe with a serial
console; for instance, in the report by npn@, these were causing his xterm
to send back a sequence that lua picked up as input and halted the boot.
This is less than ideal.
Fallback to ASCII frames for console with 'comconsole' in it. This is a
partial revert r338108 by imp@ -- instead of removing the menu entirely and
disabling color/cursor sequences, just reverting the default frame to ASCII
is enough to not break in this setup.
Reported by: npn
Triaged and recommended by: tsoome
cache: avoid excessive relocking on entry removal during lookup
Due to lock ordering issues (bucket lock held, vnode locks wanted) the code
starts with trylocking which in face of contention often fails. Prior to
the change it would loop back with a possible yield.
Instead note we know what locks are needed and can take them in the right
order, avoiding retries. Then we can safely re-lookup and see if the entry
we are looking for is still there.
On a 104-way box poudriere would result in constant retries during an 11h
run as seen in the vfs.cache.zap_and_exit_bucket_fail counter.
cache: change the formula for calculating lock array sizes
It used to be mp_ncpus * 64, but this gives unnecessarily big values for small
machines and at the same time constraints bigger ones. In particular this helps
on a 104-way box for which the count is now doubled.
While here make cache_purgevfs less likely. Currently it is not efficient in
face of contention due to lock ordering issues. These are fixable but not worth
it at the moment.
jeff [Tue, 10 Sep 2019 19:08:01 +0000 (19:08 +0000)]
Replace redundant code with a few new vm_page_grab facilities:
- VM_ALLOC_NOCREAT will grab without creating a page.
- vm_page_grab_valid() will grab and page in if necessary.
- vm_page_busy_acquire() automates some busy acquire loops.
jeff [Tue, 10 Sep 2019 18:27:45 +0000 (18:27 +0000)]
Use the sleepq lock rather than the page lock to protect against wakeup
races with page busy state. The object lock is still used as an interlock
to ensure that the identity stays valid. Most callers should use
vm_page_sleep_if_busy() to handle the locking particulars.
While here, mark all non-standard ones as FreeBSD-only as
other systems (at least, GNU/Linux and illumos) do not handle
them, so we should not encourage their use.
Compared to current version in base:
- great improvements on the Unicode support
- full support for filename completion including quoting
which means we do not need anymore our custom addition)
- Improved readline compatiblity
Upgrading libedit has been a pain in the past, because somehow we never
managed to properly cleanup the tree in lib/libedit and each merge has always
been very painful. After years of fighting give up and refresh a merge from
scrarch properly in contrib.
Note that the switch to this version will be done in another commit.
Remove mklocale(1) and colldef(1) which are deprecated since FreeBSD 11
In FreeBSD 11 along with the rework on the collation, mklocale(1) and colldef(1)
has been replaced by localedef(1) (a note has been added to the manpage to state
it).
mklocale(1) and colldef(1) has been kept around to be able to build older
versions of FreeBSD. None of the version requiring those tools are supported
anymore so it is time to remove them from base