Julien Grall [Mon, 5 Oct 2015 17:35:54 +0000 (18:35 +0100)]
xen/netfront: Add 2 bytes padding in the rx mbuf
The Ethernet header size is not word aligned, so the IP packet and everything
after it won't be aligned either. On some architectures (such as ARM) unaligned
access may be slower and/or not supported, so we might receive an alignment
fault. To avoid this, we need to pull up the data by ETHER_ALIGN bytes.
I'm not sure how this patch will impact x86; we need to do some benchmarking
with and without it.
Furthermore, I don't know if m_copyup is the right function to use here. Can
any network stack expert tell me if there is a better solution?
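The alignment arithmetic behind the 2-byte padding can be sketched in plain C. This is only an illustration, not the netfront code: the constants are stand-ins that assume the usual values of ETHER_HDR_LEN (14) and ETHER_ALIGN (2), and both helper functions are hypothetical.

```c
#include <stddef.h>

/* Stand-ins for the kernel constants (assumed values: 14 and 2). */
#define SKETCH_ETHER_HDR_LEN	14
#define SKETCH_ETHER_ALIGN	2

/*
 * Offset of the IP header from the start of a word-aligned buffer,
 * given how many padding bytes precede the Ethernet header.
 */
static size_t
sketch_ip_header_offset(size_t pad)
{
	return (pad + SKETCH_ETHER_HDR_LEN);
}

/* Non-zero when the offset falls on a 4-byte boundary. */
static int
sketch_is_word_aligned(size_t off)
{
	return ((off % 4) == 0);
}
```

Without padding the IP header starts at offset 14, which is not a multiple of 4; with 2 bytes of padding it starts at offset 16.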
Julien Grall [Mon, 5 Oct 2015 17:35:43 +0000 (18:35 +0100)]
xen/blkfront: WRITE_BARRIER and FLUSH_DISKCACHE require barrier
For WRITE_BARRIER and FLUSH_DISKCACHE operations, we don't request any cache
operation. This results in a panic in _bus_dmamap_sync on ARM because the
operation (op = 0) is not supported.
The x86 platform doesn't seem to care about this, and Xen always requires
memory shared with the backend to be cacheable. I'm wondering if we could drop
the call to bus_dmamap_sync, because the cache maintenance slows down the
process for no apparent reason?
For now, WRITE_BARRIER and FLUSH_DISKCACHE are extensions of the WRITE
command, so they require BUS_DMASYNC_PREWRITE for the cache maintenance
operation.
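The mapping described above, where every request type gets a non-zero sync operation, could look roughly like the sketch below. All names and numeric values here are hypothetical stand-ins chosen for illustration; the real definitions live in the bus_dma and Xen blkif headers, and this is not the blkfront code.

```c
/* Hypothetical request types, standing in for the Xen blkif operations. */
enum sketch_blk_op {
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_WRITE_BARRIER,
	SKETCH_OP_FLUSH_DISKCACHE
};

/* Hypothetical flag values, standing in for the BUS_DMASYNC_* flags. */
#define SKETCH_SYNC_PREREAD	1
#define SKETCH_SYNC_PREWRITE	4

/* Pick the sync operation to perform before handing a request to the backend. */
static int
sketch_presync_op(enum sketch_blk_op op)
{
	switch (op) {
	case SKETCH_OP_READ:
		return (SKETCH_SYNC_PREREAD);
	case SKETCH_OP_WRITE:
	case SKETCH_OP_WRITE_BARRIER:	/* treated as an extension of WRITE */
	case SKETCH_OP_FLUSH_DISKCACHE:	/* treated as an extension of WRITE */
		return (SKETCH_SYNC_PREWRITE);
	}
	return (0);	/* not reached for valid operations */
}
```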
Julien Grall [Sun, 4 Oct 2015 19:33:04 +0000 (20:33 +0100)]
xen: arm: Use __typeof__ rather than typeof
Typeof is not portable:
/usr/src/freebsd/sys/xen/hypervisor.h:93:2: error: implicit declaration
of function 'typeof' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
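As an illustration of why the spelling matters, the macro below uses __typeof__, which GCC and Clang accept even in strict C99 mode, where plain typeof is not a keyword and is parsed as an undeclared function. The macro and the demo function are hypothetical and are not taken from hypervisor.h.

```c
/* Swap two lvalues of the same type using the portable keyword spelling. */
#define SKETCH_SWAP(a, b) do {			\
	__typeof__(a) sketch_tmp_ = (a);	\
	(a) = (b);				\
	(b) = sketch_tmp_;			\
} while (0)

/* Swaps 1 and 2, then encodes the result as two decimal digits. */
static int
sketch_swap_demo(void)
{
	int x = 1, y = 2;

	SKETCH_SWAP(x, y);
	return (x * 10 + y);
}
```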
ngie [Mon, 2 Nov 2015 10:08:00 +0000 (10:08 +0000)]
Clean up mtree keyword support a slight bit and add a few more default keywords
- Parameterize the mtree keywords as $DEFAULT_MTREE_KEYWORDS
- Test with the extra mtree keywords, `mode,gid,uid`.
- Add a note about mtrees with time support not working with makefs right now
ngie [Mon, 2 Nov 2015 09:16:51 +0000 (09:16 +0000)]
Add testcases for -t cd9660 -o isolevel=[1-3]
-- -o isolevel=1 currently fails because of path comparison issues,
so mark it as an expected failure.
-- -o isolevel=3 is not implemented, so expect it to fail as an out
of bounds value [*].
adrian [Mon, 2 Nov 2015 03:36:15 +0000 (03:36 +0000)]
mips: rate limit the trap handler output; add pid/tid/program name.
I discovered that we're logging each trap, which gets pretty spendy;
and there wasn't any further information on the pid/tid/progname involved.
I originally noticed this because I don't attach anything to /dev/log and so
the log() output stays going to the kernel. That's an oops on my part, but
I'm glad I did it.
This commit adds the following:
* a rate limiter, which could do with some eyeballs/ideas on how to
make it more predictable on SMP;
* log pid, tid, progname (comm) as part of the output.
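For readers unfamiliar with the idea, a very simple fixed-window rate limiter is sketched below. It is not the limiter added by this commit; the structure and function names are hypothetical, and like the caveat above suggests, a version this simple is not safe against concurrent callers on SMP.

```c
/* Hypothetical limiter state: at most max_per_sec events per one-second window. */
struct sketch_ratelimit {
	long	window_start;	/* second in which the current window began */
	int	count;		/* events allowed so far in this window */
	int	max_per_sec;	/* budget per window */
};

/* Return 1 if an event at time `now` (in seconds) may be logged, else 0. */
static int
sketch_rate_ok(struct sketch_ratelimit *rl, long now)
{
	if (now != rl->window_start) {
		/* A new second has started: open a fresh window. */
		rl->window_start = now;
		rl->count = 0;
	}
	if (rl->count >= rl->max_per_sec)
		return (0);
	rl->count++;
	return (1);
}
```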
melifaro [Sun, 1 Nov 2015 19:59:04 +0000 (19:59 +0000)]
Fix lladdr change propagation for vlans on top of it.
Fix lladdr update when setting mac address manually.
Fix lladdr_event for slave ports addition.
In pfctl_set_debug() we used 'level' without ever initialising it.
We correctly parsed the option, but then failed to actually assign the parsed
value to 'level' before performing the ioctl() to configure the debug level.
mmel [Sun, 1 Nov 2015 16:54:55 +0000 (16:54 +0000)]
Install myself as src committer.
Approved by: kib (mentor)
cem [Sat, 31 Oct 2015 20:38:06 +0000 (20:38 +0000)]
ioat: Handle channel-fatal HW errors safely
Certain invalid operations trigger hardware error conditions. Error
conditions that only halt one channel can be detected and recovered by
resetting the channel. Error conditions that halt the whole device are
generally not recoverable.
Add a sysctl to inject channel-fatal HW errors,
'dev.ioat.<N>.force_hw_error=1'.
When a halt due to a channel error is detected, ioat(4) blocks new
operations from being queued on the channel, completes any outstanding
operations with an error status, and resets the channel before allowing
new operations to be queued again.
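The recovery sequence described above (block new submissions, fail outstanding operations, reset, resume) can be sketched as follows. Everything in this block is hypothetical: the structure, the field names and the error value are illustrative and do not reflect the ioat(4) driver's actual data structures, and the hardware reset is represented only by a counter.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_OPS		8
#define SKETCH_ERR_HALTED	(-1)	/* hypothetical error status */

struct sketch_channel {
	bool		quiescing;		/* true while recovery is running */
	int		status[SKETCH_MAX_OPS];	/* status of each outstanding op */
	size_t		noutstanding;		/* number of outstanding ops */
	unsigned	resets;			/* stand-in for hardware resets */
};

/* Queue one operation; refused while the channel is being recovered. */
static bool
sketch_submit(struct sketch_channel *ch)
{
	if (ch->quiescing || ch->noutstanding == SKETCH_MAX_OPS)
		return (false);
	ch->status[ch->noutstanding++] = 0;
	return (true);
}

/* Recover from a channel-fatal error. */
static void
sketch_recover(struct sketch_channel *ch)
{
	size_t i;

	ch->quiescing = true;			/* 1. block new submissions */
	for (i = 0; i < ch->noutstanding; i++)	/* 2. fail outstanding ops */
		ch->status[i] = SKETCH_ERR_HALTED;
	ch->noutstanding = 0;
	ch->resets++;				/* 3. reset the channel */
	ch->quiescing = false;			/* 4. accept new ops again */
}
```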
Update ioat.4 to document error recovery; document blockfill introduced
in r290021 while we are here; document ioat_put_dmaengine() added in
r289907; document DMA_NO_WAIT added in r289982.
bapt [Sat, 31 Oct 2015 09:45:11 +0000 (09:45 +0000)]
newsyslog: treat 'c' flag in the config as 'C'
When -C was introduced in r114137, the plan was for both -C and -c to mean
"create". Due to a typo in FreeBSD <= 4.8, a temporary compatibility hack was
added to make -c behave like -G (GLOB), and a warning was issued to make users
aware of the future change for -c.
12 years later, it is more than time to remove that hack and finish what was
intended in r114137.
imp [Sat, 31 Oct 2015 04:53:07 +0000 (04:53 +0000)]
The error classification from lower layers is a poor indicator of
whether an error is recoverable. Always re-dirty the buffer on errors
from write requests. The invalidation we used to do for errors other than EIO
doesn't need to be done for a device that's really gone, since that's
done in a different path.
adrian [Sat, 31 Oct 2015 00:29:26 +0000 (00:29 +0000)]
mips: do mips_sync() on sync operations to uncachable memory.
mips24k/mips74k document that we need an explicit SYNC to order
things correctly, even for accesses to uncachable memory.
We were doing calls to SYNC in the cache ops (inv, wbinv) but we
weren't doing it for uncachable memory.
adrian [Sat, 31 Oct 2015 00:04:44 +0000 (00:04 +0000)]
mips74k: use cache-writeback for memory, not writethrough.
When I ported this code from netbsd I was .. slightly mips74k greener.
I used writethrough because (a) it's what netbsd did, and (b) if I used
writeback then things "didn't work."
Fast-forward a couple years, more MIPS hacking and a whole lot more
understanding of the bus APIs (the last few commits notwithstanding;
it's been a long week, ok?) and I have this working for arge,
argemdio, spi and ath. Hans has it working for USB. The ath barrier
code will come in a later commit.
This gets the routing throughput up from 220mbit -> 337mbit.
I'm sure the bridging throughput will be similarly improved.
adrian [Fri, 30 Oct 2015 23:18:02 +0000 (23:18 +0000)]
arge: attempt to close a transmit race by only enabling the descriptor at the end of setup.
This driver and the linux ag71xx driver both treat the transmit ring
as a circular linked list of descriptors. There's no "end" pointer
that is ever NULL - instead, it expects the MAC to hit a finished
descriptor (ARGE_DESC_EMPTY) and stop.
Now, since it's a circular buffer, we may end up with the hardware
hitting the beginning of our multi-descriptor frame before we've finished
setting it up. It then DMA's it in, starts sending it, and we finish
writing out the new descriptor. The hardware may then write its
completion for the next descriptor out; then we do, and when we next
read it it'll show up as "not done" and transmit completion stops.
This unfortunately manifests itself as the transmit queue always
being active and a massive TX interrupt storm. We need to actively
ACK packets back from the transmit engine and if we don't (eg because
we think the transmit isn't finished but it is) then the unit will
just keep generating interrupts.
I hit this finally with the below testing setup. This fixed it for me.
Strictly speaking I should put in a sync in between writing out all of
the descriptors and writing out that final descriptor.
Tested:
* QCA9558 SoC (AP135 reference board) w/ arge1 + vlans acting as a
router, and iperf -d (tcp, bidirectional traffic.)
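The ordering idea, where the first descriptor of a frame is only handed to the hardware once the rest are filled in, might be sketched like this. The descriptor layout, the flag value and the function name are hypothetical and much simpler than the real arge descriptors; the barrier mentioned above is only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: while set, the hardware stops at this descriptor. */
#define SKETCH_DESC_EMPTY	0x80000000u

struct sketch_desc {
	uint32_t ctl;	/* segment length plus the EMPTY flag */
};

/* Queue a frame of nsegs segments starting at ring index `first`. */
static void
sketch_enqueue_frame(struct sketch_desc *ring, size_t ring_size, size_t first,
    const uint32_t *seg_len, size_t nsegs)
{
	size_t i;

	if (nsegs == 0)
		return;

	/* Fill in every descriptor, but keep the first one marked EMPTY. */
	for (i = 0; i < nsegs; i++) {
		uint32_t ctl = seg_len[i] & ~SKETCH_DESC_EMPTY;

		if (i == 0)
			ctl |= SKETCH_DESC_EMPTY;
		ring[(first + i) % ring_size].ctl = ctl;
	}

	/* A real driver would want a sync/barrier at this point. */

	/* Only now let the hardware see the start of the frame. */
	ring[first].ctl &= ~SKETCH_DESC_EMPTY;
}
```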
adrian [Fri, 30 Oct 2015 23:07:32 +0000 (23:07 +0000)]
arge: do an explicit flush between updating the TX ring and starting transmit.
The MIPS busdma sync operations currently are a big no-op on coherent memory.
This isn't strictly correct behaviour as we need a SYNC in here to ensure that
the writes have finished and are visible in main memory before the MMIO accesses
occur. This will have to be addressed in a later commit.
But, before that happens, let's at least do a flush here to make things
more "correct".
This is required for even remotely sensible behaviour on mips74k with
write-through memory enabled.
adrian [Fri, 30 Oct 2015 22:55:41 +0000 (22:55 +0000)]
arge: ensure there's enough space in the TX ring before attempting to
send frames.
This matches the other check for space.
"enough" is a misnomer, for "reasons". The biggest reason is that
the TX ring is actually a circular linked list, with no head/tail pointers.
This is just a bit more headroom between head/tail so we have time to
schedule frames before we hit where the hardware is at.
Ideally this would be tunable and a little larger.
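A space check with extra headroom can be as small as the sketch below. The ring size and headroom values are made up for illustration, and the function is hypothetical rather than the check used in the driver.

```c
/* Hypothetical values; the driver's real ring size and margin may differ. */
#define SKETCH_TX_RING_COUNT	128
#define SKETCH_TX_HEADROOM	4

/*
 * Non-zero if a frame needing nsegs descriptors can be queued while still
 * leaving SKETCH_TX_HEADROOM free slots between software and hardware.
 */
static int
sketch_tx_has_space(unsigned in_use, unsigned nsegs)
{
	return (in_use + nsegs + SKETCH_TX_HEADROOM <= SKETCH_TX_RING_COUNT);
}
```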
adrian [Fri, 30 Oct 2015 22:53:30 +0000 (22:53 +0000)]
arge: do a read-after-write on all arge register writes, not just MDIO writes.
This flushes out the write to the system before anything continues.
The mips74k guide, chapter 3.3.3 (write gathering) notes that writes
can be buffered in FIFOs - even uncached ones - so we can't guarantee
the device has felt its effects. Now, since we're all lazy driver
authors and don't pepper read/write barriers everywhere, fake it here.
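The read-after-write pattern itself is tiny. The helper below is a hypothetical sketch over a plain volatile pointer; a real driver would go through its bus-space accessors rather than dereferencing a pointer directly.

```c
#include <stdint.h>

/*
 * Write a device register, then read it back. The read cannot complete
 * until the preceding write has left any write-gathering buffer, so the
 * device has seen the write by the time this function returns.
 */
static inline void
sketch_reg_write_flush(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
	(void)*reg;	/* read back purely for its ordering side effect */
}
```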
sbruno [Fri, 30 Oct 2015 17:05:52 +0000 (17:05 +0000)]
Not all targets supported by clang have a tested or enabled ubsan yet.
Only enable h_raw on x86 targets for today so that a buildworld runs to
completion for clang enabled targets that are not x86. This should be
removed when validation of the sanitizer has occurred for all targets
supported by FreeBSD and clang.
jimharris [Fri, 30 Oct 2015 16:06:34 +0000 (16:06 +0000)]
nvme: fix race condition in split bio completion path
Fixes race condition observed under following circumstances:
1) I/O split on 128KB boundary with Intel NVMe controller.
Current Intel controllers produce better latency when
I/Os do not span a 128KB boundary - even if the I/O size
itself is less than 128KB.
2) Per-CPU I/O queues are enabled.
3) Child I/Os are submitted on different submission queues.
4) Interrupts for child I/O completions occur almost
simultaneously.
5) ithread for child I/O A increments bio_inbed, then
immediately is preempted (rendezvous IPI, higher priority
interrupt).
6) ithread for child I/O B increments bio_inbed, then completes
parent bio since all children are now completed.
7) parent bio is freed, and immediately reallocated for a VFS
or gpart bio (including setting bio_children to 1 and
clearing bio_driver1).
8) ithread for child I/O A resumes processing. bio_children
for what it thinks is the parent bio is set to 1, so it
thinks it needs to complete the parent bio.
Result is either calling a NULL callback function, or double freeing
the bio to its uma zone.
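One general way to close this kind of race is to read the child count before the atomic increment and to decide based on the value returned by the increment, so that the parent is never touched after another thread may have freed it. The sketch below uses C11 atomics and hypothetical names; it illustrates the technique and is not presented as the exact change made to the driver.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parent bio's bookkeeping fields. */
struct sketch_parent {
	unsigned	children;	/* fixed once all children are issued */
	atomic_uint	inbed;		/* children completed so far */
};

/*
 * Called once per completed child. Returns true for exactly one caller,
 * the one that completed the last child and must finish the parent.
 */
static bool
sketch_child_done(struct sketch_parent *p)
{
	/* Read the child count BEFORE the increment: after it, the parent
	 * may already have been completed and reused by another thread. */
	unsigned children = p->children;
	unsigned inbed = atomic_fetch_add(&p->inbed, 1) + 1;

	return (inbed == children);
}
```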
trasz [Fri, 30 Oct 2015 15:52:10 +0000 (15:52 +0000)]
After r290196, the kernel won't wait for stuff like gmirror nodes
if they are not required for mounting rootfs. However, it's possible
that some setups try to mount them in mountcritlocal (ie from fstab).
Export the list of current root mount holds using a new sysctl,
vfs.root_mount_hold, and make mountcritlocal retry if "mount -a" fails
and the list is not empty.
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3709
hselasky [Fri, 30 Oct 2015 14:50:29 +0000 (14:50 +0000)]
Reduce the DWC OTG interrupt load by not reading all the host channel
status registers for every interrupt. Check a common host channel
status interrupt register first, then conditionally read the
individual host channel status registers.
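The approach can be sketched as a loop that consults a summary bitmask first and reads a per-channel status register only when its bit is set. The channel count, parameter names and array-based "registers" below are hypothetical; a real driver would read the hardware registers instead.

```c
#include <stdint.h>

#define SKETCH_NCHAN	16	/* hypothetical number of host channels */

/*
 * summary: one bit per channel with a pending interrupt.
 * chan_status: per-channel status values (stand-in for register reads).
 * out: receives the status of each pending channel, 0 for the others.
 * Returns the number of per-channel reads that were actually needed.
 */
static unsigned
sketch_service_channels(uint32_t summary, const uint32_t *chan_status,
    uint32_t *out)
{
	unsigned ch, reads = 0;

	for (ch = 0; ch < SKETCH_NCHAN; ch++) {
		if ((summary & (1u << ch)) == 0) {
			out[ch] = 0;	/* nothing pending: skip the read */
			continue;
		}
		out[ch] = chan_status[ch];
		reads++;
	}
	return (reads);
}
```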
Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after: 1 week
zbb [Fri, 30 Oct 2015 12:21:37 +0000 (12:21 +0000)]
Work around KGDB issues on ARM by ignoring ARM EABI version higher than 5
For KGDB to work, it needs to understand the kernel ELF image.
By default the kernel is compiled using EABI_5, which is not supported
by gdb-6. As a workaround, treat these images as EABI_2 because
they share a lot of things in common.
This workaround does not guarantee that ALL functionality
will work.
Submitted by: Wojciech Macek <wma@semihalf.com>
Reviewed by: jhb
Obtained from: Semihalf
Sponsored by: Juniper Networks Inc.
Differential Revision: https://reviews.freebsd.org/D4012