royger [Mon, 19 Oct 2015 14:20:06 +0000 (14:20 +0000)]
xen-netfront: purge page flipping support
Currently neither Linux nor FreeBSD netback supports page flipping. NetBSD
still supports that. It is not sure how many people actually use page
flipping, but page flipping is supposed to be slower than copying nowadays.
It will also shatter frontend / backend address space.
Overall this feature is more of a burden than a benefit.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3888
Sponsored by: Citrix Systems R&D
andrew [Mon, 19 Oct 2015 13:20:23 +0000 (13:20 +0000)]
Use 4 levels of page tables when enabling the MMU. This will allow us to
boot on an SoC that places physical memory at an address past where three
levels of page tables can access in an identity mapping.
Submitted by: Wojciech Macek <wma@semihalf.com>,
Patrick Wildt <patrick@bitrig.org>
Differential Revision: https://reviews.freebsd.org/D3885 (partial)
Differential Revision: https://reviews.freebsd.org/D3744
hselasky [Mon, 19 Oct 2015 12:44:41 +0000 (12:44 +0000)]
Merge LinuxKPI changes from DragonflyBSD:
- Redefine DIV_ROUND_UP as a function macro taking two arguments
instead of none.
- Implement more Linux kernel functions related to various forms
of DELAY() and basic mathematical operations.
hselasky [Mon, 19 Oct 2015 12:26:38 +0000 (12:26 +0000)]
Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.
hselasky [Mon, 19 Oct 2015 11:57:33 +0000 (11:57 +0000)]
Merge LinuxKPI changes from DragonflyBSD:
- Add more list related functions and macros.
- Update the hlist_for_each_entry() macro to take one less argument.
hselasky [Mon, 19 Oct 2015 11:46:48 +0000 (11:46 +0000)]
Merge LinuxKPI changes from DragonflyBSD:
- Reimplement ktime header file to distinguish more from Linux.
- Add new time header file to handle time related Linux functions.
hselasky [Mon, 19 Oct 2015 11:09:51 +0000 (11:09 +0000)]
Merge LinuxKPI changes from DragonflyBSD:
- Avoid using PAGE_MASK, because Linux defines it differently.
Use (PAGE_SIZE - 1) instead.
- Add support for for_each_sg_page() and sg_page_iter_dma_address().
hselasky [Mon, 19 Oct 2015 10:54:24 +0000 (10:54 +0000)]
Merge LinuxKPI changes from DragonflyBSD:
- Added support for multiple new Linux functions.
- Properly implement DEFINE_WAIT() and init_waitqueue_head() macros.
- Removed FreeBSD specific __wait_queue_head structure definition.
adrian [Mon, 19 Oct 2015 01:21:29 +0000 (01:21 +0000)]
otus(4) - use the local node alloc function so there's space for statistics.
* Use the correct malloc type for node allocation - M_80211_NODE - so
the default node free method in net80211 will work correctly.
* Fix otus_node_alloc() to suit FreeBSD's net80211.
* .. and actually call otus_node_alloc() so there's space for the
per-node tx statistics. Otherwise, well, it will be scribbling over
random memory.
adrian [Mon, 19 Oct 2015 01:14:26 +0000 (01:14 +0000)]
otus(4) - add initial monitor mode; use lowest rate for EAPOL
The monitor mode stuff is from the openbsd driver, but it doesn't
100% work. It doesn't seem to get all frames for all BSSes.
However, it's enough to at start debugging things. That 0xffffffff
write is /I think/ the RX filter, but I am still not 100% sure about
it all.
Then, whilst here, use the lowest rate for EAPOL frames. This is just
generally a good thing to do.
zbb [Sun, 18 Oct 2015 22:10:08 +0000 (22:10 +0000)]
Introduce driver for Cavium's ThunderX MDIO
This commit adds support for MDIO present in the ThunderX SoC.
From the FDT point of view it is compatible with "octeon-3860-mdio"
however only C22 mode is used.
The code also implements lmac_if interface functions.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
zbb [Sun, 18 Oct 2015 22:02:58 +0000 (22:02 +0000)]
Introduce initial support for Cavium's ThunderX networking interface
- The driver consists of three main componens: PF, VF, BGX
- Requires appropriate entries in DTS and MDIO driver
- Supports only FDT configuration
- Multiple Tx queues and single Rx queue supported
- No RSS, HW checksum and TSO support
- No more than 8 queues per-IF (only one Queue Set per IF)
- HW statistics enabled
- Works in all available MAC modes (1,10,20,40G)
- Style converted to BSD according to style(9)
- The code brings lmac_if interface used by the BGX driver to
update its logical MACs state.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
zbb [Sun, 18 Oct 2015 21:39:15 +0000 (21:39 +0000)]
Raw import of ThunderX VNIC networking driver components
This import brings following components of the Linux driver:
- Thunder BGX (programmable MAC)
- Physical Function driver
- Virtual Function driver
- Headers
Revision: 1.0
Obtained from: Cavium
License information: Cavium provided these files under BSD license
ian [Sun, 18 Oct 2015 20:37:10 +0000 (20:37 +0000)]
Only decode fdt data which belongs to the GIC controller.
The interrupts-extended property is a list of controller-specific
interrupt tuples for more than one controller. The decode routine of
every PIC gets called in the pre-INTRNG code (nexus doesn't know which
device instance belongs to which fdt node), so the GIC code has to
check each FDT node it is asked to decode to ensure it is the owner.
Because in the pre-INTRNG world there can only be one instance of a GIC,
it's safe to cache the results of a positive lookup in a static variable
to avoid the expensive lookups on subsequent calls.
cem [Sun, 18 Oct 2015 20:20:57 +0000 (20:20 +0000)]
if_ntb: MFV e26a5843: Move MW/DB management to if_ntb
This is the last e26a5843 patch. The general thrust of the rewrite was
to move more responsibility for Memory Window and Doorbell interrupt
management from the ntb_hw driver to if_ntb.
A number of APIs have been added, removed, or replaced. The old
DB callback mechanism has been excised. Instead, callers (if_ntb) are
responsible for configuring MWs and handling their interrupts more
directly.
This adds a tunable, hw.ntb.max_mw_size, allowing users to limit the
size of memory windows used by if_ntb (identical to the Linux modparam
of the same name).
Despite attempts to keep mechanical name changes to separate commits,
some have snuck in here. At least the driver should be much more
similar to the latest Linux one now -- making porting fixes easier.
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
cem [Sun, 18 Oct 2015 20:20:29 +0000 (20:20 +0000)]
NTB: Flesh out the rest of the xeon_setup_b2b_mw changes
Move all Xeon secondary register setup to the setup_b2b_mw routine. We
use subroutines to make it a bit less wordy than the Linux version.
Adds a new tunable, 'hw.ntb.b2b_mw_share'. By default, it is off
(zero). If both sides enable it (any non-zero value), the NTB driver
attempts to use only half of a memory window for remote register MMIO
access.
This is still part of the large Linux rewrite (e26a5843).
Authored by: Allen Hubbe
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
cem [Sun, 18 Oct 2015 20:20:20 +0000 (20:20 +0000)]
NTB: "Split ntb_hw_intel and ntb_transport drivers"
This Linux commit was more or less a rewrite. Unfortunately, the commit
log does not give a lot of context for the rewrite. I have tried to
faithfully follow the changes made upstream, including matching function
names where possible, while churning the FreeBSD driver as little as
possible.
This is the bulk of the rewrite. There are two groups of changes to
follow in separate commits: fleshing out the rest of the changes to
xeon_setup_b2b_mw(), and some changes to if_ntb.
Yes, this is a big patch (3 files changed, 416 insertions(+), 237
deletions(-)), but the Linux patch was 13 files changed, 2,589
additions(+) and 2,195 deletions(-).
Original Linux commit log:
Change ntb_hw_intel to use the new NTB hardware abstraction layer.
Split ntb_transport into its own driver. Change it to use the new NTB
hardware abstraction layer.
Authored by: Allen Hubbe
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
cem [Sun, 18 Oct 2015 20:19:53 +0000 (20:19 +0000)]
NTB: Rename some variables/functions to match Linux
No functional change.
Still part of the huge e26a5843 rewrite. I'm trying to make it less of
a complete rewrite in the FreeBSD version of the driver. Still, it
helps if our names match Linux.
ian [Sun, 18 Oct 2015 19:54:11 +0000 (19:54 +0000)]
Enable ARM_INTRNG on IMX6 platforms, and make the imx_gpio driver an
interrupt controller.
The latter is required for INTRNG, because of the hardware erratum
workaround installed by the linux folks into the imx6 FDT data, which remaps
an ethernet interrupt to the gpio device. In the non-INTRNG world we
intercept the call to map the interrupt and map it back to the ethernet
hardware (because we don't need linux's workaround), but in the INTRNG world
we lose the hookpoint where that remapping was happening, but we gain the
ability to work the way linux does by having the gpio driver dispatch the
interrupt.
mav [Sun, 18 Oct 2015 19:05:56 +0000 (19:05 +0000)]
MFV r289535: 5767 fix several problems with zfs test suite
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: John Wren Kennedy <john.kennedy@delphix.com>
mav [Sun, 18 Oct 2015 18:32:22 +0000 (18:32 +0000)]
MFV r289530: 5847 libzfs_diff should check zfs_prop_get() return
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alexander Eremin <a.eremin@nexenta.com>
ian [Sun, 18 Oct 2015 18:26:19 +0000 (18:26 +0000)]
Import ARM_INTRNG, the "next generation" interrupt architecture for arm
and armv6 architecures. The primary enhancement over the old design is
support for hierarchical interrupt controllers (such as a gpio driver
which can receive interrupts from a root PIC and act as a PIC itself for
clients interested in handling a change of gpio pin state as an
interrupt). The new code also provides an infrastructure for mapping
interrupts described in metadata in the form of a "controller reference
plus interrupt number" tuple into the simple "0-n" flat numeric space
understood by rman and the bus resource mechanisms.
Use of the new code is enabled by setting the ARM_INTRNG option, and by
making a few simple changes to the platform's support code. In addition
each existing PIC driver needs changes to be ready for INTRNG; this commit
contains the changes for the arm/gic driver, which most armv6 SoCs use, but
it does not enable the new code yet on any platform.
This project has been many years in the making, starting as a GSoC project
by Jakub Klama (jceel@) in 2012. That didn't get committed right away and
the source base evolved out from under it to some degree. In 2014 I rebased
the diffs to then -current and did some enhancements in the area of mapping
interrupt numbers and storing associated fdt data, then the project went
cold again for a while. Eventually Svata Kraus took that work in progress
and did another big round of work on it, removing most of the remaining
rough edges. Finally I took that and made one more pass through it, mostly
disabling the "INTR_SOLO" feature for now, pending further design
discussions on how to most efficiently dispatch a pending interrupt through
more than one layer of PIC. The current code with the INTR_SOLO feature
disabled uses approximate 100 extra cpu cycles for each cascaded PIC the
interrupt has to be passed to, so what's left to do is about efficiency, not
correct operation.
mav [Sun, 18 Oct 2015 18:08:33 +0000 (18:08 +0000)]
MFV r289526:
5561 support root pools on EFI/GPT partitioned disks
5125 update zpool/libzfs to manage bootable whole disk pools (EFI/GPT labeled disks)
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
dim [Sun, 18 Oct 2015 17:18:19 +0000 (17:18 +0000)]
Switch the default OpenMP runtime for clang to libomp (from the LLVM
project), as libgomp is not supported anyway. You can use the
devel/llvm-devel port to install a recent copy of the OpenMP runtime.
ian [Sun, 18 Oct 2015 16:54:34 +0000 (16:54 +0000)]
Rename arm_init_secondary_ic() -> arm_pic_init_secondary(). The latter is
the name the function will have when the new ARM_INTRNG code is integrated,
and doing this rename first will make it easier to toggle the new interrupt
handling code on/off with a config option for debugging.
andrew [Sun, 18 Oct 2015 13:23:21 +0000 (13:23 +0000)]
Correctly align the stack. The early csu assumed we passed the aux vector
in through the stack pointer, however this may have been misaligned
causing some userland applications to crash. A workaround was committed in
r284707 where userland would check if the aux vector was passed using the
old or new ABI and adjust the stack if needed. As 4 months have passed it
is time to move to the new ABI, with the expectation the compat code in csu
and the runtime linker to be removed in the future.
melifaro [Sun, 18 Oct 2015 12:26:25 +0000 (12:26 +0000)]
Fix deletion of ifaddr lle entries when deleting prefix from interface in
down state.
Regression appeared in r287789, where the "prefix has no corresponding
installed route" case was forgotten. Additionally, lltable_delete_addr()
was called with incorrect byte order (default is network for lltable code).
While here, improve comments on given cases and byte order.
mav [Sun, 18 Oct 2015 11:44:31 +0000 (11:44 +0000)]
MFC r289498: 6298 zfs_create_008_neg and zpool_create_023_neg need to be
updated for large block support.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Joe Stein <joe.stein@delphix.com>
mav [Sun, 18 Oct 2015 11:36:43 +0000 (11:36 +0000)]
MFV r247180: Update vendor/illumos/dist and vendor-sys/illumos/dist
to illumos-gate 13967:92bec6d87f59
Illumos ZFS issues:
3557 dumpvp_size is not updated correctly when a dump zvol's size is
changed
3558 setting the volsize on a dump device does not return back ENOSPC
3559 setting a volsize larger than the space available sometimes succeeds
mav [Sun, 18 Oct 2015 11:21:08 +0000 (11:21 +0000)]
MFV r289493: 5745 zfs set allows only one dataset property to be set at a time
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Reviewed by: Richard PALO <richard@NetBSD.org>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Approved by: Rich Lowe <richlowe@richlowe.net>
Author: Chris Williamson <chris.williamson@delphix.com>
kib [Sun, 18 Oct 2015 09:33:28 +0000 (09:33 +0000)]
Only marker is guaranteed to be present on the queue after the relock
in vm_pageout_fallback_object_lock() and vm_pageout_page_lock(). The
check for the m->queue == queue assumes that the page does belong to a
queue.
Modify the 'unchanged' calculation bu dereferencing the marker tailq
pointers, which is known to belong to the queue. Since for a page m
linked to the queue, m->queue must be equal to the queue index, assert
this instead of checking.
In collaboration with: alc
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 2 weeks
ngie [Sun, 18 Oct 2015 04:07:40 +0000 (04:07 +0000)]
Only enable -fstack-protector-strong on gcc 4.9+ and default to -fstack-protector
when -fstack-protector-strong is not available, like it was implicitly before
r288669
As noted by antoine@, devel/gcc (which is 4.8.5) lacks -fstack-protector-strong
support, whereas 4.8.4i (devel/gcc48) has the support.
Until a version is available which has -fstack-protector-strong support, be
conservative and only enable support with 4.9+.
ian [Sun, 18 Oct 2015 01:03:43 +0000 (01:03 +0000)]
Fix a strange macro re-definition compile error. If the VM_MAXUSER_ADDRESS
value is defined as a config option the definition is emitted into
opt_global.h which is force-included into everything. In addition, the
symbol is emitted by the genassym mechanism, but that by its nature reduces
the value to a 0xnnnnnnnn number. When compiling a .S file you end up
with two different definitions of the macro (they evaluate to the same
number, but the text is different, upsetting the compiler).
Nothing has changed about this code for a while but the compile error is
new, so this must be fallout from the clang 3.7 update or something.
adrian [Sun, 18 Oct 2015 00:59:28 +0000 (00:59 +0000)]
if_arge: fix up TX workaround; add TX/RX requirements for busdma; add stats
The early ethernet MACs (I think AR71xx and AR913x) require that both
TX and RX require 4-byte alignment for all packets.
The later MACs have started relaxing the requirements.
For now, the 1-byte TX and 1-byte RX alignment requirements are only for
the QCA955x SoCs. I'll add in the relaxed requirements as I review the
datasheets and do testing.
* Add a hardware flags field and 1-byte / 4-byte TX/RX alignment.
* .. defaulting to 4-byte TX and 4-byte RX alignment.
* Only enforce the TX alignment fixup if the hardware requires a 4-byte
TX alignment. This avoids a call to m_defrag().
* Add counters for various situations for further debugging.
* Set the 1-byte and 4-byte busdma alignment requirement when
the tag is created.
This improves the straight bridging performance from 130mbit/sec
to 180mbit/sec, purely by removing the need for TX path bounce buffers.
The main performance issue is the RX alignment requirement and any RX
bounce buffering that's occuring. (In a local test, removing the RX
fixup path and just aligning buffers raises the performance to above
400mbit/sec.
In theory it's a no-op for SoCs before the QCA955x.
Tested:
* QCA9558 SoC in AP135 board, using software bridging between arge0/arge1.
andrew [Sat, 17 Oct 2015 19:52:17 +0000 (19:52 +0000)]
Replace build_section_pagetable with build_l1_block_pagetable as it takes
an extra argument to specify the number of 1GiB pages to map. This should
be a nop as we are only mapping a single page, but when we move to use an
extra level of page tables we will be able to map a second block, e.g. if
the kernel was loaded over a 1GiB boundary.
ngie [Sat, 17 Oct 2015 19:48:17 +0000 (19:48 +0000)]
Only use -fstack-protector-strong with supported compilers
This includes clang 3.5.0+, gcc 4.2.1, gcc 4.8.4+
This allows me to do subdirectory makes again after setting
MAKESYSPATH on 10.2-RELEASE as it comes with clang 3.4.1.
As a sidenote: this isn't technically correct for all vintages
of gcc 4.2.1, but will be correct when gcc is rebuilt/reinstalled
after r286074, so this version check should be good enough.
bdrewery [Sat, 17 Oct 2015 18:22:18 +0000 (18:22 +0000)]
Fix wrong PATH being set for world 'includes' stage after r289438.
The 'includes' target is currently a pseudo target in bsd.subdir.mk that
does 'cd ${.CURDIR} && ${MAKE} buildincludes && ${MAKE} installincludes',
versus all over targets that just recurse.
In Makefile.inc1 the older duplicated bsd.subdir.mk logic for calling
'includes' was being executed in each subdir directly, meaning 'cd lib && make
includes' became 'cd lib && make buildincludes && make installincludes'. Now
that the bsd.subdir.mk logic is used it is calling 'make buildincludes && make
installincludes' from the top-level which pulls in the PATH=<default path>
from /Makefile.
The sub-make logic for 'includes' in bsd.subdir.mk was attempted to be removed
in r289282 but turned out to be wrong. I have a working version now but
it is not yet ready for commit. So for now in Makefile.inc1 split out
'includes' to 'buildincludes' and 'installincludes' which will avoid the
problem.
bdrewery [Sat, 17 Oct 2015 16:42:54 +0000 (16:42 +0000)]
Rework the 'make -n -n' feature such that '-n' recurses and '-N' does not.
Bmake has a documented feature of '-N' to skip executing commands which is
specifically intended for debugging top-level builds and not recursing into
sub-directories. This matches the older 'make -n' behavior we added which made
'-n -n' the recursing target and '-n' a non-recursing target.
Removing the '-n -n' feature allows the build to work as documented in
the bmake manpage with '-n' and '-N'. The older '-n -n' feature was also
not documented anywhere that I could see.
Note that the ${_+_} var is still needed as currently bmake incorrectly
executes '+' commands when '-N' is specified.
The '-n' and '-n -n' features were broken for several reasons prior to this.
r251748 made '_+_' never expand with '-n -n' which resulted in many
sub-directories not being visited until fixed 2 years later in r288391, and
many targets were given .MAKE over the past few years which resulted in
non-sub-make commands, such as rm and ln and mtree, to be executed.
This should also allow removing some indirection hacks in bsd.subdir.mk and
other cases of .USE that have a .MAKE by using '+'.
ngie [Sat, 17 Oct 2015 09:07:53 +0000 (09:07 +0000)]
Set dev->fd to -1 when calling cam_close_spec_device with a valid dev->fd
descriptor to avoid trashing valid file descriptors that access dev->fd at a
later point in time
ngie [Sat, 17 Oct 2015 08:39:37 +0000 (08:39 +0000)]
Integrate tools/regression/acltools into the FreeBSD test suite as tests/sys/acl
- Make the requirements more complete for the testcases
- Detect prerequisites so the tests won't fail (zfs.ko is loaded, zpool(1)
is available, ACL support is enabled with UFS, etc).
- Work with temporary files/directories/mountpoints that work with atf/kyua
- Limit the testcases to work on temporary filesystems to reduce tainting the
test host