cem [Sun, 18 Oct 2015 20:20:29 +0000 (20:20 +0000)]
NTB: Flesh out the rest of the xeon_setup_b2b_mw changes
Move all Xeon secondary register setup to the setup_b2b_mw routine. We
use subroutines to make it a bit less wordy than the Linux version.
Adds a new tunable, 'hw.ntb.b2b_mw_share'. By default, it is off
(zero). If both sides enable it (any non-zero value), the NTB driver
attempts to use only half of a memory window for remote register MMIO
access.
This is still part of the large Linux rewrite (e26a5843).
Authored by: Allen Hubbe
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
cem [Sun, 18 Oct 2015 20:20:20 +0000 (20:20 +0000)]
NTB: "Split ntb_hw_intel and ntb_transport drivers"
This Linux commit was more or less a rewrite. Unfortunately, the commit
log does not give a lot of context for the rewrite. I have tried to
faithfully follow the changes made upstream, including matching function
names where possible, while churning the FreeBSD driver as little as
possible.
This is the bulk of the rewrite. There are two groups of changes to
follow in separate commits: fleshing out the rest of the changes to
xeon_setup_b2b_mw(), and some changes to if_ntb.
Yes, this is a big patch (3 files changed, 416 insertions(+), 237
deletions(-)), but the Linux patch was 13 files changed, 2,589
additions(+) and 2,195 deletions(-).
Original Linux commit log:
Change ntb_hw_intel to use the new NTB hardware abstraction layer.
Split ntb_transport into its own driver. Change it to use the new NTB
hardware abstraction layer.
Authored by: Allen Hubbe
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
cem [Sun, 18 Oct 2015 20:19:53 +0000 (20:19 +0000)]
NTB: Rename some variables/functions to match Linux
No functional change.
Still part of the huge e26a5843 rewrite. I'm trying to make it less of
a complete rewrite in the FreeBSD version of the driver. Still, it
helps if our names match Linux.
ian [Sun, 18 Oct 2015 19:54:11 +0000 (19:54 +0000)]
Enable ARM_INTRNG on IMX6 platforms, and make the imx_gpio driver an
interrupt controller.
The latter is required for INTRNG, because of the hardware erratum
workaround installed by the linux folks into the imx6 FDT data, which remaps
an ethernet interrupt to the gpio device. In the non-INTRNG world we
intercept the call to map the interrupt and map it back to the ethernet
hardware (because we don't need linux's workaround), but in the INTRNG world
we lose the hookpoint where that remapping was happening, but we gain the
ability to work the way linux does by having the gpio driver dispatch the
interrupt.
mav [Sun, 18 Oct 2015 19:05:56 +0000 (19:05 +0000)]
MFV r289535: 5767 fix several problems with zfs test suite
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: John Wren Kennedy <john.kennedy@delphix.com>
mav [Sun, 18 Oct 2015 18:32:22 +0000 (18:32 +0000)]
MFV r289530: 5847 libzfs_diff should check zfs_prop_get() return
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alexander Eremin <a.eremin@nexenta.com>
ian [Sun, 18 Oct 2015 18:26:19 +0000 (18:26 +0000)]
Import ARM_INTRNG, the "next generation" interrupt architecture for arm
and armv6 architecures. The primary enhancement over the old design is
support for hierarchical interrupt controllers (such as a gpio driver
which can receive interrupts from a root PIC and act as a PIC itself for
clients interested in handling a change of gpio pin state as an
interrupt). The new code also provides an infrastructure for mapping
interrupts described in metadata in the form of a "controller reference
plus interrupt number" tuple into the simple "0-n" flat numeric space
understood by rman and the bus resource mechanisms.
Use of the new code is enabled by setting the ARM_INTRNG option, and by
making a few simple changes to the platform's support code. In addition
each existing PIC driver needs changes to be ready for INTRNG; this commit
contains the changes for the arm/gic driver, which most armv6 SoCs use, but
it does not enable the new code yet on any platform.
This project has been many years in the making, starting as a GSoC project
by Jakub Klama (jceel@) in 2012. That didn't get committed right away and
the source base evolved out from under it to some degree. In 2014 I rebased
the diffs to then -current and did some enhancements in the area of mapping
interrupt numbers and storing associated fdt data, then the project went
cold again for a while. Eventually Svata Kraus took that work in progress
and did another big round of work on it, removing most of the remaining
rough edges. Finally I took that and made one more pass through it, mostly
disabling the "INTR_SOLO" feature for now, pending further design
discussions on how to most efficiently dispatch a pending interrupt through
more than one layer of PIC. The current code with the INTR_SOLO feature
disabled uses approximate 100 extra cpu cycles for each cascaded PIC the
interrupt has to be passed to, so what's left to do is about efficiency, not
correct operation.
mav [Sun, 18 Oct 2015 18:08:33 +0000 (18:08 +0000)]
MFV r289526:
5561 support root pools on EFI/GPT partitioned disks
5125 update zpool/libzfs to manage bootable whole disk pools (EFI/GPT labeled disks)
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
dim [Sun, 18 Oct 2015 17:18:19 +0000 (17:18 +0000)]
Switch the default OpenMP runtime for clang to libomp (from the LLVM
project), as libgomp is not supported anyway. You can use the
devel/llvm-devel port to install a recent copy of the OpenMP runtime.
ian [Sun, 18 Oct 2015 16:54:34 +0000 (16:54 +0000)]
Rename arm_init_secondary_ic() -> arm_pic_init_secondary(). The latter is
the name the function will have when the new ARM_INTRNG code is integrated,
and doing this rename first will make it easier to toggle the new interrupt
handling code on/off with a config option for debugging.
andrew [Sun, 18 Oct 2015 13:23:21 +0000 (13:23 +0000)]
Correctly align the stack. The early csu assumed we passed the aux vector
in through the stack pointer, however this may have been misaligned
causing some userland applications to crash. A workaround was committed in
r284707 where userland would check if the aux vector was passed using the
old or new ABI and adjust the stack if needed. As 4 months have passed it
is time to move to the new ABI, with the expectation the compat code in csu
and the runtime linker to be removed in the future.
melifaro [Sun, 18 Oct 2015 12:26:25 +0000 (12:26 +0000)]
Fix deletion of ifaddr lle entries when deleting prefix from interface in
down state.
Regression appeared in r287789, where the "prefix has no corresponding
installed route" case was forgotten. Additionally, lltable_delete_addr()
was called with incorrect byte order (default is network for lltable code).
While here, improve comments on given cases and byte order.
mav [Sun, 18 Oct 2015 11:44:31 +0000 (11:44 +0000)]
MFC r289498: 6298 zfs_create_008_neg and zpool_create_023_neg need to be
updated for large block support.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Joe Stein <joe.stein@delphix.com>
mav [Sun, 18 Oct 2015 11:36:43 +0000 (11:36 +0000)]
MFV r247180: Update vendor/illumos/dist and vendor-sys/illumos/dist
to illumos-gate 13967:92bec6d87f59
Illumos ZFS issues:
3557 dumpvp_size is not updated correctly when a dump zvol's size is
changed
3558 setting the volsize on a dump device does not return back ENOSPC
3559 setting a volsize larger than the space available sometimes succeeds
mav [Sun, 18 Oct 2015 11:21:08 +0000 (11:21 +0000)]
MFV r289493: 5745 zfs set allows only one dataset property to be set at a time
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Reviewed by: Richard PALO <richard@NetBSD.org>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Approved by: Rich Lowe <richlowe@richlowe.net>
Author: Chris Williamson <chris.williamson@delphix.com>
kib [Sun, 18 Oct 2015 09:33:28 +0000 (09:33 +0000)]
Only marker is guaranteed to be present on the queue after the relock
in vm_pageout_fallback_object_lock() and vm_pageout_page_lock(). The
check for the m->queue == queue assumes that the page does belong to a
queue.
Modify the 'unchanged' calculation bu dereferencing the marker tailq
pointers, which is known to belong to the queue. Since for a page m
linked to the queue, m->queue must be equal to the queue index, assert
this instead of checking.
In collaboration with: alc
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 2 weeks
ngie [Sun, 18 Oct 2015 04:07:40 +0000 (04:07 +0000)]
Only enable -fstack-protector-strong on gcc 4.9+ and default to -fstack-protector
when -fstack-protector-strong is not available, like it was implicitly before
r288669
As noted by antoine@, devel/gcc (which is 4.8.5) lacks -fstack-protector-strong
support, whereas 4.8.4i (devel/gcc48) has the support.
Until a version is available which has -fstack-protector-strong support, be
conservative and only enable support with 4.9+.
ian [Sun, 18 Oct 2015 01:03:43 +0000 (01:03 +0000)]
Fix a strange macro re-definition compile error. If the VM_MAXUSER_ADDRESS
value is defined as a config option the definition is emitted into
opt_global.h which is force-included into everything. In addition, the
symbol is emitted by the genassym mechanism, but that by its nature reduces
the value to a 0xnnnnnnnn number. When compiling a .S file you end up
with two different definitions of the macro (they evaluate to the same
number, but the text is different, upsetting the compiler).
Nothing has changed about this code for a while but the compile error is
new, so this must be fallout from the clang 3.7 update or something.
adrian [Sun, 18 Oct 2015 00:59:28 +0000 (00:59 +0000)]
if_arge: fix up TX workaround; add TX/RX requirements for busdma; add stats
The early ethernet MACs (I think AR71xx and AR913x) require that both
TX and RX require 4-byte alignment for all packets.
The later MACs have started relaxing the requirements.
For now, the 1-byte TX and 1-byte RX alignment requirements are only for
the QCA955x SoCs. I'll add in the relaxed requirements as I review the
datasheets and do testing.
* Add a hardware flags field and 1-byte / 4-byte TX/RX alignment.
* .. defaulting to 4-byte TX and 4-byte RX alignment.
* Only enforce the TX alignment fixup if the hardware requires a 4-byte
TX alignment. This avoids a call to m_defrag().
* Add counters for various situations for further debugging.
* Set the 1-byte and 4-byte busdma alignment requirement when
the tag is created.
This improves the straight bridging performance from 130mbit/sec
to 180mbit/sec, purely by removing the need for TX path bounce buffers.
The main performance issue is the RX alignment requirement and any RX
bounce buffering that's occuring. (In a local test, removing the RX
fixup path and just aligning buffers raises the performance to above
400mbit/sec.
In theory it's a no-op for SoCs before the QCA955x.
Tested:
* QCA9558 SoC in AP135 board, using software bridging between arge0/arge1.
andrew [Sat, 17 Oct 2015 19:52:17 +0000 (19:52 +0000)]
Replace build_section_pagetable with build_l1_block_pagetable as it takes
an extra argument to specify the number of 1GiB pages to map. This should
be a nop as we are only mapping a single page, but when we move to use an
extra level of page tables we will be able to map a second block, e.g. if
the kernel was loaded over a 1GiB boundary.
ngie [Sat, 17 Oct 2015 19:48:17 +0000 (19:48 +0000)]
Only use -fstack-protector-strong with supported compilers
This includes clang 3.5.0+, gcc 4.2.1, gcc 4.8.4+
This allows me to do subdirectory makes again after setting
MAKESYSPATH on 10.2-RELEASE as it comes with clang 3.4.1.
As a sidenote: this isn't technically correct for all vintages
of gcc 4.2.1, but will be correct when gcc is rebuilt/reinstalled
after r286074, so this version check should be good enough.
bdrewery [Sat, 17 Oct 2015 18:22:18 +0000 (18:22 +0000)]
Fix wrong PATH being set for world 'includes' stage after r289438.
The 'includes' target is currently a pseudo target in bsd.subdir.mk that
does 'cd ${.CURDIR} && ${MAKE} buildincludes && ${MAKE} installincludes',
versus all over targets that just recurse.
In Makefile.inc1 the older duplicated bsd.subdir.mk logic for calling
'includes' was being executed in each subdir directly, meaning 'cd lib && make
includes' became 'cd lib && make buildincludes && make installincludes'. Now
that the bsd.subdir.mk logic is used it is calling 'make buildincludes && make
installincludes' from the top-level which pulls in the PATH=<default path>
from /Makefile.
The sub-make logic for 'includes' in bsd.subdir.mk was attempted to be removed
in r289282 but turned out to be wrong. I have a working version now but
it is not yet ready for commit. So for now in Makefile.inc1 split out
'includes' to 'buildincludes' and 'installincludes' which will avoid the
problem.
bdrewery [Sat, 17 Oct 2015 16:42:54 +0000 (16:42 +0000)]
Rework the 'make -n -n' feature such that '-n' recurses and '-N' does not.
Bmake has a documented feature of '-N' to skip executing commands which is
specifically intended for debugging top-level builds and not recursing into
sub-directories. This matches the older 'make -n' behavior we added which made
'-n -n' the recursing target and '-n' a non-recursing target.
Removing the '-n -n' feature allows the build to work as documented in
the bmake manpage with '-n' and '-N'. The older '-n -n' feature was also
not documented anywhere that I could see.
Note that the ${_+_} var is still needed as currently bmake incorrectly
executes '+' commands when '-N' is specified.
The '-n' and '-n -n' features were broken for several reasons prior to this.
r251748 made '_+_' never expand with '-n -n' which resulted in many
sub-directories not being visited until fixed 2 years later in r288391, and
many targets were given .MAKE over the past few years which resulted in
non-sub-make commands, such as rm and ln and mtree, to be executed.
This should also allow removing some indirection hacks in bsd.subdir.mk and
other cases of .USE that have a .MAKE by using '+'.
ngie [Sat, 17 Oct 2015 09:07:53 +0000 (09:07 +0000)]
Set dev->fd to -1 when calling cam_close_spec_device with a valid dev->fd
descriptor to avoid trashing valid file descriptors that access dev->fd at a
later point in time
ngie [Sat, 17 Oct 2015 08:39:37 +0000 (08:39 +0000)]
Integrate tools/regression/acltools into the FreeBSD test suite as tests/sys/acl
- Make the requirements more complete for the testcases
- Detect prerequisites so the tests won't fail (zfs.ko is loaded, zpool(1)
is available, ACL support is enabled with UFS, etc).
- Work with temporary files/directories/mountpoints that work with atf/kyua
- Limit the testcases to work on temporary filesystems to reduce tainting the
test host
bdrewery [Sat, 17 Oct 2015 05:57:29 +0000 (05:57 +0000)]
For 'buildenvvars' show any .exported variables as well to cover recent
exporting of OSRELDATE and VERSION. These already do export to 'buildenv'
fine.
bdrewery [Sat, 17 Oct 2015 05:55:45 +0000 (05:55 +0000)]
Always export VERSION to the environment to avoid looking it up again in
sub-makes.
Some of the world phases that used plain '${MAKE} -f Makefile.inc1' were not
passing this variable along which caused them to look it up again. By
using bmake's .export we can remove it from all of the other environment
lines.
Add a comment about the usage for VERSION for ctfmerge.
ngie [Sat, 17 Oct 2015 04:32:21 +0000 (04:32 +0000)]
Integrate tools/test/posixshm and tools/regression/posixshm into the FreeBSD
test suite as tests/sys/posixshm
Some other highlights:
- Convert the testcases over to ATF
- Don't use hardcoded paths to /tmp (which violate the ATF/kyua samdbox); use
mkstemp to generate temporary paths for non-SHM_ANON shm objects.
bdrewery [Sat, 17 Oct 2015 03:51:50 +0000 (03:51 +0000)]
Rework the world subdir build targets to use the standard SUBDIR_PARALLEL mechanism.
Back in r30113, the 'par-*' targets were added to parallelize portions of
the build in a very similar fashion as the SUBDIR_PARALLEL feature used in
r263778. Calling a target without 'par-' (for 'parallel') resulted in the
standard bsd.subdir.mk handling without parallelization. Given we have
SUBDIR_PARALLEL now there is no reason to duplicate the handling here.
In build logs this will result in the ${dir}.${target}__D targets now showing
as the normal ${target}_subdir_${dir} targets.
I audited all of the uses of Makefile.inc1 and Makefile's targets that use
bsd.subdir.mk and found that all but 'all' and 'install' were fine to use
as always parallel.
- For 'install' (from installworld -j) the ordering of lib/ and libexec/
before the rest of the system (described in r289433), and etc/ being last
(described in r289435), is all that matters. So now a .WAIT is added in
the proper places when invoking any 'install*' target. A parallel
installworld does work and took 46% of the time a non-parallel
install would take on my system with -j15 to ZFS.
- For 'all' I left the default handling for this to not run in parallel. A
'par-all' target is still used by the 'everything' stage of buildworld
to continue building in parallel as it already has been. This works
because most of the dependencies are handled by the early bootstrap
phases as well as 'libraries' and 'includes' phases. This lets
all of the SUBDIR build in parallel fine, such as bin/ and lib/. This
will not work if the user invokes 'all' though as we have dependencies
spread all over the system with no way to depend between them (except
for the dirdeps feature in the META_MODE build). Calling 'make all'
from the top-level is still useful at least when using SUBDIR_OVERRIDE.
bdrewery [Fri, 16 Oct 2015 23:53:37 +0000 (23:53 +0000)]
Fix adding manpages installed by LOCAL_DIRS to whatis file.
The ordering of 'etc' in the install has a long history dating back to the
first time it was realized it needed to be "last" in r4486. That commit
still left it before LOCAL_DIRS though. By having it before LOCAL_DIRS
any manpages they install were not being added to the whatis database in the
install image. They would likely show up in the file after a periodic
rebuild of the file though.
Currently the whatis file is built by an 'afterinstall' hook in etc/Makefile
that calls share/man's 'makedb' target.
bdrewery [Fri, 16 Oct 2015 21:09:15 +0000 (21:09 +0000)]
Correct a bitrotted comment about installworld order requirements.
The case of make(1) using a new /bin/sh issue was fixed in r173219 when ITOOLS
was introduced.
There are still issues with mid-install errors leaving a system unusable that
are currently non-trivial to solve. The safest ordering requires installing
rtld, libc and libthr (in that order) before anything else. We don't do that
now though. Much improvement is needed here still.
Discussed with: kip and kan (rtld/library ordering)
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
This is only a partial merge of respective ZFS infrastructure changes.
At this moment FreeBSD kernel has no those crypto algorithms, so the
parts of the code to enable them are commented out. When they are
implemented, it will be trivial to plug them in.
des [Fri, 16 Oct 2015 12:21:44 +0000 (12:21 +0000)]
Fix two bugs in HTTPS tunnelling:
- If the proxy returns a non-200 result, set the error code accordingly
so the caller / user gets a somewhat meaningful error message.
- Consume and discard any HTTP response header following the result line.
ed [Fri, 16 Oct 2015 10:26:15 +0000 (10:26 +0000)]
Use the right variable name.
MACHINE_CPUARCH expands to aarch64 for arm64, whereas MACHINE always
corresponds to the directory name under sys/ that contains the sources
for that architecture.
bdrewery [Fri, 16 Oct 2015 05:54:41 +0000 (05:54 +0000)]
Avoid warning race with creating 'ldscripts' directory during build.
In r204548 the 'rm -f ldscripts' was added likely due to reading the
conditional as 'else it is a file'. That seems unlikely though and
the more likely case is just that the directory hasn't been created yet.
Because this races with ,ssother scripts, use 'mkdir -p' which is a minimal
modification vs upstream to avoid the warning of 'File exists' if another
script creates it first. This could replace the 'test -d' as well but
then it's more unneeded change to the upstream script.
imp [Fri, 16 Oct 2015 03:06:02 +0000 (03:06 +0000)]
Do not relocate extents to make them contiguous if the underlying drive can do
deletions. Ability to do deletions is a strong indication that this
optimization will not help performance. It will only generate extra write
traffic. These devices are typically flash based and have a limited number of
write cycles. In addition, making the file contiguous in LBA space doesn't
improve the access times from flash devices because they have no seek time.
bdrewery [Fri, 16 Oct 2015 00:38:05 +0000 (00:38 +0000)]
Rename the /usr/share/doc/legal files to driver.LICENSE to work around
bug of installing 'realtek' and 'intel_iwn' as files rather then as
a 'LICENSE' file in their directories.
Also add obsolete entries for the older names and names that existed in head
for a period of time.
cem [Thu, 15 Oct 2015 23:45:43 +0000 (23:45 +0000)]
NTB: Add variable number MW, DB CB support code
This is a follow-up to r289208: "Xeon Errata Workaround."
Add logic to support a variable number of memory windows and doorbell
callbacks. This was added to the Linux driver in the "Xeon Errata
Workaround" commit, but I skipped it because it didn't look neccessary
at the time. It is needed for future Haswell split-BAR support, so
bring it in now.
A new tunable was added for if_ntb, 'hw.ntb.max_num_clients'. By
default, it is set to zero -- infer the number of clients from the
number of memory windows available from the hardware. Any other
positive value can specify a different number of clients, limited by the
number of doorbell callbacks available (4 under MSI-X, or 15 (Xeon) or
34 (SoC) under legacy INTx).
bdrewery [Thu, 15 Oct 2015 22:49:56 +0000 (22:49 +0000)]
Make installing to a non-existent directory an error.
Before this, if a file was installed to DESTDIR/some/dir and that directory
was missing due to not having ran 'make distrib-dirs' yet, the file would
be installed as 'some/dir'. For something like bsd.incs.mk with INCLUDEDIR
being a sub-directory of /usr/include, this could result in all of the headers
being installed to a file rather than getting a directory of them.
Now it will error that the file/directory does not exist rather than hide
the issue.
Another option being discussed is to implement GNU's install -D flag which
would auto create any missing directories.
This is a mitigation of the problem. The proper order to the build is to
run 'make distrib-dirs' first, but that can be forgotten if building from
a sub-directory after updating the source code to the latest revision.