The NetBSD support update before the LLVM-8.0 branching point
Prepared by Michał Górny (mgorny AT gentoo.org).
I have recently been helping the NetBSD developers improve support for this operating system in various LLVM components. My first task in this endeavor was to fix build and test issues in as many LLVM projects as time permitted, and to get them all covered by the NetBSD LLVM buildbot.
Including more projects in the continuous integration builds is important because it lets us catch regressions and new issues in NetBSD support promptly. It is beneficial not only because it lets us find offending commits easily, but also because it makes other LLVM developers aware of NetBSD porting issues and increases the chances that patch authors will fix their mistakes themselves.
Initial buildbot setup and issues
The buildbot setup used by NetBSD is largely based on the LLDB setup originally used by Android, published in the lldb-utils repository. To make the necessary changes, I forked it as netbsd-llvm-build, and Kamil Rytarowski updated the buildbot configuration to use our setup.
Initially, the very high memory use of GNU ld combined with a high job count caused our builds to swap significantly. As a result, the builds were very slow and were frequently terminated for producing no output, because buildbot presumed that they had hung. The fix for this problem consisted of two changes.
Firstly, I have extended the building script to periodically report that it is still active. This ensured that even during prolonged linking buildbot would receive some output and would not terminate the build prematurely.
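The periodic activity report can be sketched as follows. This is only an illustration in Python, not the actual netbsd-llvm-build script; the function name run_with_heartbeat, the message text and the 60-second default interval are all assumptions made for the example.

```python
import subprocess
import sys
import threading


def run_with_heartbeat(cmd, interval=60.0, out=sys.stdout):
    """Run cmd and write a heartbeat line to `out` every `interval` seconds.

    Hypothetical helper: the name, message and default interval are
    assumptions, not taken from the real build script.
    """
    done = threading.Event()

    def heartbeat():
        # Event.wait() returns False on timeout, meaning cmd is still running.
        while not done.wait(interval):
            out.write("[heartbeat] build still running\n")
            out.flush()

    ticker = threading.Thread(target=heartbeat, daemon=True)
    ticker.start()
    try:
        # Blocks until the command exits and returns its exit status.
        return subprocess.call(cmd)
    finally:
        done.set()
        ticker.join()
```

A long link step would then be wrapped as, for example, run_with_heartbeat(["ninja", "-j", "2"]), so that buildbot keeps seeing output even while the linker itself is silent.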
Secondly, I have split the build task into two parts. The first part uses the full ninja job count to build all static libraries. The second part runs with a reduced job count to build everything else. Since most LLVM source files are part of static libraries, this makes it possible to build as much as possible with the full job count, and to reduce it only where necessary, for the later GNU ld invocations.
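A rough sketch of the two-part split follows. How the real script selects the static library targets is not described here, so this example assumes they are picked out of the output of "ninja -t targets all", whose lines have the form "output: rule". The helper names and job counts are hypothetical.

```python
def static_library_targets(targets_listing):
    """Extract targets ending in .a from `ninja -t targets all` output."""
    libs = []
    for line in targets_listing.splitlines():
        # Each line is expected to look like "<output path>: <rule name>".
        target, sep, _rule = line.rpartition(": ")
        if sep and target.endswith(".a"):
            libs.append(target)
    return libs


def build_commands(targets_listing, full_jobs, reduced_jobs):
    """Return the ninja invocations: static libraries first, then the rest.

    Hypothetical helper; selecting libraries by the .a suffix is an
    assumption about how the split could be done.
    """
    commands = []
    libs = static_library_targets(targets_listing)
    if libs:
        # Compiling and archiving is cheap on memory: use every job slot.
        commands.append(["ninja", "-j", str(full_jobs)] + libs)
    # Everything left is mostly linking with GNU ld: use fewer jobs.
    commands.append(["ninja", "-j", str(reduced_jobs)])
    return commands
```

The second invocation names no targets, so ninja builds its default targets and reuses everything already produced by the first one.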
While working on this setup, we have been informed that the buildbot setup based on external scripts is a legacy design, and that it would be preferable to update it to define buildbot rules directly. However, we have agreed to defer that until our builds mature, as external scripts are more flexible and can be updated without having to request a restart of the LLVM buildbot.
The NetBSD buildbot is part of the LLVM buildbot setup, and can be accessed via http://lab.llvm.org:8011/builders/lldb-amd64-ninja-netbsd8. The same machine is also used to run GDB and binutils build tests.
RPATH setup for LLVM builds
Another problem that needed solving was fixing RPATH in built executables to include /usr/pkg/lib, which is necessary to find dependencies installed via pkgsrc. Normally, the LLVM build system sets RPATH itself, using a path based on $ORIGIN. However, I have been informed that NetBSD discourages the use of $ORIGIN, so I looked for a better solution.
Eventually, after some experimentation I have come up with the following CMake parameters:
-DCMAKE_BUILD_RPATH="${PWD}/lib;/usr/pkg/lib" -DCMAKE_INSTALL_RPATH=/usr/pkg/lib
This explicitly disables the standard logic used by LLVM. The build-time RPATH includes the build directory explicitly, to ensure that freshly built shared libraries are preferred at build time (e.g. when running tests) over those from a previous pkgsrc install; this directory is removed from RPATH afterwards, when installing.
Building and testing more LLVM sub-projects
So far, the following projects have been included in the LLVM buildbot runs:
llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool
clang: C/C++ compiler
clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)
lld: link editor
polly: loop and data-locality optimizer for LLVM
openmp: OpenMP runtime library
libunwind: unwinder library
libcxxabi: low-level support library for libcxx
libcxx: implementation of C++ standard library
lldb: debugger; built without test suite at the moment
Additionally, the following project was considered but it was ultimately skipped as it was not ready for wider testing yet:
llgo: Go compiler
My project fixes
During my work, I have been trying to upstream all the necessary changes as soon as possible, to avoid creating an additional burden of maintaining local patches. This section provides a short list of all patches that have either been merged upstream or are waiting for review.
LLVM
fixed tests to work without explicit /usr/pkg/bin/python: r348095
disabled tests failing due to touch -a not working on filesystems mounted with noatime option: r348354, r348355
Waiting for upstream review:
Clang
Waiting for upstream review:
LLD
openmp
libcxx
worked around test failing due to tv_sec=-1 not working: r348967
XFAIL-ed uchar.h test as the header is not present on NetBSD: r348973
Waiting for upstream review:
pkgsrc
z3 version bump (submitted to maintainer, waiting for reply)
NetBSD portability
During my work, I have come across a few interesting divergences between the assumptions made by LLVM developers and the actual behavior of NetBSD. While some of them might be considered bugs, we determined it was preferable to support the current behavior in LLVM. In this section I briefly describe each of them, and indicate the path I took to make LLVM work.
unwind.h
The problem with the unwind.h header is part of a bigger issue: while the unwinder API is somewhat defined as part of the system ABI, most systems have no single well-defined implementation. In practice, there are multiple implementations both of the unwinding library and of its headers:
gcc: it implements unwinder library in libgcc; also, has its own unwind.h on Linux (but not on NetBSD)
clang: it has its own unwind.h (but no library)
'non-GNU' libunwind: stand-alone implementation of library and headers
llvm-libunwind: stand-alone implementation of library and headers
libexecinfo: provides unwinder library and unwind.h on NetBSD
Since gcc does not provide unwind.h on NetBSD, using it to build LLVM normally results in the built-in unwinder library from GCC being combined with the unwind.h installed as part of libexecinfo. However, the API defined by the latter header is type-incompatible with most of the other implementations, and this caused the libc++abi build to fail.
In order to resolve the build issue, we agreed to use LLVM's own unwinder implementation (llvm-libunwind) which we were building anyway, via the following CMake option:
-DLIBCXXABI_USE_LLVM_UNWINDER=ON
I have started a thread about fixing unwind.h to be more compatible.
noatime behavior
noatime is a filesystem mount option that is meant to inhibit atime updates on file accesses. This is usually done to avoid spurious inode writes when performing read-only operations. However, unlike other implementations, NetBSD not only disables automatic atime updates but also blocks explicit updates made via the utime() family of functions.
Technically, this behavior is permitted by POSIX as it permits implementation-defined behavior on updating atimes. However, a small number of LLVM tests explicitly rely on being able to set atime on a test file, and the NetBSD behavior causes them to fail. Without a way to set atime, we had to mark those tests unsupported.
I have started a thread about noatime behavior on tech-kern.
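The noatime behavior described above can be detected at run time with a small probe. The following Python sketch is not what the LLVM tests do (they used touch -a and were simply marked unsupported); the function name can_set_atime and the chosen timestamp are assumptions made for illustration.

```python
import os
import tempfile


def can_set_atime(directory=None):
    """Return True if an explicit atime update on a file in `directory` sticks.

    Hypothetical probe: creates a temporary file, requests a specific
    atime and checks whether the filesystem actually recorded it.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        wanted_ns = 1_000_000_000 * 10**9  # arbitrary whole second in 2001
        mtime_ns = os.stat(path).st_mtime_ns
        try:
            # Set atime explicitly while leaving mtime at its current value.
            os.utime(path, ns=(wanted_ns, mtime_ns))
        except OSError:
            return False
        # stat() itself does not update atime, so this reads back our value
        # unless the update was ignored.
        return os.stat(path).st_atime_ns == wanted_ns
    finally:
        os.unlink(path)
```

A test harness could call such a probe on the test directory and skip the atime-dependent tests when it returns False.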
__func__ value
__func__ is defined by the standard to be an arbitrary form of function identifier. On most of the other systems, it is equal to the value of __FUNCTION__ defined by gcc, that is the undecorated function name. However, NetBSD system headers conditionally override this to __PRETTY_FUNCTION__, that is a full function prototype.
This has caused one of the LLVM tests to fail due to matching debug output. Admittedly, this was definitely a problem with the test (since __func__ can have an arbitrary value) and I have fixed it to permit the pretty function form.
Kamil Rytarowski has noted that the override is probably accidental rather than intended, since the header was not updated for C++11 compilers providing __func__, and has started a thread about disabling it.
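To illustrate what permitting the pretty function form means, here is a small Python sketch of a pattern that accepts a function name either bare or embedded in a full prototype. The actual LLVM test and its matching syntax are not reproduced here; func_name_pattern is a hypothetical helper.

```python
import re


def func_name_pattern(name):
    """Regex accepting `name` bare or inside a full function prototype.

    Hypothetical helper for illustration only.
    """
    # Optional prefix (return type, namespace, pointer or reference marks),
    # then the exact name, then an optional parameter list and qualifiers.
    return re.compile(
        r"^(?:.*[\s:*&])?" + re.escape(name) + r"(?:\(.*\).*)?$"
    )


pattern = func_name_pattern("foo")
print(bool(pattern.match("foo")))            # undecorated __FUNCTION__ form
print(bool(pattern.match("void foo(int)")))  # __PRETTY_FUNCTION__ form
```

The name must be followed either by the end of the string or by an opening parenthesis, so a different function whose name merely starts with the same characters is not accepted.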
tar -t output
Another difference I noticed while investigating test failures was in the output of tar -t (listing files inside a tarball). Curiously enough, both GNU tar and libarchive use C-style escapes in the file list output, whereas NetBSD pax/tar outputs the filenames raw.
The test was meant to verify whether a backslash in filenames is archived properly (i.e. not treated as equivalent to a forward slash). It failed because it expected the backslash to be escaped. I was able to fix it by permitting both forms, as the exact treatment of the backslash was not relevant to the test case at hand.
I have compared different tar implementations including NetBSD pax in the article portability of tar features.
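Permitting both forms of the listing can be sketched as below. This is a Python illustration rather than the actual test change; listing_pattern is a hypothetical helper that accepts each backslash in a filename either raw or doubled as a C-style escape.

```python
import re

# In the listing, a backslash is either printed raw (NetBSD pax/tar) or
# doubled as a C-style escape (GNU tar, libarchive).
BACKSLASH = r"\\{1,2}"


def listing_pattern(filename):
    """Regex matching `filename` in tar -t output, raw or escaped.

    Hypothetical helper for illustration only.
    """
    parts = [re.escape(part) for part in filename.split("\\")]
    return re.compile("^" + BACKSLASH.join(parts) + "$")


pattern = listing_pattern("a\\b")          # the file name a\b
print(bool(pattern.match("a\\b")))         # raw form
print(bool(pattern.match("a\\\\b")))       # escaped form
print(bool(pattern.match("a/b")))          # converted to a slash: rejected
```

The pattern still rejects a/b, which is the property the test actually cares about.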
(time_t)-1 meaning
One of the libc++ test cases was verifying the handling of negative timestamps using a value of -1 (i.e. one second before the epoch). However, this value seems to be mishandled in some of the BSD implementations, FreeBSD and NetBSD in particular. Curiously enough, other negative values work fine.
The easier side of the issue is that some functions (e.g. mktime()) use -1 as an error value. However, this can be easily fixed by inspecting errno for actual errors.
The harder side is that the kernel uses a value of -1 (called VNOVAL) internally to indicate that the timestamp is not to be changed. As a result, an attempt to update the file timestamp to one second before the epoch is silently ignored.
I have fixed the test by extending the FreeBSD workaround to NetBSD and using a different timestamp. I have also started a thread about (time_t)-1 handling on tech-kern.
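The silently ignored update can be observed with a probe like the following Python sketch. It is an illustration only, not the libc++ test; mtime_sticks is a hypothetical helper, and its result depends on the operating system and filesystem it runs on.

```python
import os
import tempfile


def mtime_sticks(seconds, directory=None):
    """Return True if setting a file's timestamps to `seconds` takes effect.

    Hypothetical probe: a False result for -1 combined with True for other
    negative values would match the behavior described above.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        try:
            # Request both atime and mtime at the given whole second.
            os.utime(path, (seconds, seconds))
        except (OSError, OverflowError):
            return False
        return os.stat(path).st_mtime_ns == seconds * 10**9
    finally:
        os.unlink(path)


for value in (-2, -1, 1_000_000_000):
    print(value, mtime_sticks(value))
```

Comparing the results for -1 and -2 separates this special case from filesystems that reject negative timestamps altogether.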
Future plans
The plans for the remainder of December include, as time permits:
finishing the upstreaming of the aforementioned patches
fixing flaky tests on NetBSD buildbot
upstreaming (and fixing if necessary) the remaining pkgsrc patches
improving NetBSD support in profiling and xray (of compiler-rt)
porting ESan/DFSan
The long-term goals include:
improving support for __float128
porting LLD to NetBSD (currently it passes all tests but does not produce working executables)
finishing LLDB port to NetBSD
porting remaining sanitizers to NetBSD