The NetBSD support update before the LLVM-8.0 branching point


December 16, 2018 posted by Kamil Rytarowski

Prepared by Michał Górny (mgorny AT gentoo.org).

I'm recently helping the NetBSD developers to improve the support for this operating system in various LLVM components. My first task in this endeavor was to fix build and test issues in as many LLVM projects as timely possible, and get them all covered by the NetBSD LLVM buildbot.

Including more projects in the continuous integration builds is important as it provides the means to timely catch regressions and new issues in NetBSD support. It is not only beneficial because it lets us find offending commits easily but also because it makes other LLVM developers aware of NetBSD porting issues, and increases the chances that the patch authors will fix their mistakes themselves.

Initial buildbot setup and issues

The buildbot setup used by NetBSD is largely based on the LLDB setup used originally by Android, published in the lldb-utils repository. For the purpose of necessary changes, I have forked it as netbsd-llvm-build and Kamil Rytarowski has updated the buildbot configuration to use our setup.

Initially, the very high memory use in GNU ld combined with high job count caused our builds to swap significantly. As a result, the builds were very slow and frequently were terminated due to no output as buildbot presumed them to hang. The fix for this problem consisted of two changes.

Firstly, I have extended the building script to periodically report that it is still active. This ensured that even during prolonged linking buildbot would receive some output and would not terminate the build prematurely.

Secondly, I have split the build task into two parts. The first part uses full ninja job count to build all static libraries. The second part runs with reduced job count to build everything else. Since most of LLVM source files are part of static libraries, this solution makes it possible to build as much as possible with full job count, while reducing it necessarily for GNU ld invocations later.

While working on this setup, we have been informed that the buildbot setup based on external scripts is a legacy design, and that it would be preferable to update it to define buildbot rules directly. However, we have agreed to defer that until our builds mature, as external scripts are more flexible and can be updated without having to request a restart of the LLVM buildbot.

The NetBSD buildbot is part of LLVM buildbot setup, and can be accessed via http://lab.llvm.org:8011/builders/lldb-amd64-ninja-netbsd8. The same machine is also used to run GDB and binutils build tests.

RPATH setup for LLVM builds

Another problem that needed solving was to fix RPATH in built executables to include /usr/pkg/lib, as necessary to find dependencies installed via pkgsrc. Normally, the LLVM build system sets RPATH itself, using a path based on $ORIGIN. However, I have been informed that NetBSD discourages the use of $ORIGIN, and appropriately I have been looking for a better solution.

Eventually, after some experimentation I have come up with the following CMake parameters:

-DCMAKE_BUILD_RPATH="${PWD}/lib;/usr/pkg/lib"
-DCMAKE_INSTALL_RPATH=/usr/pkg/lib

This explicitly disables the standard logic used by LLVM. Build-time RPATH includes the build directory explicitly as to ensure that freshly built shared libraries will be preferred at build time (e.g. when running tests) over previous pkgsrc install; this directory is afterwards removed from rpath when installing.

Building and testing more LLVM sub-projects

The effort so far was to include the following projects in LLVM buildbot runs:

  • llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool

  • clang: C/C++ compiler

  • clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)

  • lld: link editor

  • polly: loop and data-locality optimizer for LLVM

  • openmp: OpenMP runtime library

  • libunwind: unwinder library

  • libcxxabi: low-level support library for libcxx

  • libcxx: implementation of C++ standard library

  • lldb: debugger; built without test suite at the moment

Additionally, the following project was considered but it was ultimately skipped as it was not ready for wider testing yet:

  • llgo: Go compiler

My project fixes

During my work, I have been trying to upstream all the necessary changes ASAP, as to avoid creating additional local patch maintenance burden. This section provides a short list of all patches that have either been merged upstream, or are in process of waiting for review.

LLVM

Waiting for upstream review:

pkgsrc

  • z3 version bump (submitted to maintainer, waiting for reply)

NetBSD portability

During my work, I have met with a few interesting divergencies between the assumptions made by LLVM developers and the actual behavior of NetBSD. While some of them might be considered bugs, we determined it was preferable to support the current behavior in LLVM. In this section I shortly describe each of them, and indicate the path I took in making LLVM work.

unwind.h

The problem with unwind.h header is a part of bigger issue — while the unwinder API is somewhat defined as part of system ABI, there is no well-defined single implementation on most of the systems. In practice, there are multiple implementations both of the unwinding library and of its headers:

  • gcc: it implements unwinder library in libgcc; also, has its own unwind.h on Linux (but not on NetBSD)

  • clang: it has its own unwind.h (but no library)

  • 'non-GNU' libunwind: stand-alone implementation of library and headers

  • llvm-libunwind: stand-alone implementation of library and headers

  • libexecinfo: provides unwinder library and unwind.h on NetBSD

Since gcc does not provide unwind.h on NetBSD, using it to build LLVM normally results in the built-in unwinder library from GCC being combined with unwind.h installed as part of libexecinfo. However, the API defined by the latter header is type-incompatible with most of the other implementations, and caused libc++abi build to fail.

In order to resolve the build issue, we agreed to use LLVM's own unwinder implementation (llvm-libunwind) which we were building anyway, via the following CMake option:

-DLIBCXXABI_USE_LLVM_UNWINDER=ON

I have started a thread about fixing unwind.h to be more compatible.

noatime behavior

noatime is a filesystem mount option that is meant to inhibit atime updates on file accesses. This is usually done in order to avoid spurious inode writes when performing read-level operations. However, unlike the other implementations NetBSD not only disables automatic atime updates but also explicitly blocks explicit updates via utime() family of functions.

Technically, this behavior is permitted by POSIX as it permits implementation-defined behavior on updating atimes. However, a small number of LLVM tests explicitly rely on being able to set atime on a test file, and the NetBSD behavior causes them to fail. Without a way to set atime, we had to mark those tests unsupported.

I have started a thread about noatime behavior on tech-kern.

__func__ value

__func__ is defined by the standard to be an arbitrary form of function identifier. On most of the other systems, it is equal to the value of __FUNCTION__ defined by gcc, that is the undecorated function name. However, NetBSD system headers conditionally override this to __PRETTY_FUNCTION__, that is a full function prototype.

This has caused one of the LLVM tests to fail due to matching debug output. Admittedly, this was definitely a problem with the test (since __func__ can have an arbitrary value) and I have fixed it to permit the pretty function form.

Kamil Rytarowski has noted that the override is probably more accidental than expected since the header was not updated for C++11 compilers providing __func__, and started a thread about disabling it.

tar -t output

Another difference I have noted while investigating test failures was in output of tar -t (listing files inside a tarball). Curious enough, both GNU tar and libarchive use C-style escapes in the file list output. NetBSD pax/tar output the filenames raw.

The test meant to verify whether backslash in filenames is archived properly (i.e. not treated equivalent to forward slash). It failed because it expected the backslash to be escaped. I was able to fix it by permitting both forms, as the exact treatment of backslash was not relevant to the test case at hand.

I have compared different tar implementations including NetBSD pax in the article portability of tar features.

(time_t)-1 meaning

One of the libc++ test cases was verifying the handling of negative timestamps using a value of -1 (i.e. one second before the epoch). However, this value seems to be mishandled in some of the BSD implementations, FreeBSD and NetBSD in particular. Curious enough, other negative values work fine.

The easier side of the issue is that some functions (e.g. mktime()) use -1 as an error value. However, this can be easily fixed by inspecting errno for actual errors.

The harder side is that the kernel uses a value of -1 (called ENOVAL) to internally indicate that the timestamp is not to be changed. As a result, an attempt to update the file timestamp to one second before the epoch is going to be silently ignored.

I have fixed the test via extend the FreeBSD workaround to NetBSD, and using a different timestamp. I have also started a thread about (time_t)-1 handling on tech-kern.

Future plans

The plans for the remainder of December include, as time permits:

  • finishing upstream of the fore-mentioned patches

  • fixing flaky tests on NetBSD buildbot

  • upstreaming (and fixing if necessary) the remaining pkgsrc patches

  • improving NetBSD support in profiling and xray (of compiler-rt)

  • porting ESan/DFSan

The long-term goals include:

  • improving support for __float128

  • porting LLD to NetBSD (currently it passes all tests but does not produce working executables)

  • finishing LLDB port to NetBSD

  • porting remaining sanitizers to NetBSD

[0 comments]

 



Post a Comment:
  • HTML Syntax: NOT allowed