NetBSD entering 2019 with more complete LLVM support


December 30, 2018 posted by Kamil Rytarowski

Prepared by Michał Górny (mgorny AT gentoo.org).

I'm recently helping the NetBSD developers to improve the support for this operating system in various LLVM components. As you can read in my previous report, I've been focusing on fixing build and test failures for the purpose of improving the buildbot coverage.

Previously, I've resolved test failures in LLVM, Clang, LLD, libunwind, openmp and partially libc++. During the remainder of the month, I've been working on the remaining libc++ test failures, improving the NetBSD clang driver and helping Kamil Rytarowski with compiler-rt.

Locale issues in NetBSD / libc++

The remaining libc++ work focused on resolving locale issues. This consisted of two parts:

  1. Resolving incorrect assumptions in re.traits case-insensitivity translation handling: r349378 (D55746),

  2. Enabling locale support and disabling tests failing because of partial locale support in NetBSD: r349379 (D55767).

The first of the problems was related to testing the routine converting strings to common case for the purpose of case-insensitive regular expression matching. The test attempted to translate \xDA and \xFA characters (stored as char) in UTF-8 locale, and expected both of them to map to \xFA, i.e. according to Unicode Latin-1 Supplement. However, this behavior is only exhibited on OSX. Other systems, including NetBSD, FreeBSD and Linux return the character unmodified (i.e. \xDA and \xFA appropriately).

I've came to the conclusion that the most likely cause of the incompatible behavior is that both \xDA and \xFA alone do not comprise valid UTF-8 sequences. However, since the translation function can take only a single char and is therefore incapable of processing multi-byte sequences, it is unclear how it should handle the range \x80..\xFF that's normally used for multi-byte sequences or not used at all. Apparently, the OSX implementers decided to map it into \u0080..\u00FF range, i.e. treat equivalently to wchar_t, while others decided to ignore it.

After some discussion, upstream agreed on removing the two tests for now, and treating the result of this translation as undefined implementation-specific behavior. At the same time, we've left similar cases for L'\xDA' and L'\xFA' which seem to work reliably on all implementations (assuming wchar_t is using UCS-4, UCS-2 or UTF-16 encoding).

The second issue was adding libc++ target info for NetBSD which has been missing so far. This I've based on the rules for FreeBSD, modified to use libc++abi instead of libcxxrt (the former being upstream default, and the latter being abandoned external project). This also included a list of supported locales which caused a large number of additional tests to be run (if I counted correctly, 43 passing and 30 failing).

Some of the tests started failing due to limited locale support on NetBSD. Apparently, the system supports locales only for the purpose of character encoding (i.e. LC_CTYPE category), while the remaining categories are implemented as stubs (see localeconv(3)). After some thinking, we've agreed to list all locales expected by libc++ as supported since NetBSD has support files for them, and mark relevant tests as expected failures. This has the advantage that when locale support becomes more complete, the expected failures will be reported as unexpected passes, and we will know to update the tests accordingly.

Clang driver updates

On specific request of Kamil, I have looked into the NetBSD Clang driver code afterwards. My driver work focused on three separate issues:

  1. Recently added address significance tables (-faddrsig) causing crashes of binutils (r349647; D55828),

  2. Passing -D_REENTRANT when building sanitized code (with prerequisite: r349649, D55832; r349650, D55654,

  3. Establishing proper support for finding and using locally installed LLVM components, i.e. libc++, compiler-rt, etc.

The first problem was mostly a side effect of Kamil testing my z3 bump. He noticed that binutils crash on executables produced by recent versions of clang (example backtrace <http://netbsd.org/~kamil/llvm/strip.txt>). Given my earlier experiences in Gentoo, I correctly guessed that this is caused by address significance table support that is enabled by default in LLVM 7. While technically they should be ignored by tooling not supporting them, it causes verbose warnings in some versions of binutils, and explicit crashes in the 2.27 version used by NetBSD at the moment.

In order to prevent users from experiencing this, we've agreed to disable address significance table by default on NetBSD (users can still explicitly enable them via -faddrsig). What's interesting, the block for disabling LLVM_ADDRSIG has grown since to include PS4 and Gentoo which indicates that we're not the only ones considering this default a bad idea.

The second problem was that Kamil indicated that we should only support sanitizing reentrant versions of the system API. Accordingly, he requested that the driver automatically includes -D_REENTRANT when compiling code with any of the sanitizers enabled. I've implemented a new internal clang API that checks whether any of the sanitizers were enabled, and used it to implicitly pass this option from driver to the actual compiler instance.

The third problem is much more complex. It boils down to the driver using hardcoded paths from a few years back assuming LLVM being installed to /usr as path of /usr/src integration. However, those paths do not really work when clang is run from the build directory or installed manually to /usr/local; which means e.g. clang install with libc++ does not work out of the box.

Sadly, we haven't been able to come up with a really good and reliable solution to this. Even if we could make clang find other components reasonably reliably using executable-relative paths, this would require building executables with RPATHs potentially pointing to (temporary) build directory of LLVM. Eventually, I've decided to abandon the problem and focus on passing appropriate options explicitly when using just-built clang to build and test other components. For the purpose of buildbot, this required using the following option:

-DOPENMP_TEST_FLAGS="-cxx-isystem${PWD}/include/c++/v1"

compiler-rt work

As Kamil is finalizing his work on sanitizers, he left tasks in TODO lists and I picked them up to work on them.

Build fixes

My first two fixes to compiler-rt were plain build fixes, necessary to proceed further. Those were:

  1. Fixing use of variables (whose values could not be directly determined at compile time) for array length: r349645, D55811.

  2. Detecting missing libLLVMTestingSupport and skipping tests requiring it: r349899, D55891.

The first one was a trivial coding slip that was easily fixed by using a constant value for hash length. The second one was more problematic.

Two tests in XRay test suite required LLVMTestingSupport. However, this library is not installed by LLVM, and so is not present when building stand-alone. Normally, similar issues in LLVM were resolved by building the relevant library locally (that's e.g. what we do with gtest). In this case this wasn't feasible due to the library being more tightly coupled with LLVM itself, and adjusting the build system would be non-trivial. Therefore, since only two tests required this we've agreed on disabling this when the library isn't present.

XRay: alignment and MPROTECT problems

The next step was to research XRay test failures. Firstly, all the tests were failing due to PaX MPROTECT. Secondly, after disabling MPROTECT I've been getting the following test failures from check-xray:

XRay-x86_64-netbsd :: TestCases/Posix/fdr-reinit.cc
XRay-x86_64-netbsd :: TestCases/Posix/fdr-single-thread.cc

Both tests segfaulted on initializing thread-local data structure. I've came to the conclusion that somehow initializing aligned thread-local data is causing the issue, and built a simple test case for it:

struct FDRLogWriter {
  FDRLogWriter() {}
};

struct alignas(64) ThreadLocalData {
  FDRLogWriter Buffer{};
};

void f() { thread_local ThreadLocalData foo{}; }

int main() {
  f();
  return 0;
}

The really curious part of this was that the code produced by g++ worked fine, while the one produced by clang++ segfaulted. At the same time, the LLVM bytecode generated by clang worked fine on both FreeBSD and Linux. Finally, through comparing and manipulating LLVM bytecode I've found the culprit: the alignment was not respected while allocating storage for thread-local data. However, clang assumed it will be and used MOVAPS instructions that caused a segfault on unaligned data.

I wrote about the problem to tech-toolchain and received a reply that TLS alignment is mostly ignored and there is no short term plan for fixing it. As an interim solution, I wrote a patch that disables alignment tips sufficiently to prevent clang from emitting code relying on it: r350029, D56000.

As suggested by Kamil, I discussed the PaX MPROTECT issues with upstream. The direct problem in solving it the usual way is that the code relies on remapping program memory mapped by ld.so, and altering the program code in place. I've been informed that this design was chosen to keep the mechanism simple, and explicitly declaring that XRay is incompatible with system security features such as PaX MPROTECT. As Kamil suggested, I've written an explicit check for MPROTECT being enabled, and made XRay fail with explanatory error message in that case: r350030, D56049.

Improving stdio.h / FILE* interceptor coverage

My final work has been on improving stdio.h coverage in interceptors. It started as a task on adding NetBSD FILE structure support for interceptors (D56109), and expanded into increasing test and interceptor coverage, both for POSIX and NetBSD-specific stdio.h functions.

I have implemented tests for the following functons:

  • clearerr, feof, ferrno, fileno, fgetc, getc, ungetc: D56136,

  • fputc, putc, putchar, getc_unlocked, putc_unlocked, putchar_unlocked: D56152,

  • popen, pclose: D56153,

  • funopen, funopen2 (in multiple parameter variants): D56154.

Furthermore, I have implemented missing interceptors for the following functions:

  • popen, popenve, pclose: D56157,

  • funopen, funopen2: D56158.

Kamil also pointed out that devname_r interceptor has wrong return type on non-NetBSD systems, and I've fixed that for completeness: D56150.

Summary

At this point, NetBSD LLVM buildbot is building and testing the following projects with no expected test failures:

  • llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool

  • clang: C/C++ compiler

  • clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)

  • lld: link editor

  • polly: loop and data-locality optimizer for LLVM

  • openmp: OpenMP runtime library

  • libunwind: unwinder library

  • libcxxabi: low-level support library for libcxx

  • libcxx: implementation of C++ standard library

It also builds the lldb debugger but does not run its test suite.

The support for compiler-rt is progressing but it is not ready for buildbot inclusion yet.

Future plans

Next month, I'm planning to work on next items from the TODO. Most notably, this includes:

  • building compiler-rt on buildbot and running the tests,

  • porting LLD to actually produce working executables on NetBSD,

  • porting remaining compiler-rt components: DFSan, ESan, LSan, shadowcallstack.

[0 comments]

 



Post a Comment:
Comments are closed for this entry.