Clang build bot now uses two-stage builds, and other LLVM/LLDB news


December 12, 2019 posted by Michał Górny

Upstream describes LLDB as a next generation, high-performance debugger. It is built on top of the LLVM/Clang toolchain and features great integration with it. At the moment, it primarily supports debugging C, C++ and ObjC code, and there is interest in extending it to more languages.

In February, I started working on LLDB, as contracted by the NetBSD Foundation. So far I've been working on reenabling continuous integration, squashing bugs, improving NetBSD core file support, extending NetBSD's ptrace interface to cover more register types and fix compat32 issues, and fixing watchpoint support. In October 2019, I finished my work on threading support (pending pushes) and fought issues related to the upgrade to NetBSD 9.

November was focused on finally pushing the aforementioned patches and on major buildbot changes. Notably, I was working on extending the test runs to compiler-rt, which required revisiting past driver issues as well as resolving new ones. More details on this below.

LLDB changes

Test updates, minor fixes

The previous month has left us with a few regressions caused by the kernel upgrade. I've fixed those that I could figure out reasonably quickly; for the remaining ones, Kamil suggested that I mark them XFAIL for now and revisit them later while addressing broken tests. This is what I did.

While implementing additional tests in the threading patches, I've discovered that the subset of LLDB tests dedicated to testing lldb-server behavior was disabled on NetBSD. I've reenabled lldb-server tests and marked failing tests appropriately.

After enabling and fixing those tests, I've implemented the missing support for getting thread names in the NetBSD plugin.

I've also switched our process plugin to use the newer PT_STOP request over calling kill(). The main advantage of PT_STOP is that it reliably notifies about SIGSTOP via wait() even if the process is stopped already.

I've been able to reenable the EOF detection test that was previously disabled due to bugs in old versions of the NetBSD 8 kernel.

Threading support pushed

After satisfying the last upstream requests, I was able to merge the three threading support patches:

  1. basic threading support,

  2. watchpoint support in threaded programs,

  3. concurrent watchpoint fixes.

This fixed 43 tests. It also triggered some flaky tests and a known regression, and I'm planning to address them as part of the final bug squashing.

Build bot redesign

Recap of the problems

The tests of clang runtime components (compiler-rt, openmp) are performed using freshly built clang. This version of clang attempts to build and link C++ programs with libc++. However, our clang driver naturally requires a system installation of libc++. After all, we normally don't want the driver to include temporary build paths for regular executables! For this reason, building against the fresh libc++ in the build tree requires appropriate -cxx-isystem, -L and -Wl,-rpath flags.

So far, we have managed to resolve this by using existing mechanisms to add additional flags to the test compiler calls. However, the existing solutions do not seem to suffice for compiler-rt. While technically I could work on adding more support code for that, I've decided it's better to look for a more general and permanent solution.

Two-stage builds

As part of the solution, I've proposed switching our build bot to a two-stage build model. That is, we first use the system GCC version to build a minimal functioning clang. Then, we use this newly built clang to build the whole LLVM suite, including another copy of clang.
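
The two stages could be sketched roughly as follows. This is only an illustration and not the build bot's actual configuration: the directory names, the project lists and the helper names (run, stage1, stage2) are assumptions made up for the example, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of a two-stage LLVM build. All directory names and the
# project lists are illustrative assumptions, not the build bot's real setup.

SRC=${SRC:-llvm-project/llvm}
BUILD=${BUILD:-$PWD/build}

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stage 1: the system GCC builds a minimal clang.
stage1() {
    run cmake -S "$SRC" -B "$BUILD/stage1" -G Ninja \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_C_COMPILER=gcc \
        -DCMAKE_CXX_COMPILER=g++ \
        -DLLVM_ENABLE_PROJECTS=clang &&
    run ninja -C "$BUILD/stage1"
}

# Stage 2: the stage 1 clang builds the larger suite, including clang again.
stage2() {
    run cmake -S "$SRC" -B "$BUILD/stage2" -G Ninja \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_C_COMPILER="$BUILD/stage1/bin/clang" \
        -DCMAKE_CXX_COMPILER="$BUILD/stage1/bin/clang++" \
        "-DLLVM_ENABLE_PROJECTS=clang;lld;lldb;compiler-rt;libcxx;libcxxabi;libunwind;openmp" &&
    run ninja -C "$BUILD/stage2"
}
```

Calling stage1 followed by stage2 prints (or, with DRY_RUN=0, runs) the configure and build commands for both stages. The only link between the stages is that the second one points CMAKE_C_COMPILER and CMAKE_CXX_COMPILER at the binaries produced by the first.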

The main advantage of this model is that we're verifying whether clang is capable of building a working copy of itself. Additionally, it insulates us against problems with host GCC. For example, we've experienced issues with GCC 8 and the default -O3. On the negative side, it increases build time significantly, especially since the second stage needs to be rebuilt from scratch every time.

A common practice in the compiler world is to actually do three stages. In this case, it would mean building a minimal clang with the host compiler, then the second stage with the first-stage clang, then the third stage using the second stage's clang. This would have the additional benefit of verifying that clang is capable of building a compiler that's fully capable of building itself. However, this seems to have little actual gain for us while it would increase the build time even more.

Compiler wrappers

Another interesting side effect of using the two-stage build model is that it provides an opportunity to inject wrappers over the clang and clang++ built in the first stage. Those wrappers allow us to add the necessary -I, -L and -Wl,-rpath arguments without having to patch the driver for this special case.
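
A minimal sketch of what such a wrapper could look like is shown below. It is not the script used on the build bot. The variable names REAL_CLANG and LIBCXX_PREFIX, the function name run_wrapped, the placeholder paths and the include/c++/v1 header location are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical compiler wrapper: forwards all arguments to the stage 1 clang
# and adds search paths for the libc++ living in the build tree.
# REAL_CLANG and LIBCXX_PREFIX are placeholder paths (assumptions).

REAL_CLANG=${REAL_CLANG:-/path/to/stage1/bin/clang}
LIBCXX_PREFIX=${LIBCXX_PREFIX:-/path/to/stage1}

run_wrapped() {
    # exec replaces the wrapper process, so the compiler's exit status
    # is passed through unchanged.
    exec "$REAL_CLANG" \
        -I"$LIBCXX_PREFIX/include/c++/v1" \
        -L"$LIBCXX_PREFIX/lib" \
        -Wl,-rpath,"$LIBCXX_PREFIX/lib" \
        "$@"
}
```

A real wrapper script would end with a call to run_wrapped "$@", and a second copy pointing at clang++ would cover C++ compilation. Placing the user's arguments last means the injected paths are searched before any that the caller adds.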

Furthermore, I've used this opportunity to add experimental LLD usage to the first stage, and use it instead of GNU ld for the second stage. The LLVM linker has a significantly smaller memory footprint and therefore allows us to improve build efficiency. Sadly, proper LLD support for NetBSD still depends on patches that are waiting for upstream review.

Compiler-rt status and tests

The builds of compiler-rt have been reenabled for the build bot. I am planning to start enabling individual test groups (e.g. builtins, ASAN, MSAN, etc.) as I get them to work. However, there are still other problems to be resolved before that happens.

Firstly, there are new test regressions. Some of them seem to be specifically related to build layout changes, or to use of LLD as linker. I am currently investigating them.

Secondly, compiler-rt tests aim to test all supported multilib targets by default. We are currently preparing to enable compat32 in the kernel on the host running the build bot and thereby achieve proper multilib support for running them.

Thirdly, ASAN, MSAN and TSAN are incompatible with ASLR (address space layout randomization), which is enabled by default on NetBSD. Furthermore, XRay is incompatible with the W^X restriction.

Making tests work with PaX features

Previously, we've already addressed the ASLR incompatibility by adding an explicit check for it and bailing out if it's enabled. However, while this somewhat resolves the problem for regular users, it means that the relevant tests can't be run on hosts that have ASLR enabled.

Kamil suggested that we should use paxctl to disable ASLR per-executable here. This has the obvious advantage that it enables the tests to work on all hosts. However, it required injecting the paxctl invocation between the build and run steps in the relevant tests.

The ‘obvious’ solution to this problem would be to add a kind of %paxctl_aslr substitution that evaluates to a paxctl call on NetBSD, and to : (no-op) on other systems. However, this would require updating all the relevant tests and making sure that the invocation keeps being included in new tests.

Instead, I've noticed that the %run substitution is already using various kinds of wrappers for other targets, e.g. to run tests via an emulator. I went for a more agreeable solution of substituting %run in appropriate test suites with a tiny wrapper calling paxctl before executing the test.
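
Such a wrapper can be very small. The sketch below shows the idea and is not necessarily identical to the script that was committed. The function name run_noaslr is made up for the example, and the PAXCTL variable exists only so that the sketch can be tried on a host without paxctl. On NetBSD, paxctl +a sets the flag that explicitly disables ASLR for the given executable.

```shell
#!/bin/sh
# Hypothetical %run wrapper: disable ASLR on the test executable,
# then run it with its original arguments.
# PAXCTL is overridable purely for experimenting (an assumption of this sketch).

PAXCTL=${PAXCTL:-/usr/sbin/paxctl}

run_noaslr() {
    # $1 is the freshly built test binary; +a marks it as "ASLR disabled".
    "$PAXCTL" +a "$1" || return 1
    # Replace the wrapper with the test itself so its exit status is preserved.
    exec "$@"
}
```

A real wrapper script would end with run_noaslr "$@", and the test suite's configuration would set %run to that script on NetBSD. Because the tests already prefix their run step with %run, none of them need to change.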

Clang/LLD dependent libraries feature

Introduction to the feature

Enabling two-stage builds also had another side effect. Since the stage 2 build is done via clang+LLD, a newly added feature of dependent libraries got enabled and broke our build.

Dependent libraries are a feature permitting source files to specify additional libraries that are afterwards injected into the linker's invocation. This is done via a #pragma originally used by MSVC. Consider the following example:

#include <stdio.h>
#include <math.h>
#pragma comment(lib, "m")

int main() {
    printf("%f\n", pow(2, 4.3));
    return 0;
}

When the source file is compiled using Clang on an ELF target, the lib comments are converted into a .deplibs object section:

$ llvm-readobj -a --section-data test.o
[...]
  Section {
    Index: 6
    Name: .deplibs (25)
    Type: SHT_LLVM_DEPENDENT_LIBRARIES (0x6FFF4C04)
    Flags [ (0x30)
      SHF_MERGE (0x10)
      SHF_STRINGS (0x20)
    ]
    Address: 0x0
    Offset: 0x94
    Size: 2
    Link: 0
    Info: 0
    AddressAlignment: 1
    EntrySize: 1
    SectionData (
      0000: 6D00                                 |m.|
    )
  }
[...]

When the objects are linked into a final executable using LLD, it collects all libraries from .deplibs sections and links to the specified libraries.

The example program pasted above would have to be built on systems requiring explicit -lm (e.g. Linux) via:

$(CC) ... test.c -lm

However, when using Clang+LLD, it is sufficient to call:

clang -fuse-ld=lld ... test.c

and the library is included automatically. Of course, this normally makes little sense because you have to maintain compatibility with other compilers and linkers, as well as old versions of Clang and LLD.

Use in LLVM to approach the static library dependency problem

LLVM started using the deplibs feature internally in D62090 in order to specify linkage between runtimes and their dependent libraries. Apparently, the goal was to provide an in-house solution to the static library dependency problem.

The problem discussed is that static libraries on Unix-derived platforms are primitive archives containing object files. Unlike shared libraries, they do not contain lists of other libraries they depend on. As a result, when linking against a static library, the user needs to explicitly pass all the dependent libraries to the linker invocation.

Over the years, a number of workarounds were proposed to relieve the user (or build system) from having to know the exact dependencies of the static libraries used. A few worth noting include:

  • libtool archives (.la) used by libtool as generic wrappers over shared and static libraries,

  • library-specific *-config programs and pkg-config files, providing options for build systems to utilize,

  • GNU ld scripts that can be used in place of libraries to alter linker's behavior.

The first two solutions work at the build system level, and are therefore portable to different compilers and linkers. The third one requires linker support but has been used successfully to some degree due to the wide deployment of GNU binutils, as well as support in other linkers (e.g. LLD).

Dependent libraries are yet another attempt to solve the same problem. Unlike the approaches listed above, this one is practically transparent to the static library format, at the cost of requiring both compiler and linker support. However, since the runtimes are normally supposed to be used by Clang itself, at least the first of these requirements (compiler support) can normally be assumed to be satisfied.

Why did it break NetBSD?

After this lengthy introduction, let's get to the point. As a result of my changes, the second stage is now built using Clang/LLD. However, it seems that the original change making use of deplibs in runtimes was tested only on Linux, and it caused failures for us since it implicitly appended libraries not present on NetBSD.

Over time, users of a few other systems have added various #ifdefs in order to exclude Linux-specific libraries on their systems. However, this solution is hardly optimal. It requires us to maintain two disjoint sets of rules for adding each library: one in CMake for linking of shared libraries, and another one in the source files for emitting dependent libraries.

Since dependent library pragmas are present only in source files and not headers, I went for a different approach. Instead of using a second set of rules to decide which libraries to link, I've exported the results of CMake checks into -D flags, and made the dependent library pragmas conditional on the CMake check results.

Firstly, I've fixed deplibs in libunwind in order to fix builds on NetBSD. Afterwards, per upstream's request I've extended the deplibs fix to libc++ and libc++abi.

Future plans

I am currently still working on fixing regressions after the switch to two-stage build. As things develop, I am also planning to enable further test suites there.

Furthermore, I am planning to continue with the items from the original LLDB plan. Those are:

  1. Add support to backtrace through signal trampoline and extend the support to libexecinfo, unwind implementations (LLVM, nongnu). Examine adding CFI support to interfaces that need it to provide more stable backtraces (both kernel and userland).

  2. Add support for i386 and aarch64 targets.

  3. Stabilize LLDB and address breaking tests from the test suite.

  4. Merge LLDB with the base system (under LLVM-style distribution).

This work is sponsored by The NetBSD Foundation

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

https://netbsd.org/donations/#how-to-donate

Comments:

When is support for i386 planned?

Posted by Piotr on January 11, 2020 at 08:36 PM UTC #
