LLVM asan and ubsan on NetBSD


July 03, 2017 posted by Kamil Rytarowski

Over the last 30 days I focused on preparing the environment needed to enable the LLVM sanitizers and the Clang compiler on NetBSD. Meanwhile I pushed forward generic parts around pkgsrc and LLVM that needed enhancements, to ease the future LLDB work.

dogfood

I realized that in order to work on the LLVM sanitizers I need to use Clang as the compiler. Part of compiler-rt (the low-level LLVM compiler runtime library) has code that is incompatible with GCC, mainly around intrinsics for atomic functions. I researched how much work would be needed to port it to GCC. It turned out to be non-trivial, and I filed a bug on the LLVM bugzilla.

These circumstances made me switch to Clang as the pkgsrc toolchain. I used it to test the compilation of small batches of packages and to record build and compiler problems. To save time, I used ccache as my build cache.

My options in mk.conf(5):

PKGSRC_COMPILER=        ccache clang
CCACHE_BASE=            /usr/local
CCACHE_DIR=             /public/ccache_tmp
CCACHE_LOGFILE=         /tmp/ccache.txt
PKG_CC=                 clang
PKG_CXX=                clang++
CLANGBASE=              /usr/local
HAVE_LLVM=              yes

It's worth noting that ccache with pkgsrc won't check $HOME/.ccache for configuration; the configuration must be placed in $CCACHE_DIR/ccache.conf.

The documented problems can be summarized as:

  • Broken ccache in pkgsrc for C++11 packages.
    The pkgsrc framework started supporting C++ language standards in the USE_LANGUAGES definition. Packages with the newly added USE_LANGUAGES values (such as c++11) were not compiled with ccache because ccache.mk was not yet aware of such values, so any package that set them silently lost ccache support. I've corrected this and introduced a new option, CCACHE_LOGFILE, to make it easier to track the execution of ccache and detect errors.
  • Broken ccache in pkgsrc for a custom toolchain.
    ccache finds the real compiler by looking for it in $PATH. When I built clang within pkgsrc to work on it (installed into /usr/pkg) while my main toolchain was in /usr/local, ccache picked the compiler from /usr/pkg, which resulted in cache misses for new builds. I have committed a fix that passes a ccache-specific PATH so that the appropriate compiler is always found.
  • Header <execinfo.h> cannot be included on Clang 5.0.0svn.
    For some reason compilers tend to install their own headers that shadow the system headers. This resulted in build failures in programs that include the plain <execinfo.h> header (for the backtrace(3) function). This system header used to include <stddef.h>, which in turn included our <sys/cdefs.h>, but with Clang 5.0.0svn <stddef.h> was shadowed by the compiler's own copy (from $PREFIX/lib/clang/5.0.0/include/stddef.h). Christos Zoulas fixed it by making <execinfo.h> standalone and independent from standard libc headers.
  • __float128 and GNU libstdc++.
    Our base system GNU libstdc++ enables __float128 on the i386, amd64 and ia64 ports. As of now the LLVM equivalent library contains only partial support for this type. As a result, third-party programs built with Clang + libstdc++ detect __float128 support and then break, because the compiler does not define the appropriate global define __FLOAT128__. How to solve this for NetBSD is still open for discussion.
  • gforth optimization problems.
    Upstream gforth developers ported this FORTH compiler to Clang and triggered an optimization issue in which the compiler needlessly attempts to solve a complex internal problem. This results in compilation times of several minutes on a modern CPU instead of getting the results immediately. The problem had already been reported on the LLVM bugzilla, and I have filed a report that it is still valid.
  • bochs can be built with clang.
    A while ago, bochs was buildable only with GCC and the Clang toolchain was blacklisted. I have verified that this is no longer the case and unmasked the package for compilers other than GCC.

LLVM and Clang testsuites

I have prepared the Clang and LLVM testsuites to execute on NetBSD. Correctness of both projects is crucial for LLDB and the LLVM sanitizers to work, because their bugs resurface as problems inside the programs that depend on them. Originally I corrected the tests with local patches to build with GCC, and later switched to Clang. I have restructured the packages in pkgsrc-wip in order to execute the testsuites. I have fixed 20 test failures in LLVM by implementing AllocateRWX and ReleaseRWX for the NetBSD flavor of PaX MPROTECT. There are still over 200 failures to solve!

It's worth noting that the googletest library (used in a modified version in LLVM and in a regular one in Clang) finally accepted the NetBSD patches.

LLVM asan and ubsan

I expect to get four LLVM sanitizers working in order to move on to LLDB: asan (address sanitizer), ubsan (undefined behavior sanitizer), tsan (thread sanitizer) and msan (memory sanitizer). The other ones, like dfsan (data-flow sanitizer) or lsan (leak sanitizer), are to be skipped for now. In general, sanitizers are part of the LLDB functionality that I want to bring to NetBSD, as there are plugins to integrate them within the debugger. In the current state I need them to debug bugs inside LLDB/NetBSD.

The original work on sanitizers in GCC (with libsanitizer) was done by Christos Zoulas. GCC libsanitizer is a close sibling of compiler-rt/lib from the LLVM project. I picked up his work, integrated it into compiler-rt, developed the rest (code differences, bug fixes, Clang/LLVM specific parts in llvm/ and clang/), and managed to get asan and ubsan to work.

Users should pick up pkgsrc-wip at revision 3e7c52b97b4d6cb8ea69a081409ac818c812c34a and install wip/{llvm,clang,compiler-rt}-netbsd. Clang will then be ready for use:

/usr/pkg/bin/clang -fsanitize=undefined test.c
and
/usr/pkg/bin/clang -fsanitize=address test.c

Additional compiler options that may improve the experience:

-g -O0 -fno-omit-frame-pointer -pie -fPIE
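
For readers who want something to feed to the commands above, here is a minimal sketch of two deliberately fragile functions that could go into a test.c. The function names are hypothetical and not part of any NetBSD or LLVM code. Both behave correctly for ordinary inputs and only misbehave for the inputs named in the comments.

```c
#include <assert.h>
#include <stdlib.h>

/* Adds 1 to x. When x == INT_MAX the addition overflows a signed int,
 * which is undefined behavior that -fsanitize=undefined is meant to
 * report at run time. */
int increment(int x)
{
    return x + 1;
}

/* Allocates `count` ints set to 1, then sums the first `read` of them.
 * With read > count the second loop reads past the end of the heap
 * block, which -fsanitize=address is meant to report. */
int sum_ones(size_t count, size_t read)
{
    int *values = malloc(count * sizeof(*values));
    int total = 0;

    assert(values != NULL);
    for (size_t i = 0; i < count; i++)
        values[i] = 1;
    for (size_t i = 0; i < read; i++)
        total += values[i];
    free(values);
    return total;
}
```

To try it, add a main() that calls increment(INT_MAX) (with <limits.h> included) or sum_ones(4, 5). Built with the matching -fsanitize option, the first call should produce a signed integer overflow report and the second a heap-buffer-overflow report.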

These sanitizers are not production ready and are under active development.

Plan for the next milestone

Roadmap for the next month:

  • Finish and send upstream LLVM asan and ubsan support.
  • Correct more problems triggered by LLVM and Clang test-suites.
  • Resume msan and tsan porting.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL and chipping in what you can:

http://netbsd.org/donations/#how-to-donate

 



Comments:

Quite helpfull indeed, thanks a lot for this post guys !

Posted by Bica100 on August 30, 2017 at 06:22 AM UTC #
