MKSANITIZER - bug detector software integration with the NetBSD userland

July 02, 2018 posted by Kamil Rytarowski

I've finished the integration of sanitizers with the distribution build framework. A bootable and installable distribution is now available, verified with Address Sanitizer, with Undefined Behavior Sanitizer, or with both concurrently. A few dozen bugs were detected and the majority of them addressed.

LLVM sanitizers are compiler features that help find common software bugs. The following sanitizers are available:

  • TSan: Finds threading bugs,
  • MSan: Finds uninitialized memory read,
  • ASan: Finds invalid address usage bugs,
  • UBSan: Finds unspecified code semantics in runtime.

The new MKSANITIZER option supports full coverage of the NetBSD code base with these sanitizers, which helps reduce bugs and serve high security demands.

A brief overview of MKSANITIZER

A sanitizer is a special type of addition to a compiled program, and is included from a toolchain (LLVM or GCC). There are a few types of sanitizers. Their usual purposes are: bug detecting, profiling, and security hardening.

NetBSD already supports the most useful ones with a decent completeness:

  • Address Sanitizer (ASan, memory usage bug detector),
  • Undefined Behavior Sanitizer (UBSan, unspecified semantics in runtime detector),
  • Thread Sanitizer (TSan, data race detector), and
  • Memory Sanitizer (MSan, uninitialized memory read detector).

It's possible to combine compatible sanitizers in a single application; NetBSD and MKSANITIZER support doing so.

There are various advantages and limitations. Properties and requirements vary, mainly reflecting the type of sanitization. Comparisons against other software with similar properties (such as Valgrind) may provide a fuller picture.

Sanitizers usually introduce a relatively small overhead (~2x) compared to Valgrind (~20x). The portability is decent as the sanitizers don't depend heavily on the underlying CPU architecture, and in the UBSan case they basically work on everything including VAX. In the Valgrind case the portability is extremely dependent on the kernel and CPU, thus making this diagnostic tool very difficult to port across platforms. ASan, MSan and TSan require large addressable memory due to their design. This restricts MSan and TSan to 64-bit architectures with a lot of RAM, with ASan for ones that cover completely all of the 4GB (32-bit) address space (it's still possible to use small resources with ASan but it's a tradeoff between usability, time investment, and gain). Although the memory usage is higher with sanitized programs, the modern design and implementation of the memory management subystem in the NetBSD kernel allows to manage it lazily and regardless of reserving TBs of buffers for metadata, the physically used memory is significantly lower usually doubling the regular memory usage by a process. Memory demands are higher for processes that are in the process of fuzzing and thus there is an option to restrict the maximum number of used physical pages that will cause the program to halt (by default 2GB for libFuzzer). A selection of LLVM Sanitizers may conflict with some tools (like Valgrind) and mechanisms (like PaX ASLR in the ASan, TSan and MSan case). Other ones like PaX MPROTECT (sometimes called W^X) are fully compatible with all the currently supported sanitizers.

The main purposes of sanitizations from a user point of view are:

  • bug detecting and assuring correctness,
  • high security demands, and
  • auxiliary feature for fuzzing.

It's worth adding a few notes on the security part as there are numerous good security approaches. One of them is proactive secure coding that is a regime of using safe constructs in the source code and replacement of functions that are prone to errors with versions that are harder to misuse.

However the disadvantage of this approach is that it's just a regime in the coding period. The probability of introducing a bug is minimized, however it does still exist. A problem that is in a program of either style (proactive secure style and careless coding) are almost indistinguishable in the final product and an attacker can use the same methods to violate the program like integer overflow or use after free.

The usual way to prevent bugs is to assume that a code is buggy and add mitigation that will aim to reduce the chance to exploit it. An example of this is the sandboxing of an application.

A code that is aided with sanitizers can be configured, either at build-time or run-time, to report the bug in the execution time of e.g. integer overflow and cause an application to halt immediately. No coding regime can have the same effect and perhaps the number of programming languages with this property is also limited.

In order to use sanitizers effectively within a distribution there is need to rebuild a program and all of its dependencies (with few exceptions) with the same sanitizing configuration. Furthermore, in order to use some versions of fuzzing engines with some types of sanitizers we need to build the fuzzing libraries with the same sanitization as well (this is true for e.g. Memory Sanitizer used together with libFuzzer).

This was my primary motivation towards introduction of a new NetBSD distribution build option: MKSANITIZER.

NetBSD is probably the only distribution that ships with a fully sanitized distribution option. Today there is "just" need for a locally patched external LLVM toolchain and the work on this is still ongoing.

The whole userland sanitization skips not applicable exceptions:

  • low-level libc libraries crt0, crtbegin, crtend, crti, crtn etc,
  • libc,
  • libm,
  • librt,
  • libpthread,
  • bootloader,
  • crunchgen programs like rescue,
  • dynamic ELF loader (implemented as a library),
  • as of today static libraries and executables,
  • as of today as an exception ldd(1) that borrows parts from the dynamic ELF loader.
The selection of unsanitized base libraries like libc is the design choice of sanitizers that a part of the base code is unsanitized and sanitizers install interceptors for their public symbols. Sanitizers expect to use their API from high level, their features and so prevent recursive sanitization (although this happens sometimes in narrow cases). A good illustration of this design choice is the process of sanitization of users of the threading library. Sanitizers and TSan in particular register interceptors for the public symbols of libpthread and treat it mostly as a black box (there are few exceptions). As an alternative with a fully sanitized libpthread, there would need to be fully OS dependent implementation of each feature in sanitizers based on the selection of kernel features, handle relatively opaque syscalls, CPU specific differences in the implementation etc... and in the end it would be very difficult without the full reimplementation of libpthread to handle operations like pthread_join(3).

The sanitization of static programs as of today is a low priority and falls outside the scope of my work.

The situation with ldd(1) will be cleared in future and it will be most probably sanitized.

Kernel and kernel modules use a different version of sanitizers and the porting process of Kernel-AddressSanitizer and Kernel-UndefinedBehaviorSanitizer is ongoing out of the MKSANITIZER context.

There used to be an analogous attempt in the Gentoo land (asantoo), however these efforts stalled two years ago. The Google Chromium team uses a set of scripts to bootstrap sanitized dependencies for their programs on top of a Linux distribution (as of today Ubuntu Trusty x86_64).

I've started to document bugs detected with MKSANITIZER in a dedicated directory on my NetBSD homepage with my code and notes. So far there are 35 documented findings. Most of them are real problems in programs, some of them might be considered overcautious (mostly ones detected with UBSan) and probably all of them are without serious security risk or privilege escalation or system crash. Some of the findings (0029-0035 - MemorySanitizer userland one) contain problems located probably in sanitizers (the proper NetBSD support in them).

This list presents that some of the problems are located in formally externally-maintained software like tmux, heimdal, grep, nvi or nawk.

I think that the following patch is a good example of a good finding for a privileged (setuid) program passwd(1) that reads a vector out of bounds and write a null character into a random byte on the stack (documented as report 0024).

From 28dd358940af30f434a930fd1977e3bf2b69dcb1 Mon Sep 17 00:00:00 2001
From: kamil 
Date: Sun, 24 Jun 2018 01:53:14 +0000
Subject: [PATCH] Prevent underflow buffer read in trim_whitespace() in

If a string is empty or contains only white characters, the algorithm of
removal of white characters at the end of the passed string will read
buffer at index -1 and keep iterating backward.

Detected with MKSANITIZER/ASan when executing passwd(1).
 lib/libutil/passwd.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/libutil/passwd.c b/lib/libutil/passwd.c
index 9cc1d481a349..cee168e7d678 100644
--- a/lib/libutil/passwd.c
+++ b/lib/libutil/passwd.c
@@ -1,4 +1,4 @@
-/*	$NetBSD: passwd.c,v 1.52 2012/06/25 22:32:47 abs Exp $	*/
+/*	$NetBSD: passwd.c,v 1.53 2018/06/24 01:53:14 kamil Exp $	*/
  * Copyright (c) 1987, 1993, 1994, 1995
@@ -31,7 +31,7 @@
 #if defined(LIBC_SCCS) && !defined(lint)
-__RCSID("$NetBSD: passwd.c,v 1.52 2012/06/25 22:32:47 abs Exp $");
+__RCSID("$NetBSD: passwd.c,v 1.53 2018/06/24 01:53:14 kamil Exp $");
 #endif /* LIBC_SCCS and not lint */
@@ -503,13 +503,21 @@ trim_whitespace(char *line)
 	_DIAGASSERT(line != NULL);
+	/* Handle empty string */
+	if (*line == '\0')
+		return;
 	/* Remove leading spaces */
 	p = line;
 	while (isspace((unsigned char) *p))
 	memmove(line, p, strlen(p) + 1);
-	/* Remove trailing spaces */
+	/* Handle empty string after removal of whitespace characters */
+	if (*line == '\0')
+		return;
+	/* Remove trailing spaces, line must not be empty string here */
 	p = line + strlen(line) - 1;
 	while (isspace((unsigned char) *p))

The first boot of a MKSANITIZER distribution with Address Sanitizer

The process of getting a bootable and installable (and ignoring the aspect of buildable and generatable) installation ISO image was a loop of fixing bugs and retrying the process. At the end of the process there is an option to install a fully sanitized userland with ASan, UBSan or both. The MSan version is scheduled after finishing the kernel ptrace(2) work. Other options like a target prebuilt with ThreadSanitizer, safestack or The Scudo Hardened Allocator are untested.

I have also documented an example of the Heimdal bug that appeared during the login attempt (and actually preventing it) to a fully ASanitized userland:

This particular issue has been fixed with the following patch:

From ddc98829a64357ad73af0d0fa60c8d9c8499cce3 Mon Sep 17 00:00:00 2001
From: kamil 
Date: Sat, 16 Jun 2018 18:51:36 +0000
Subject: [PATCH] Do not reference buffer after the code scope {}

rk_getpwuid_r() returns a pointer pwd->pw_dir to a buffer pwbuf[].

It's not safe to store another a copy of pwd->pw_dir in outter scope and
use it out of the scope where there exists pwbuf[].

This fixes a problem reported by ASan under MKSANITIZER.
 crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c b/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c
index 47cb4481962e..6af30502ed5e 100644
--- a/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c
+++ b/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c
@@ -1,4 +1,4 @@
-/*	$NetBSD: config_file.c,v 1.3 2017/09/08 15:29:43 christos Exp $	*/
+/*	$NetBSD: config_file.c,v 1.4 2018/06/16 18:51:36 kamil Exp $	*/
  * Copyright (c) 1997 - 2004 Kungliga Tekniska Hogskolan
@@ -430,6 +430,8 @@ krb5_config_parse_file_multi (krb5_context context,
     if (ISTILDE(fname[0]) && ISPATHSEP(fname[1])) {
 	const char *home = NULL;
+	struct passwd pw, *pwd = NULL;
+	char pwbuf[2048];
 	if (!_krb5_homedir_access(context)) {
 	    krb5_set_error_message(context, EPERM,
@@ -441,9 +443,6 @@ krb5_config_parse_file_multi (krb5_context context,
 	    home = getenv("HOME");
 	if (home == NULL) {
-	    struct passwd pw, *pwd = NULL;
-	    char pwbuf[2048];
 	    if (rk_getpwuid_r(getuid(), &pw, pwbuf, sizeof(pwbuf), &pwd) == 0)
 		home = pwd->pw_dir;

Sending this patch upstream is on my TODO list, this means that other projects can benefit from this work. A single patch preventing NULL pointer arithmetic for tmux has been already submitted upstream and merged.

After the process of long run of booting newer versions of locally patched distribution I've finally entered the functional shell.

And a stored "copy-pasted" terminal screenshot after login into a shell:

also known as NetBSD-current.  It is very possible that it has serious bugs,
regressions, broken features or other problems.  Please bear this in mind
and use the system with care.

You are encouraged to test this version as thoroughly as possible.  Should you
encounter any problem, please report it back to the development team using the
send-pr(1) utility (requires a working MTA).  If yours is not properly set up,
use the web interface at:

Thank you for helping us test and improve NetBSD.

We recommend that you create a non-root account and use su(1) for root access.
qemu# uname -a
NetBSD qemu 8.99.19 NetBSD 8.99.19 (GENERIC) #12: Sat Jun 16 02:39:37 CEST 2018
 root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64
qemu# nm /bin/ksh |grep asan|grep init
0000000000439bf8 B _ZN6__asan11asan_initedE
0000000000439bfc B _ZN6__asan20asan_init_is_runningE
00000000004387a1 b _ZN6__asanL14tsd_key_initedE
0000000000430f18 b _ZN6__asanL20dynamic_init_globalsE
000000000043a190 b _ZZN6__asan18asanThreadRegistryEvE11initialized
00000000000cfaf0 T __asan_after_dynamic_init
00000000000cf8a0 T __asan_before_dynamic_init
0000000000199b50 T __asan_init

The sshd(8) crash has been fixed by Christos Zoulas. There are still at least 2 ASan unfixed bugs left in the installer and few ones that prevent booting and using the distribution without noting that the sanitizers are enabled. The most notorious ones are ssh(1) & sshd(8) startup breakage and egrep(1) misbehavior in corner cases, both might be false positives and bugs in the sanitizers.

Validation of the MKSANITIZER=yes distribution

I've managed to execute the ATF regression tests against a sanitized distribution prebuilt with Address Sanitizer and in another attempt against Undefined Behavior Sanitizer.

In my setup of the external toolchain I had broken C++ runtime library caused with a complicated bootstrap chain. The process of building various LLVM projects from a GCC distribution requires generic work with the LLVM projects and there is need to build and reuse intermediate steps. For example, the compiler-rt project that contains various low-level libraries (including sanitizers) requires Clang as the compiler, as otherwise it's not buildable. This is the reason why I've deferred testing all the features in the current stage and I'm trying to coordinate with the maintainer Joerg Sonnenberger the process of upgrading the LLVM projects in the NetBSD distribution. I will reuse it to rebase the patches of mine and ship a readme text to users and other developers expecting to run a release with the MKSANITIZER option.

The lack of C++ runtime pushed me towards reusing non-sanitized ATF tests (as the ATF framework is written in C++) against the sanitized userland. Two bugs have been detected:

  • expr(1) triggering Undefined Behavior in the routines detecting overflow in arithmetic operations,
  • sh(1) use after free in corner case of redefining an active function.

I've addressed the expr(1) issues and added new ATF tests in order to catch regressions in future potential changes. The Almquist Shell bug has been reported to the maintainer K. Robert Elz and fixed accordingly.

libFuzzer integration with the userland programs

During the Google Summer of Code project: libFuzzer integration with the basesystem by Yang Zheng it has been detected that the original expr(1) fix introduced by myself is not fully correct.

Yang Zheng has detected that the new version of expr(1) is still crashing in narrow cases. I've checked his integration patch of expr(1) with libFuzzer, reproduced the problem myself and documented:

$ ./expr -only_ascii=1 -max_len=32 -dict=expr-dict expr_corpus/ 1>/dev/null 
Dictionary: 12 entries
INFO: Seed: 2332047193
INFO: Loaded 1 modules   (725 inline 8-bit counters): 725 [0x7a11f0, 0x7a14c5), 
INFO: Loaded 1 PC tables (725 PCs): 725 [0x579d18,0x57ca68), 
INFO:      269 files found in expr_corpus/
INFO: seed corpus: files: 269 min: 1b max: 31b total: 3629b rss: 29Mb
expr.y:377:12: runtime error: signed integer overflow: 9223172036854775807 * -3 cannot be represented in type 'long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior expr.y:377:12 in 
MS: 0 ; base unit: 0000000000000000000000000000000000000000
9223172036854775807 * -3
artifact_prefix='./'; Test unit written to ./crash-9c3dd31298882557484a14ce0261e7bfd38e882d
Base64: OTIyMzE3MjAzNjg1NDc3NTgwNyAqIC0z

And the offending operation is INT * -INT:

$ eval ./expr-ubsan '9223372036854775807 \* -3' expr.y:377:12: runtime error: signed integer overflow: 9223372036854775807 * -3 cannot be represented in type 'long' -9223372036854775805

This has been fixed as well and the set of ATF tests for expr(1) extended for missing scenarios.

MKSANITIZER implementation

The initial implementation of MKSANITIZER has been designed and implemented by Christos Zoulas. I took this code and continued working on it with an external LLVM toolchain (version 7svn with local patches). The final result has been documented in share/mk/bsd.README:

MKSANITIZER     if "yes", use the selected sanitizer to compile userland
                programs as defined in USE_SANITIZER, which defaults to
                "address". A selection of available sanitizers:
                        address:        A memory error detector (default)
                        thread:         A data race detector
                        memory:         An uninitialized memory read detector
                        undefined:      An undefined behavior detector
                        leak:           A memory leak detector
                        dataflow:       A general data flow analysis
                        cfi:            A control flow detector
                        safe-stack:     Protect against stack-based corruption
                        scudo:          The Scudo Hardened allocator
                It's possible to specify multiple sanitizers within the
                USE_SANITIZER option (comma separated). The USE_SANITIZER value
                is passed to the -fsanitize= argument to the compiler.
                Additional arguments can be passed through SANITIZERFLAGS.
                The list of supported features and their valid combinations
                depends on the compiler version and target CPU architecture.

As an illustration, in order to build a distribution with ASan and UBSan, using the LLVM toolchain one needs to enter a command line like:

./ -V MKLLVM=yes -V MKGCC=no -V HAVE_LLVM=yes -V MKSANITIZER=yes -V USE_SANITIZER="address,undefined" distribution

There is an ongoing effort on upstreaming the remaining toolchain patches and right now we need to use a specially preprocessed external LLVM toolchain with a pile of local patches.

The GCC toolchain is a downstream for LLVM sanitizers and is out of the current focus, although there are local NetBSD patches for ASan, UBSan and LSan in GCC's libsanitizer. Starting with GCC 8.x, there is the first upstreamed block of NetBSD code pulled in from LLVM sanitizers.

Golang and TSan (-race)

There has been finally merged the compiler-rt update patch in Golang.

runtime/race: update most syso files to compiler-rt fe2c72

These were generated using the racebuild configuration from, with the LLVM compiler-rt repository at
commit fe2c72c59aa7f4afa45e3f65a5d16a374b6cce26 for most platforms.

The Windows build is from an older compiler-rt revision, because the
compiler-rt build script for the Go race detector has been broken
since January 2017 (

Updates #24354.

Change-Id: Ica05a5d0545de61172f52ab97e7f8f57fb73dbfd
Reviewed-by: Brad Fitzpatrick 
Run-TryBot: Brad Fitzpatrick 
TryBot-Result: Gobot Gobot 

This means that the TSan/amd64 support syzo file has been included for NetBSD next to Darwin, FreeBSD and Linux (Windows is broken and no longer maintained). There is still need to merge the remaining patches for shell scripts and go files, and the code is still in review waiting for feedback.

Changes merged with the NetBSD sources

  • ksh: Remove symbol clash with libc -- rename twalk() to ksh_twalk()
  • ktruss: Remove symbol clash with libc -- rename wprintf() to xwprintf()
  • ksh: Remove symbol clash with libc -- rename glob() to ksh_glob()
  • Don't pass -z defs to libc++ with MKSANITIZER=yes
  • Mark sigbus ATF tests in t_ptrace_wait as expected failure
  • Make new DTrace and ZFS code buildable with Clang/LLVM
  • Fix the MKGROFF=no MKCXX=yes build
  • Correct Undefined Behavior in ifconfig(8)
  • Correct Undefined Behavior in libc/citrus
  • Correct Undefined Behavior in gzip(1)
  • Do not use index out of bounds in nawk
  • Change type of tilde_ok from int to unsigned int in ksh(1)
  • Rework perform_arith_op() in expr(1) to omit Undefined Behavior
  • Add 2 new expr(1) ATF tests
  • Prevent Undefined Behavior in shift of signed integer in grep(1)
  • Set NOSANITIZER in i386 mbr files
  • Disable sanitizers for libm and librt
  • Avoid Undefind Behavior in DEFAULT_ALIGNMENT in GNU grep(1)
  • Detect properly overflow in expr(1) for 0 + INT
  • Make the alignof() usage more portable in grep(1)
  • heimdal: Do not reference buffer after the code scope {}
  • Do not cause Undefined Behavior in vi(1)
  • Disable MKSANITIZER in lib/csu
  • Disable SANITIZER for ldd(1)
  • Set NOSANITIZER in rescue/Makefile
  • Add new option -s to crunchgen(1) -- enable sanitization
  • Make building of dhcp compatible with MKSANITIZER
  • Refactor MKSANITIZER flags in mk rules
  • Specify NOSANITIZER in distrib/amd64/ramdisks/common
  • Fix invalid free(3) in sysinst(8)
  • Fix integer overflow in installboot(8)
  • Specify -Wno-format-extra-args for Clang/LLVM in gpl2/gettext
  • sysinst: Enlarge the set_status[] array by a single element
  • Prevent underflow buffer read in trim_whitespace() in libutil/passwd.c
  • Fix stack use after scope in libutil/pty
  • Prevent signed integer left shift UB in FD_SET(), FD_CLR(), FD_ISSET()
  • Enhance the documentation of MKSANITIZER in bsd.README
  • Avoid unportable offsetof(3) calculation in nvi in log1.c
  • Add a framework for renaming symbols in libc&co for MKSANITIZER
  • Specify SANITIZER_RENAME_SYMBOL in diffutils
  • Specify SANITIZER_RENAME_SYMBOL in chpass
  • Include for offsetof(3)
  • Avoid UB in tmux/window_copy_add_formats()
  • Document sanitizers in acronyms.comp
  • Add TODO.sanitizer
  • Avoid misaligned access in disklabel(8) in find_label() (patch by Christos Zoulas)
  • Improve the * operator handling in expr(1)
  • Add a couple of new ATF expr(1) tests
  • Add a missing check to handle correctly 0 * 0 in expr(1)
  • Add 3 more expr(1) ATF tests detecting overflow

Changes merged with the LLVM projects

  • LLVM: Handle NetBSD specific path in findDebugBinary()
  • compiler-rt: Disable recursive interceptors in signal(3)/MSan
  • Introduce CheckASLR() in sanitizers

Plan for the next milestone

The ptrace(2) tasks have been preempted by the suspended work on sanitizers, in order to actively collaborate with the Google Summer of Code students (libFuzzer integration with userland, KUBSan, KASan).

I have planned the following tasks before returning back to the ptrace(2) fixes:

  • upgrade base Clang/LLVM, libcxx, libcxxabi to at least 7svn (HEAD) (needs cooperation with Joerg Sonnenberger)
  • compiler-rt import and integration with base (needs cooperation with Joerg Sonnenberger)
  • merge TSan, MSan and libFuzzer ATF tests
  • prepare MKSANITIZER readme
  • kernel-asan port
  • kernel-ubsan port
  • switch syscall(2)/__syscall(2) to libc calls
  • upstream local patches, mostly to compiler-rt
  • develop fts(3) interceptors (MSan, for ls(1), find(1), mtree(8)
  • investigate and address the libcxx failing tests on NetBSD
  • no-ASLR boot.cfg option, required for MKSANITIZER
My plan for the next milestone is to reduce the the list and keep actively collaborating with the summer students.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can: [1 comment]



> NetBSD is probably the only distribution that ships with a fully sanitized distribution option. Just wanted to point out that full sanitization is available for Tizen, Gentoo and AFAIK Android.

Posted by Yury Gribov on July 03, 2018 at 11:01 AM UTC #

Post a Comment:
Comments are closed for this entry.