Coverage of signal routines in the kernel in the context of ptrace(2)
During the past month I have been working on coverage of various corner cases in the kernel signal subsystem. I have also spent some time on improvements to the sanitizers. Thanks to the full-time focus on NetBSD work, I was able to actively mentor three Google Summer of Code students. I could not answer every question without reading the code first, but I was available for active collaboration, especially on code that I authored myself, such as the sanitizers. By the end of the month we had caught two uninitialized memory reads in the top(1) utility using Memory Sanitizer, and had rebuilt part of the base system (the library dependencies libterminfo, libkvm and libutil) with dedicated sanitization flags.
ptrace(2) and related distribution changes
I am actively working on making the handling of processes, forks/vforks, signals and threads reliable and fully functional under a debugger. This is an ongoing process and the situation is steadily improving. For the end user this means that we are approaching the state where a developer can trace an application like Firefox using modern tools and save time by detecting issues quickly.
I am using the Test-Driven Development approach in my work. I keep extending the Automated Testing Framework (ATF) test suite with new tests that cover sets of scenarios handled by debuggers and related code. This is followed by kernel fixes. Thanks to the tests, I can introduce changes to critical routines inside the Operating System more confidently, test new changes quickly for regressions and keep covering new verifiable scenarios.
Titles of the commits merged into the main NetBSD tree:
- Remove an element from struct emul: e_tracesig.
e_tracesig used to be implemented for Darwin compat. Nowadays the Darwin compatibility layer is gone and there are no other users.
- Refactoring of can_we_set_dbregs() in ATF ptrace(2) tests. Push this auxiliary function to all ports.
- Add a new ptrace(2) ATF exploit for: CVE-2018-8897 (POP SS debug exception).
- Correct handling of: vfork(2) + PT_TRACE_ME + raise(2).
- Add a new ATF ptrace(2) test: traceme_vfork_breakpoint.
- Improve the description of traceme_vfork_raise in ATF ptrace(2) tests.
- Add a new ATF ptrace(2) test: traceme_vfork_exec.
- Improve the description of traceme_vfork_breakpoint (ATF ptrace(2) test).
- Add extra asserts in three ATF ptrace(2) tests.
In traceme* tests, after validate_status_stopped(), include an additional check to verify the received signal with PT_GET_SIGINFO.
- Correct assert in ATF t_zombie test.
- Add new ATF tests: t_fork and t_vfork.
- Stop masking SIGSTOP in a vfork(2)ed child.
- Stop masking raise(SIGSTOP) in a vfork(2)ed child that called PT_TRACE_ME.
- Add new auxiliary functions in t_ptrace_wait.h
New functions:
- FORKEE_ASSERT_NEQ()
- await_stopped_child()
- Enable traceme_vfork_raise2 in ATF ptrace(2) tests.
raise(SIGSTOP) is now handled correctly by the kernel, in a child that vfork(2)ed and called PT_TRACE_ME.
- Cover SIGTSTP, SIGTTIN and SIGTTOU in traceme_vfork_raise ATF tests.
- Note in vfork(2) that SIGTSTP is masked.
- Fix and enable traceme_signal_nohandler2 in ATF ptrace(2) tests.
- Make stopsigmask a non-static symbol now as it's used in ptrace(2) code.
- Refactor and enable the signal3 ATF ptrace(2) test
Adapt the test to be independent from the software breakpoint trap behavior, whether the Program Counter is moved or not. Just kill the process after catching the expected signal, instead of pretending to resume it.
- Add new ATF test: t_trapsignal:trap_ignore.
- Minor update to signal(7)
Note that SIGCHLD is not just a child exit signal. Note that SIGIOT is a PDP-11 specific signal.
- Minor improvement in sigaction(2)
Note that SIGCHLD also covers the process-continued event.
- Extend ATF tests in t_trapsignal.sh to verify software breakpoint traps.
- Add new ATF ptrace(2) tests: traceme_sendsignal_{masked,ignored}[1-3].
- Define PTRACE_BREAKPOINT_ASM for i386 in the MD part of .
- Refactor the attach[1-8] and race1 ATF t_ptrace_wait* tests.
- Cherry-pick upstream patch for internal_mmap() in GCC sanitizers.
- Cherry-pick upstream patch for internal_mmap() in GCC(.old) sanitizers
- Add new auxiliary functions in ATF ptrace(2) tests
Introduce:
- trigger_trap()
- trigger_segv()
- trigger_ill()
- trigger_fpe()
- trigger_bus()
- Extend traceme_vfork_breakpoint in ATF ptrace(2) tests for more scenarios
Added tests:
- traceme_vfork_crash_trap
- traceme_vfork_crash_segv (renamed from traceme_vfork_breakpoint)
- traceme_vfork_crash_ill (disabled)
- traceme_vfork_crash_fpe
- traceme_vfork_crash_bus
- Merge the eventmask[1-6] ATF ptrace(2) tests into a shared function body.
- Introduce can_we_write_to_text() to ATF ptrace(2) tests
The purpose of this function is to detect whether a tracer can write to the .text section of its tracee.
- Refactor the PT_WRITE*/PT_READ* and PIOD_* ATF ptrace(2) tests.
- Handle vm.maxaddress in compat_netbsd32(8).
- Port the CVE-2018-8897 mitigation to i386 ATF ptrace(2) tests.
- Fix sysctl(3):vm.minaddress in compat_netbsd32(8).
- Fix ATF ptrace(2) bytes_transfer_piod_read_auxv test.
- Handle FPE and BUS scenarios in the ATF t_trapsignal tests.
- Try to fool $CC harder in ATF ptrace(2) tests in trigger_fpe().
- Correct reporting SIGTRAP TRAP_EXEC when SIGTRAP is masked.
- Correct the t_ptrace_wait*:signal5 ATF test case.
- Add new ATF ptrace(2) tests verifying crash signal handling.
- Harden PT_ATTACH in ptrace(2).
Do not allow PT_ATTACH from a vfork(2)ed child (before exec(3)/_exit(3)) to its parent. Return an error with errno set to EPERM.
This scenario does not have a purpose and there is no clear picture of how to route signals.
- Simplify comparison of two processes
No need to check p_pid to compare whether two processes are the same.
This functionality now works.
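Many of the traceme_* tests listed above share one pattern: the child calls PT_TRACE_ME and raises a signal, and the parent checks that wait(2) reports a stop with exactly that signal. The following is a minimal sketch of that pattern, not the ATF code itself. The helper name traceme_raise_stopsig() is hypothetical, and the sketch uses fork(2) rather than vfork(2) to keep the parent free to wait for the child.

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the ATF tests): fork a child that asks
 * to be traced and then raises `sig`. Returns the stop signal observed by
 * the parent, or -1 if the child did not stop or an error occurred.
 */
static int
traceme_raise_stopsig(int sig)
{
	int status, stopsig;
	pid_t child;

	assert(sig > 0);

	child = fork();
	if (child == -1)
		return -1;

	if (child == 0) {
		/* Child: become a tracee of the parent, then raise the signal. */
		if (ptrace(PT_TRACE_ME, 0, NULL, 0) == -1)
			_exit(1);
		raise(sig);
		_exit(0);	/* Reached only if the tracer resumes us. */
	}

	/* Parent (tracer): a traced child reports the signal as a stop. */
	if (waitpid(child, &status, 0) != child)
		return -1;
	if (!WIFSTOPPED(status))
		return -1;	/* Child exited or was killed; already reaped. */
	stopsig = WSTOPSIG(status);

	/* Kill the tracee instead of pretending to resume it, then reap it. */
	kill(child, SIGKILL);
	waitpid(child, &status, 0);

	return stopsig;
}
```

With a kernel that reports the stop correctly, traceme_raise_stopsig(SIGSTOP) is expected to return SIGSTOP. Killing the tracee after the check mirrors the approach described for the signal3 test, which avoids depending on how the process would behave when resumed.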
LLVM compiler-rt features
I have helped the GSoC student to prepare for LLVM libfuzzer integration with the NetBSD base system. We have managed to get down to the following results for the test target in the upstream repository:
$ check-fuzzer-default
Expected Passes : 105
Unsupported Tests : 8
Unexpected Failures: 2

$ check-fuzzer
Expected Passes : 105
Unsupported Tests : 8
Unexpected Failures: 2

$ check-fuzzer-unit
Expected Passes : 35

The remaining two failures appear to be false positives, specific to differences between the NetBSD setup and the other supported Operating Systems (including Linux). I have decided not to investigate them and instead to move on to more urgent tasks.
While there, I have been working on restoring a good state to the userland LLVM sanitizers in the upstream repository, in order to ship them in the NetBSD distribution along with the libfuzzer utility.
A number of patches were merged upstream:
- LLVM: Register NetBSD/i386 in AddressSanitizer.cpp
- Clang: Permit -fxray-instrument for NetBSD/amd64
- Clang: Support XRay in the NetBSD driver
- compiler-rt: Remove dead sanitizer_procmaps_freebsd.cc
- compiler-rt: wrong usages of sem_open in the libFuzzer (patch by Yang Zheng, the GSoC student)
- compiler-rt: Register NetBSD/i386 in asan_mapping.h
- compiler-rt: Setup ORIGIN/NetBSD option in sanitizer tests
- compiler-rt: Enable SANITIZER_INTERCEPTOR_HOOKS for NetBSD
There is also at least one pending upstream patch that is worth noting: Introduce CheckASLR() in sanitizers
At least the ASan, MSan and TSan sanitizers require ASLR to be disabled on NetBSD. Introduce a generic CheckASLR() routine that implements a check for the current process. The ASLR state depends on the global or per-process settings. There is no simple way to disable ASLR in the build process from the level of a sanitizer or at run time. With ASLR enabled, sanitizers that operate over the process virtual address space can misbehave, usually breaking with cryptic messages. This check is a no-op on platforms other than NetBSD.
The current results for test targets in the compiler-rt features are as follows:
$ make check-builtins
Expected Passes : 343
Expected Failures : 4
Unsupported Tests : 36
Unexpected Failures: 5

$ check-interception
-- Testing: 0 tests, 0 threads --

$ check-lsan
Expected Passes : 6
Unsupported Tests : 60
Unexpected Failures: 106

$ check-ubsan
Expected Passes : 229
Expected Failures : 1
Unsupported Tests : 32
Unexpected Failures: 2

$ check-cfi
Unsupported Tests : 232

$ check-cfi-and-supported
BaseException: Tests unsupported

$ make check-sanitizer
Expected Passes : 576
Expected Failures : 13
Unsupported Tests : 206
Unexpected Failures: 31

$ check-asan
Expected Passes : 852
Expected Failures : 4
Unsupported Tests : 440
Unexpected Failures: 16

$ check-asan-dynamic
Expected Passes : 394
Expected Failures : 3
Unsupported Tests : 440
Unexpected Passes : 1
Unexpected Failures: 222

$ check-msan
Expected Passes : 102
Expected Failures : 1
Unsupported Tests : 30
Unexpected Failures: 4

$ check-tsan
Expected Passes : 288
Expected Failures : 1
Unsupported Tests : 84
Unexpected Failures: 8

$ check-safestack
Expected Passes : 7
Unsupported Tests : 1

$ check-scudo
Expected Passes : 14
Unexpected Failures: 28

$ check-ubsan-minimal
Expected Passes : 6
Unsupported Tests : 2

$ check-profile
Unsupported Tests : 116

$ check-xray
Expected Passes : 21
Unsupported Tests : 1
Unexpected Failures: 21

$ check-shadowcallstack
Unsupported Tests : 4
Sanitization of userland and the kernel
I am helping to set up the process for shipping a NetBSD userland that is prebuilt with a desired sanitizer. This involves advising the Google Summer of Code student, fixing known issues, reviewing patches etc.
There were two new uninitialized memory read bugs detected in the top(1) program:
Fix uninitialized signal mask passed to sigaction(2) in top(1)
Detected with Memory Sanitizer during the integration of sanitizers with the NetBSD basesystem.
Reported by <Yang Zheng>
Fix read of uninitialized array elements in top(1)
The cp_old array is allocated with malloc(3) and its pointer is passed to percentages64(). This function calculates total_change, whose value depends on the values inside the uninitialized cp_old[] array.
==26662==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x268a2c in percentages64 /usr/src/external/bsd/top/bin/../dist/machine/m_netbsd.c:1341:6
#1 0x26748b in get_system_info /usr/src/external/bsd/top/bin/../dist/machine/m_netbsd.c:478:6
#2 0x25518e in do_display /usr/src/external/bsd/top/bin/../dist/top.c:507:5
#3 0x253038 in main /usr/src/external/bsd/top/bin/../dist/top.c:975:2
#4 0x21cad1 in ___start (/usr/bin/top+0x1cad1)
SUMMARY: MemorySanitizer: use-of-uninitialized-value /usr/src/external/bsd/top/bin/../dist/machine/m_netbsd.c:1341:6 in percentages64
Exiting
Fix this issue by replacing malloc(3) with calloc(3).
Detected with Memory Sanitizer during the integration of sanitizers with the NetBSD basesystem.
Reported by <Yang Zheng>
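The effect of the malloc(3) to calloc(3) change can be sketched with a simplified stand-in for percentages64(). The functions percentages() and alloc_counters() below are hypothetical and much smaller than the real top(1) code; they only show why the "previous sample" array has to start out zeroed, since the very first call already reads it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Simplified stand-in for percentages64(): compute the change of each
 * counter since the previous sample and store its share of the total
 * change in out[], in tenths of a percent. Returns the total change.
 */
static int64_t
percentages(int cnt, int64_t *out, const int64_t *now, int64_t *old)
{
	int64_t total_change = 0;
	int i;

	assert(cnt > 0);

	for (i = 0; i < cnt; i++) {
		out[i] = now[i] - old[i];	/* reads old[i] on the first call */
		total_change += out[i];
		old[i] = now[i];		/* remember for the next sample */
	}

	if (total_change == 0)
		total_change = 1;		/* avoid division by zero */

	for (i = 0; i < cnt; i++)
		out[i] = (out[i] * 1000 + total_change / 2) / total_change;

	return total_change;
}

/*
 * Hypothetical allocator for the "previous sample" array. calloc(3)
 * zero-fills the memory, so the first percentages() call reads defined
 * values; malloc(3) would leave the contents indeterminate.
 */
static int64_t *
alloc_counters(size_t cnt)
{
	return calloc(cnt, sizeof(int64_t));
}
```

With a zeroed old[] array and samples of 50, 30 and 20, the first call yields a total change of 100 and shares of 500, 300 and 200 tenths of a percent.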
A similar process is under way for the two kernel sanitizer GSoC tasks: kernel-ubsan and kernel-asan.
Thanks to my involvement in The NetBSD Foundation tasks, I am reachable for students (although not in all cases) for active feedback and collaboration.
Summary
The number of ATF ptrace(2) test cases has increased significantly. However, there is still a substantial amount of work to be done and a number of serious bugs to be resolved.
With fixes and the addition of new test cases, as of today we are passing 1,206 (last month: 961) ptrace(2) tests and skipping 1 (out of 1,256 total; last month: 1,018 total). Tests that were added outside the ptrace(2) context are not counted here.
Plan for the next milestone
Cover the remaining elementary scenarios of crash signal handling with regression tests. Fix known bugs in the NetBSD kernel.
Follow up with the remaining fork(2) and vfork(2) scenarios.
This work was sponsored by The NetBSD Foundation.
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:
http://netbsd.org/donations/#how-to-donate