Forking code support in ptrace(2)

July 04, 2019 posted by Kamil Rytarowski

I've finished all the planned tasks regarding fork(2), vfork(2), clone(2)/__clone(2), and posix_spawn(3) in the context of debuggers. There are no longer any known kernel issues for any of these calls. All of the calls are covered with ATF regression tests.

Debugger related changes

NetBSD might be the only mainstream OS that implements posix_spawn(3) with a dedicated syscall. All the other recognized kernels have an implementation as a libc wrapper around either fork(2), vfork(2) or clone(2). I've introduced a new ptrace(2) event, PTRACE_POSIX_SPAWN, improved the posix_spawn(3) documentation, and introduced new ATF regression tests for posix_spawn(3).

The new ptrace(2) code has been exercised under load with LLDB test-suite, picotrace, and NetBSD truss.

I intend to resume porting of edb-debugger from work I began two years ago. I hope to use it for verifying FPU registers. I've spent some time porting this debugger to NetBSD/amd64 and managed to get a functional process attached. Unfortunately the code is highly specific to a single Operating System. That OS is the only one that is really functional at the moment and the code needs rework to be more agnostic with respect to the different semantics of kernels.

Issues with threaded debugging

I've analyzed the problems with multiple thread programs under debuggers. They could be classified into the following groups:

There is suboptimal support in GDB/NetBSD. However, events such as software breakpoint, watchpoint, and single step trap are reported correctly by the kernel. A debugger can detect the exact reason of interruption of execution. Unfortunately, the current NetBSD code tries to be compatible with Linux which prevents the NetBSD kernel from being able to precisely point to the stop reason. This causes GDB to randomly misbehave, e.g. single step over forking code.
Events reported from two or more events compete in the kernel for being reported to a debugger. A selection of events that lost this race will not be delivered as the winner event will overwrite signal information for a debugger from losing threads/events.
PT_RESUME/PT_SUSPEND shares the same suspended bit as pthread_suspend_np(3) and pthread_resume_np(3) (_lwp_suspend(2), _lwp_continue(2)).
In some circumstances event from an exiting thread never hits the tracer (WAIT() vs LWP_EXIT() race).
Corner cases that can cause a kernel crash or kill trace can negatively effect the debugging process.

I've found that focusing on kernel correctness now and fixing thread handling bugs can have paradoxically random impact on GDB/NetBSD. The solution for multiple events reported from multiple threads concurrently has a fix in progress, but the initial changes caused some fallout in GDB support so I have decided to revert it for now. This pushed me to the conclusion that before fixing LWP events, there is a priority to streamline the GDB support: modernize it, upstream it, run regression tests.

Meanwhile there is an ongoing work on covering more kernel code paths with fuzzers. We will catch and address problems out there only because they're able to be found. I'm supervising 3 ongoing projects in this domain: syzkaller support enhancements, TriforceAFL porting and AFL+KCOV integration. This work makes the big picture of what is still needed to be fixed clearer and lowers the cost of improving the quality.

LSan/NetBSD

The original implementation of Leak Sanitizer for NetBSD was developed for GCC by Christos Zoulas. The whole magic of a functional LSan software is the Stop-The-World operation. This means that a process suspends all other threads while having the capability to investigate the register frame and the stacks of suspended threads.

Until recently there were two implementations of LSan: for Linux (ab)using ptrace(2) and for Darwin using low-level Mach interfaces. Furthermore the Linux version needs a special version of fork(2) and makes assumptions about the semantics of the signal kernel code.

The original Linux code must separately attach to each thread via a pid. This applies to all other operations, such as detaching or killing a process through ptrace(2). Additionally listing threads of a debugged process on Linux is troublesome as there is need to iterate through directories in /proc.

The implementation in GCC closely reused the semantics of Linux. There was room for enhancement. I've picked this code as an inspiration and wrote a clean implementation reflecting the NetBSD kernel behavior and interfaces. In the end the NetBSD code is simpler than the Linux code without needing any recovery fallbacks or port specific kludges (in Linux every CPU needs discrete treatment.)

Much to my chagrin, this approach abuses the ptrace(2) interface, making sanitizing for leaking programs incompatible with debuggers. The whole StopTheWorld() operation could be cleanly implemented on the kernel side as a new syscall. I have a semicompleted implementation of this syscall, however I really want to take care of all threading issues under ptrace(2) before moving on. The threading issues must be fully reliable in one domain of debugging before implementing other kernel code. Both LLVM 9.0 and NetBSD-9 are branching soon and it will be a good enough solution for the time being. My current goal before the LLVM branching is to address a semantic behavioral difference in atexit(3) that raises a false positive in LSan tests. Request for comments on this specific atexit(3) issue will be available pending feedback from upstream.

Plan for the next milestone

Modernize GDB/NetBSD support, upstream it, and run GDB regression tests for NetBSD/amd64. Switch back to addressing threading kernel issues under a debugger.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

http://netbsd.org/donations/#how-to-donate [0 comments]

« Write your own fuzze... | Main | LLDB: watchpoints,... »