Validation and improvements of debugging interfaces


June 05, 2019 posted by Kamil Rytarowski

In the past month, I have introduced correctness and reliability of tracing processes in the kernel codebase.
I took part in BSDCan 2019 and during the event wrote a NetBSD version of truss, a ptrace(2)-powered syscall tracing utility from FreeBSD. I've finished the port after getting back home and published it to the NetBSD community. This work allowed me to validate the ptrace(2) interfaces in another application and catch new problems that affect every ptrace(2)-based debugger.

Changes in the basesystem

A number of modifications were introduced to the main NetBSD repository which include ptrace(2), kernel sanitizer, and kernel coverage.

Improvements in the ptrace(2) code:

  • Reporting exec(3) events in the PT_SYSCALL mode. This simplifies debuggers adding MI interface to detect exec(3) of a tracee.
  • Relax prohibition of Program Counter set to 0x0 in ptrace(2). In theory certain applications can map Virtual Address 0x0 and run code there. In practice this will probably never happen in modern computing.
  • New ATF tests for user_va0_disable sysctl(3) node checking if PT_CONTINUE/PT_SYSCALL/PT_DETACH can set Program Counter to 0x0.
  • Fixing kernel races with threading code when reports for events compare with process termination. This unbroke tracing Go programs, which call exit(2) before collecting all the running threads.
  • Refreshed ptrace(2) and siginfo(2) documentation with recent changes and discoveries.

I mentor two GSoC projects and support community members working on kernel fuzzers. As a result of this work, fixes are landing in the distribution source code from time to time. We have achieved an important milestone with being able to run NetBSD/amd64 on real hardware with Kernel Undefined Behavior Sanitizer without any reports from boot and execution of ATF tests. This achievement will allow us to enable kUBSan reports as fatal ones and use in the fuzzing process, capturing bugs such as integer overflow, out of bounds array use, invalid shifts, etc.

I have also introduced a support of a new sysctl(3) operation: KERN_PROC_CWD. This call retrieves the current working directory of a specified process. This operation is used typically in terminal-related applications (tmux, terminator, ...) and generic process status prompters (Python psutils). I have found out that my terminal emulator after system upgrade doesn't work correctly and it uses a fallback of resolving the symbolic link of /proc/*/cwd.

ptrace(2) utilities

In order to more easily validate the kernel interfaces I wrote a number of utility programs that use the ptrace(2) functionality. New programs reuse the previously announced picotrace (https://github.com/krytarowski/picotrace) code as a framework.

Some programs have been already published, other are still in progress and kept locally on my disk only. New programs that have been published so far:

  • NetBSD truss - reimplementation from scratch of FreeBSD truss for the NetBSD kernel.
  • sigtracer - stripped down picotrace reporting signal events.
  • coredumper - crash detector and core(5) dumper regardless of signal handing or masking options in programs.

More elaborated introduction with examples is documented on the GitHub repository. I use these programs for profit as they save my precious time on debugging issues in programs and as validators for kernel interfaces as they are working closely over the kernel APIs (contrary to large code base of GDB or LLDB). I've already detected and verified various kernel ptrace(2) issues with these programs. With a debugger it's always unclear whether a problem is in the debugger, its libraries or the kernel.

Among the detected issues, the notable list of them is as follows:

  • Reporting events to a debugger in the same time from two or more sources is racy and one report can be registered and other dropped.
  • Either questionable or broken handling of process stopped by other process with a stop signal.
  • Attaching to sleeping process terminates it as interrupted nanosleep does not return EINTR.
  • Exiting of a process with running multiple threads broken (fixed).
  • Attaching to applications due to unknown reasons can sometimes kill them.
  • Callbacks from pthread_atfork(3) (can?) generate crash signals unhandled by a tracer.

Update in Problem Report entries

I've reiterated over reported bugs in the gnats tracking system. These problems were still valid in the beginning of this year and are now reported to be gone:

  • kern/52548 (GDB breakpoint does not work properly in the binary created with golang on NetBSD (both pkgsrc/lang/go and pkgsrc/lang/go14).)
  • kern/53120 (gdb issues with threaded programs)
  • bin/49662 (gdb has trouble with threaded programs)
  • port-arm/51677 (gdb can not debug threaded programs)
  • kern/16284 (running function in stoped gdb causes signal 93)
  • bin/49979 (gdb does not work properly on NetBSD-6.1.1/i386)
  • kern/52628 (tracing 32-bit application on 64-bit kernel panic)

Summary

There are still sometimes new bugs reported in ptrace(2) or GDB, but they are usually against racy ATF test and complex programs. I can also observe a difference that simple and moderately complex programs usually work now and the reports are for heavy ones like firefox (multiple threads and multiple processes).

I estimate that there are still at least 3 critical threading issues to be resolved, races and use-after-free scenaris in vfork(2) and a dozen of other more generic problems typically in signal routing semantics.

Plan for the next milestone

Cover with regression tests the posix_spawn(2) interface, if needed correct the kernel code. This will be followed by resolving the use-after-free scenarios in vfork(2). This is expected to accomplish the tasks related to forking code.

My next goal on the roadmap is to return to LWP (threading) code and fix all currently known problems.

Independently I will keep supporting the work on kernel fuzzing projects within the GSoC context.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

http://netbsd.org/donations/#how-to-donate [0 comments]

 



Post a Comment:
Comments are closed for this entry.