The first patch-bulk upstreamed to LLDB


February 14, 2017 posted by Kamil Rytarowski

The LLVM project is a quickly moving target, this also applies to the LLVM debugger -- LLDB. It's actively used in several first-class operating systems, while - thanks to my spare time dedication - NetBSD joined the LLDB club in 2014, only lately the native support has been substantially improved and the feature set is quickly approaching the support level of Linux and FreeBSD. During this work 12 patches were committed to upstream, 12 patches were submitted to review, 11 new ATF were tests added, 2 NetBSD bugs filed and several dozens of commits were introduced in pkgsrc-wip, reducing the local patch set to mostly Native Process Plugin for NetBSD.

What has been done in NetBSD

1. Triagged issues of ptrace(2) in the DTrace/NetBSD support

Chuck Silvers works on improving DTrace in NetBSD and he has detected an issue when tracer signals are being ignored in libproc. The libproc library is a compatibility layer for DTrace simulating /proc capabilities on the SunOS family of systems.

I've verified that the current behavior of signal routing is incorrect. The NetBSD kernel correctly masks signals emitted by a tracee, not routing them to its tracer. On the other hand the masking rules in the inferior process blacklists signals generated by the kernel, which is incorrect and turns a debugger into a deaf listener. This is the case for libproc as signals were masked and software breakpoints triggering INT3 on i386/amd64 CPUs and SIGTRAP with TRAP_BRKP si_code wasn't passed to the tracer.

This isn't limited to turning a debugger into a deaf listener, but also a regular execution of software breakpoints requires: rewinding the program counter register by a single instruction, removing trap instruction and restoring the original instruction. When an instruction isn't restored and further code execution is pretty randomly affected, it resulted in execution anomalies and breaking of tracee.

A workaround for this is to disable signal masking in tracee.

Another drawback inspired by the DTrace code is to enhance PT_SYSCALL handling by introducing a way to distinguish syscall entry and syscall exit events. I'm planning to add dedicated si_codes for these scenarios. While there, there are users requesting PT_STEP and PT_SYSCALL tracing at the same time in an efficient way without involving heuristcs.

I've filed the mentioned bug:

I've added new ATF tests:

  • Verify that masking single unrelated signal does not stop tracer from catching other signals
  • Verify that masking SIGTRAP in tracee stops tracer from catching this raised signal
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching software breakpoints
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching single step trap
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching exec() breakpoint
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching PTRACE_FORK breakpoint
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching PTRACE_VFORK breakpoint
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching PTRACE_VFORK_DONE breakpoint
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching PTRACE_LWP_CREATE breakpoint
  • Verify that masking SIGTRAP in tracee does not stop tracer from catching PTRACE_LWP_EXIT breakpoint

2. ELF Auxiliary Vectors

The ELF file format permits to transfer additional information for a process with a dedicated container of properties, it's named ELF Auxilary Vector. Every system has its dedicated way to read this information in a debugger from a tracee. The NetBSD approach is to transfer this vector with a ptrace(2) API PIOD_READ_AUXV. Our interface shares the API with OpenBSD. I filed a bug that our interface returns vector size of 8496 bytes, while OpenBSD has constant 64 bytes. It was diagnosed and fixed by Christos Zoluas that we were incorrectly counting bits and bytes and this enlarged the data streamlined. The bug was harmless and had no known side-effects besides large chunk of zeroed data.

There is also a prepared local patch extending NetBSD platform support to read information for this vector, it's primarily required for correct handling of PIE binaries. At the moment there is no interface similar to "info auxv" to the one from GDB. Unfortunately at the current stage, this code is still unused by NetBSD. I will return to it once the Native Process Plugin is enhanced.

I've filed the mentioned bug:

I've added new ATF test:

  • Verify PT_READ_AUXV called for tracee.

What has been done in LLDB

1. Resolving executable's name with sysctl(7)

In the past the way to retrieve a specified process' executable path name was using Linux-compatibile feature in procfs (/proc). The canonical solution on Linux is to resolve path of /proc/$PID/exe. Christos Zoulas added in DTrace port enhancements a solution similar to FreeBSD to retrieve this property with sysctl(7).

This new approach removes dependency on /proc mounted and Linux compatibility functionality.

Support for this has been submitted to LLDB and merged upstream:

2. Real-Time Signals

The key feature of the POSIX standard with Asynchronous I/O is to support Real-Time Signals. One of their use-cases is in .NET debugging facilities. Support for this set of signals was developed during Google Summer of Code 2016 by Charles Cui and reviewed and committed by Christos Zoulas.

I've extended the LLDB capabilities for NetBSD to recognize these signals in the NetBSDSignals class.

Support for this has been submitted to LLDB and merged upstream:

3. Conflict removal with system-wide six.py

The transition from Python 2.x to 3.x is still ongoing and will take a while. The current deadline support for the 2.x generation has been extended to 2020. One of the ways to keep both generations supported in the same source-code is to use the six.py library (py2 x py3 = 6.py). It abstracts commonly used constructs to support both language families.

The issue for packaging LLDB in NetBSD was to install this tiny library unconditionally to a system-wide location. There were several solutions to this approach:

  • drop Python 2.x support,
  • install six.py into subdirectory,
  • make an installation of six.py conditional.

The first solution would turn discussion into flamewar, the second one happened to be too difficult to be properly implemented as the changes were invasive and Python is used in several places of the code-base (tests, bindings...). The final solution was to introduce a new CMake option LLDB_USE_SYSTEM_SIX - disabled by default to retain the current behavior.

To properly implement LLDB_USE_SYSTEM_SIX, I had to dig into installation scripts combined in CMake and Python files. It wasn't helping that Python scripts were reinventing getopt(3) functionality.. and I had to alter it in order to introduce a new option --useSystemSix.

Support for this has been submitted to LLDB and merged upstream:

4. Do not pass non-POD type variables through variadic function

There was a long standing local patch in pkgsrc, added by Tobias Nygren and detected with Clang.

According to the C++11 standard 5.2.2/7:

Passing a potentially-evaluated argument of class type having a non-trivial copy constructor, a non-trivial move constructor, or a non-trivial destructor, with no corresponding parameter, is conditionally-supported with implementation-defined semantics.

A short example to trigger similar warning was presented by Joerg Sonnenberg:

#include <string>
#include <cstdarg>

void f(std::string msg, ...) {
  va_list ap;
  va_start(ap, msg);
}

This code compiled against libc++ gives:

test.cc:6:3: error: cannot pass object of non-POD type 'std::string' (aka 'basic_string<char, char_traits<char>,

allocator<char> >') through variadic function; call will abort at runtime [-Wnon-pod-varargs]

Support for this has been submitted to LLDB and merged upstream:

5. Add NetBSD support in Host::GetCurrentThreadID

Linux has a very specific thread model, where process is mostly equivalent to native thread and POSIX thread - it's completely different on other mainstream general-purpose systems. That said fallback support to translate pthread_t on NetBSD to retrieve the native integer identifier was incorrect. The proper NetBSD function to retrieve light-weigth process identification is to call _lwp_self(2).

Support for this has been submitted to LLDB and merged upstream:

6. Synchronize PlatformNetBSD with Linux

The old PlatformNetBSD code was based on the FreeBSD version. While the FreeBSD current one is still similar to the one from a year ago, it's inappropriate to handle a remote process plugin approach. This forced me to base refreshed code on Linux.

After realizing that PlatformPlugin on POSIX platforms suffers from code duplication, Pavel Labath helped out to eliminate common functions shared by other systems. This resulted in a shorter patch synchronizing PlatformNetBSD with Linux, this step opened room for FreeBSD to catch up.

Support for this has been submitted to LLDB and merged upstream:

7. Transform ProcessLauncherLinux to ProcessLauncherPosixFork

It is UNIX specific that signal handlers are global per application. This introduces issues with wait(2)-like functions called in tracers, as these functions tend to conflict with real-life libraries, notably GUI toolkits (where SIGCHLD events are handled).

The current best approach to this limitation is to spawn a forkee and establish a remote connection over the GDB protocol with a debugger frontend. ProcessLauncherLinux was prepared with this design in mind and I have added support for NetBSD. Once FreeBSD will catch up, they might reuse the same code.

Support for this has been submitted to LLDB and merged upstream:

8. Document that LaunchProcessPosixSpawn is used on NetBSD

Host::GetPosixspawnFlags was built for most POSIX platforms - however only Apple, Linux, FreeBSD and other-GLIBC ones (I assume Debian/kFreeBSD to be GLIBC-like) were documented. I've included NetBSD to this list..

Support for this has been submitted to LLDB and merged upstream:

  • Document that LaunchProcessPosixSpawn is used on NetBSD committed r293770

9. Switch std::call_once to llvm::call_once

There is a long-standing bug in libstdc++ on several platforms that std::call_once is broken for cryptic reasons. This motivated me to follow the approach from LLVM and replace it with homegrown fallback implementation llvm::call_once.

This change wasn't that simple at first sight as the original LLVM version used different semantics that disallowed straight definition of non-static once_flag. Thanks to cooperation with upstream the proper solution was coined and LLDB now works without known regressions on libstdc++ out-of-the-box.

Support for this has been submitted to LLVM, LLDB and merged upstream:

10. Other enhancements

I a had plan to push more code in this milestone besides the mentioned above tasks. Unfortunately not everything was testable at this stage.

Among the rescheduled projects:

  • In the NetBSD platform code conflict removal in GetThreadName / SetThreadName between pthread_t and lwpid_t. It looks like another bite from the Linux thread model. Proper solution to this requires pushing forward the Process Plugin for NetBSD.
  • Host::LaunchProcessPosixSpawn proper setting ::posix_spawnattr_setsigdefault on NetBSD - currently untestable.
  • Fix false positives - premature before adding more functions in NetBSD Native Process Plugin.

On the other hand I've fixed a build issue of one test on NetBSD:

Plan for the next milestone

I've listed the following goals for the next milestone.

  • mark exect(3) obsolete in libc
  • remove libpthread_dbg(3) from the base distribution
  • add new API in ptrace(2) PT_SET_SIGMASK and PT_GET_SIGMASK
  • add new API in ptrace(2) to resume and suspend a specific thread
  • finish switch of the PT_WATCHPOINT API in ptrace(2) to PT_GETDBREGS & PT_SETDBREGS
  • validate i386, amd64 and Xen proper support of new interfaces
  • upstream to LLDB accessors for debug registers on NetBSD/amd64
  • validate PT_SYSCALL and add a functionality to detect and distinguish syscall-entry syscall-exit events
  • validate accessors for general purpose and floating point registers

Post mortem

FreeBSD is catching up after NetBSD changes, e.g. with the following commit:

This move allows to introduce further reduction of code-duplication. There still is a lot of room for improvement. Another benefit for other software distributions, is that they can now appropriately resolve the six.py conflict without local patches.

These examples clearly show that streamlining NetBSD code results in improved support for other systems and creates a cleaner environment for introducing new platforms.

A pure NetBSD-oriented gain is improvement of system interfaces in terms of quality and functionality, especially since DTrace/NetBSD is a quick adopter of new interfaces.. and indirectly a sandbox to sort out bugs in ptrace(2).

The tasks in the next milestone will turn NetBSD's ptrace(2) to be on par with Linux and FreeBSD, this time with marginal differences.

To render it more clearly NetBSD will have more interfaces in read/write mode than FreeBSD has (and be closer to Linux here), on the other hand not so many properites will be available in a thread specific field under the PT_LWPINFO operation that caused suspension of the process.

Another difference is that FreeBSD allows to trace only one type of syscall events: on entry or on exit. At the moment this is not needed in existing software, although it's on the longterm wishlist in the GDB project for Linux.

It turned out that, I was overly optimistic about the feature set in ptrace(2), while the basic ones from the first milestone were enough to implement basic support in LLDB.. it would require me adding major work in heuristics as modern tracers no longer want to perform guessing what might happened in the code and what was the source of signal interruption.

This was the final motivation to streamline the interfaces for monitoring capabilities and now I'm adding remaining interfaces as they are also needed, if not readily in LLDB, there is DTrace and other software that is waiting for them now. Somehow I suspect that I will need them in LLDB sooner than expected.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue to fund projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate [0 comments]

 



Post a Comment:
Comments are closed for this entry.