Stabilization of the ptrace(2) threads

October 10, 2019 posted by Kamil Rytarowski

I have introduced changes that make debuggers more reliable in threaded scenarios. Additionally, I have enhanced Leak Sanitizer support and introduced various improvements in the base system.

Threading support

Threads and synchronization in the kernel are, in general, an evergreen task for kernel developers. The process of enhancing support for tracing multiple threads has been documented by Michal Gorny in his LLDB entry "Threading support in LLDB continued".

Overall I have introduced these changes:

  • Separate the flag for suspension requested from userland (_lwp_suspend(2)) from the flag for suspension by a debugger (PT_SUSPEND). This removes one of the underlying problems of threading stability, as a debuggee was able to accidentally resume a thread suspended by the debugger. This separation is needed whenever we want to trace a selection of threads (typically a single one).
  • Store SIGTRAP event information inside siginfo_t, rather than in struct proc. Only a single signal can be reported to the debugger at a time, and its context is no longer prone to being overwritten by concurrent threads.
  • Introduce restarts in the functions that notify debuggers of events. There was a time window between a thread registering an event, stopping the process, and unlocking the mutexes of the process, during which another thread could take the mutexes before being stopped and overwrite the event with its own data. Now each debugger event routine checks whether the process is already stopping (or exiting, or no longer being traced). If so, it preserves the signal to be emitted in a local variable on the LWP's stack and stops itself as requested by the other LWP. Once the thread is awakened, it retries emitting the signal and delivering the event to the debugger.
  • Introduce PT_STOP, which combines the semantics of kill(SIGSTOP) and ptrace(PT_CONTINUE,SIGSTOP) in a single call. It works like:
    • kill(SIGSTOP) for an unstopped tracee
    • ptrace(PT_CONTINUE,SIGSTOP) for a stopped tracee
    The child will be stopped and can always be waited for (with wait(2)-like calls).

    For a stopped tracee, kill(SIGSTOP) has no effect. PT_CONTINUE+SIGSTOP cannot be used on an unstopped process (EBUSY).

    This operation is modeled after PT_KILL, which does the same for SIGKILL. While there, allow PT_KILL on an unstopped traced child.

    This operation is useful for an abnormal exit of a debugger from a signal handler, where it is usually followed by waitpid(2) and ptrace(PT_DETACH).
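
The PT_STOP dispatch described in the last item can be modelled in a few lines of C. This is only an illustrative userland model, not kernel code: the enum and function names are hypothetical, and the only facts taken from the text are that kill(SIGSTOP) is a no-op on a stopped tracee and that PT_CONTINUE+SIGSTOP fails with EBUSY on an unstopped one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the tracee state as seen by the debugger. */
enum tracee_state { TRACEE_RUNNING, TRACEE_STOPPED };

/* The two pre-existing ways of requesting a stop. */
enum stop_method { STOP_BY_KILL, STOP_BY_PT_CONTINUE };

/* kill(SIGSTOP) has no effect on a tracee that is already stopped. */
bool kill_sigstop_has_effect(enum tracee_state s)
{
	return s == TRACEE_RUNNING;
}

/*
 * PT_CONTINUE with SIGSTOP is rejected with EBUSY unless the tracee is
 * stopped. Returns 0 when the request would be accepted.
 */
int pt_continue_sigstop_errno(enum tracee_state s)
{
	return s == TRACEE_STOPPED ? 0 : EBUSY;
}

/* PT_STOP behaves like whichever of the two requests applies. */
enum stop_method pt_stop_method(enum tracee_state s)
{
	assert(s == TRACEE_RUNNING || s == TRACEE_STOPPED);
	return s == TRACEE_RUNNING ? STOP_BY_KILL : STOP_BY_PT_CONTINUE;
}
```

In this model, neither legacy request works in both states, while pt_stop_method() always has an applicable choice. That is why a debugger bailing out from a signal handler can issue a single PT_STOP request without first having to find out whether the tracee is stopped.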

To track signals emitted by the tracee that went missing in action, I have added a feature to NetBSD truss (part of the picotrace repository) that registers syscall entry (SCE) and syscall exit (SCX) events and tracks SCE/SCX events that were never delivered. Unfortunately, the number of missing events was huge, even for simple 2-threaded applications.

    truss[2585] running for 22.205305922 seconds
    truss[2585] attached to child=759 ('firefox') for 22.204289369 seconds
    syscall                     seconds      calls     errors missed-sce missed-scx
    read                    0.048522952        609          0         54         76
    write                   0.044693735        487          0         35         66
    open                    0.002516815         18          0          5          5
    close                   0.001015263         17          0          9          6
    unlink                  0.001375463         13          0          3          0
    getpid                  0.093458089       1993          0         16         56
    geteuid                 0.000049301          1          0          0          1
    recvmsg                 0.343353019       4828       3685         90        112
    access                  0.001450653         12          3          5          4
    dup                     0.000570904         10          0          0          1
    munmap                  0.010375949         88          0          6          3
    mprotect                0.196781932       2251          0         11         62
    madvise                 0.049820002        430          0         11         18
    writev                  0.237488362       1507          0         76         67
    rename                  0.000379918          2          0          1          0
    mkdir                   0.000283846          2          2          1          2
    mmap                    0.033342935        481          0         15         40
    lseek                   0.003341775         62          0         25         24
    ftruncate               0.000507707          9          0          1          0
    __sysctl                0.000144506          2          0          0          0
    poll                   18.694195617       4531          0        106        191
    __sigprocmask14         0.001585329         20          0          0          2
    getcontext              0.000083238          1          0          0          0
    _lwp_create             0.000104646          1          0          0          0
    _lwp_self               0.001456718         22          0         24         79
    _lwp_unpark             0.035319633        607          0         14         39
    _lwp_unpark_all         0.020660377        250          0         38         50
    _lwp_setname            0.000118418          2          0          0          0
    __select50             15.125525493        637          0         82        125
    __gettimeofday50        3.279021049       2930          0         40        135
    __clock_gettime50      10.673311747      33132          0       1418       3003
    __stat50                0.006375356         52          3         12          5
    __fstat50               0.001490944         17          0          3          2
    __lstat50               0.000110906          1          0          1          0
    __getrusage50           0.008863815        109          0          7          1
    ___lwp_park60          62.720893458        964        251        454        453
                          -------------    -------    -------    -------    -------
                          111.638589870      56098       3944       2563       4628

With my kernel changes landed, the number of missed SCE/SCX events is down to zero (with the exception of syscalls that never return, such as exit(2)).
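
One way to account for missed-sce and missed-scx events is a small per-LWP state machine that expects every syscall entry to be followed by the matching exit. The sketch below is an assumption about how such bookkeeping could look; the struct and function names are hypothetical and it is not the picotrace implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-LWP bookkeeping; not the picotrace data structures. */
struct lwp_track {
	bool in_syscall;   /* an SCE was seen and its SCX is still pending */
	int  pending_code; /* syscall number of that SCE */
};

struct miss_counters {
	unsigned long missed_sce;
	unsigned long missed_scx;
};

/* Record a syscall entry (SCE) event for one LWP. */
void on_sce(struct lwp_track *t, struct miss_counters *c, int code)
{
	assert(t != NULL && c != NULL);
	if (t->in_syscall)
		c->missed_scx++; /* the previous syscall never reported its exit */
	t->in_syscall = true;
	t->pending_code = code;
}

/* Record a syscall exit (SCX) event for one LWP. */
void on_scx(struct lwp_track *t, struct miss_counters *c, int code)
{
	assert(t != NULL && c != NULL);
	if (!t->in_syscall) {
		c->missed_sce++; /* exit without any preceding entry */
	} else if (t->pending_code != code) {
		c->missed_scx++; /* the pending entry lost its exit */
		c->missed_sce++; /* this exit lost its entry */
	}
	t->in_syscall = false;
}
```

A tracer would keep one struct lwp_track per LWP and feed it events in the order they are reported. A syscall that never returns leaves in_syscall set, which matches the remaining exceptions mentioned above.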

Once these changes settle in HEAD, I plan to backport them to NetBSD-9. I have already received feedback that GDB works much better now.

The kernel now also has more runtime asserts that validate the correctness of the code paths.

Sanitizers


I've introduced special preprocessor macros to detect LSan (__SANITIZE_LEAK__) and UBSan (__SANITIZE_UNDEFINED__) in GCC. They were submitted upstream to the GCC mailing list as two patches (LSan + UBSan). Unfortunately, GCC does not see value in feature parity with LLVM, so for the time being this will remain a local NetBSD-specific GCC extension. These macros are now integrated into the NetBSD public system headers, for use by base system software.

The LSan macro is now used inside the LLVM codebase, and the ps(1) program is its first user. The UBSan macro is now used to disable relaxed alignment on x86. While code relying on relaxed alignment is still functional, it is not free of undefined behavior as specified by C. This is especially needed in the kernel fuzzing process, as it reduces noise from less interesting reports.
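
A minimal sketch of how code could consume these macros follows. It assumes the GCC spellings named above together with Clang's __has_feature() checks; the EXAMPLE_* names and helper functions are hypothetical and this is not the actual content of the NetBSD headers.

```c
/* Clang spells these checks __has_feature(...); fall back to 0 elsewhere. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Leak Sanitizer: GCC macro (NetBSD-local per the text) or Clang feature. */
#if defined(__SANITIZE_LEAK__) || __has_feature(leak_sanitizer)
#define EXAMPLE_HAS_LSAN 1
#else
#define EXAMPLE_HAS_LSAN 0
#endif

/* Undefined Behavior Sanitizer: same pattern. */
#if defined(__SANITIZE_UNDEFINED__) || \
    __has_feature(undefined_behavior_sanitizer)
#define EXAMPLE_HAS_UBSAN 1
#else
#define EXAMPLE_HAS_UBSAN 0
#endif

/* Permit relaxed (unaligned) access only on x86 and only without UBSan. */
#if (defined(__i386__) || defined(__x86_64__)) && !EXAMPLE_HAS_UBSAN
#define EXAMPLE_RELAXED_ALIGNMENT 1
#else
#define EXAMPLE_RELAXED_ALIGNMENT 0
#endif

/* Small accessors so the compile-time decisions can be inspected. */
int example_has_lsan(void)          { return EXAMPLE_HAS_LSAN; }
int example_has_ubsan(void)         { return EXAMPLE_HAS_UBSAN; }
int example_relaxed_alignment(void) { return EXAMPLE_RELAXED_ALIGNMENT; }
```

Code that has both a fast path with unaligned loads and a strict path can then select between them with EXAMPLE_RELAXED_ALIGNMENT, so a UBSan build never takes the path that would produce alignment reports.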

During the previous month a number of reports from kernel fuzzing were fixed. There is still more to go.

Almost all local patches needed for LSan were merged upstream. The last remaining local patch is scheduled for later as it is very invasive for all platforms and sanitizers. In the worst case we just have more false negatives in detection of leaks in specific scenarios.

Miscellaneous changes

I have fixed a regression in upstream GDB's SIGTTOU handling. This was an upstream bug, fixed by Alan Hayward and cherry-picked by me. As a side effect of the bug, a certain environment setup would cause the tracer to sleep.

I have reverted the in6_addr change that introduced a regression. The change appeased UBSan, but broke at least qemu networking. The regression was tracked down by Andreas Gustafsson and reported in NetBSD's bug tracking system.

I have landed a patch that returns ELF loader dl_phdr_info information for dl_iterate_phdr(3). This synchronized the behavior with Linux, FreeBSD and OpenBSD and is used by sanitizers.
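
For readers unfamiliar with the interface, here is a small sketch of how a consumer such as a sanitizer runtime might walk the loaded ELF objects with dl_iterate_phdr(3). The struct and function names are hypothetical; the dl_phdr_info fields used (dlpi_phnum, dlpi_phdr) are the standard ones.

```c
#define _GNU_SOURCE /* needed for dl_iterate_phdr() on glibc */
#include <assert.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical summary of what the loader reports. */
struct phdr_stats {
	int objects;   /* loaded ELF objects visited */
	int load_segs; /* PT_LOAD program headers across all objects */
};

/* Callback invoked once per loaded object. */
static int collect_stats(struct dl_phdr_info *info, size_t size, void *data)
{
	struct phdr_stats *st = data;

	(void)size;
	assert(st != NULL);
	st->objects++;
	for (int i = 0; i < info->dlpi_phnum; i++) {
		if (info->dlpi_phdr[i].p_type == PT_LOAD)
			st->load_segs++;
	}
	return 0; /* a non-zero return value would stop the iteration */
}

/* Walk all loaded objects and return the collected counts. */
struct phdr_stats gather_phdr_stats(void)
{
	struct phdr_stats st = { 0, 0 };

	dl_iterate_phdr(collect_stats, &st);
	return st;
}
```

Calling gather_phdr_stats() from any program should report at least the main executable and its loadable segments.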

I have passed the patch to change the kevent::udata type from intptr_t to void* through core@. The former is slightly more pedantic, but the latter is what all other kevent implementations use, and this type mismatch specifically affected C++ users, who needed special NetBSD-only workarounds.
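
The practical difference can be shown with two mock structures. These are stand-ins written for illustration only and are not the real struct kevent from <sys/event.h>; all names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock layouts for illustration; the real struct has more fields. */
struct mock_kevent_intptr { uintptr_t ident; intptr_t udata; };
struct mock_kevent_voidp  { uintptr_t ident; void    *udata; };

/* Hypothetical per-connection state an application attaches to an event. */
struct conn { int fd; };

/* intptr_t udata: the pointer must be cast on the way in and out. */
void attach_old(struct mock_kevent_intptr *ev, struct conn *c)
{
	assert(ev != NULL);
	ev->udata = (intptr_t)c;
}

struct conn *conn_of_old(const struct mock_kevent_intptr *ev)
{
	assert(ev != NULL);
	return (struct conn *)ev->udata;
}

/* void * udata: plain assignment works in C, with no casts at all. */
void attach_new(struct mock_kevent_voidp *ev, struct conn *c)
{
	assert(ev != NULL);
	ev->udata = c;
}

struct conn *conn_of_new(const struct mock_kevent_voidp *ev)
{
	assert(ev != NULL);
	return ev->udata;
}
```

In C++ the intptr_t variant needs a reinterpret_cast in both directions, so code written against the void * layout of other systems would not compile unchanged.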

I have marked git and hg meta files as ones to be ignored by cvs import. They were causing problems for people repackaging the NetBSD source code with VCS software other than CVS.

I keep working on getting the GDB test-suite to run on NetBSD. I spent some time on getting fluent in the TCL programming language (as GDB uses dejagnu and TCL scripting). I have already fixed two bugs in the TCL runtime that affected NetBSD users: getaddrbyname_r and gethostbyaddr_r were falsely reported as available and picked on NetBSD, which broke operation. Fluency in TCL will allow me to be more efficient in addressing and debugging failing tests in GDB, and I can likely reuse this knowledge in other fields useful for the project.

I made __CTASSERT a static assert again. Previously, this homegrown compile-time check silently stopped working for C99 compilers supporting VLAs (variable length arrays). It was caught by kUBSan, which detected a VLA with a dynamic size of -1; such code still compiles but has unspecified runtime semantics. The new form is inspired by the Perl ctassert code and uses a bit-field constant that makes the assert effective again. A few misuses of __CTASSERT, mostly in the Linux DRMKMS code, were fixed.
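
The idea behind the bit-field form can be sketched as follows. The macro below is a simplified, hypothetical illustration of the technique and not the actual __CTASSERT definition from the NetBSD headers.

```c
/* Two-level paste so that __LINE__ is expanded before concatenation. */
#define EXAMPLE_PASTE_(a, b) a##b
#define EXAMPLE_PASTE(a, b)  EXAMPLE_PASTE_(a, b)

/*
 * Array-based form, shown only as a comment:
 *
 *   typedef char old_assert[(cond) ? 1 : -1];
 *
 * Inside a function, a C99 compiler may accept this as a variable length
 * array when cond is not an integer constant expression, so the size of
 * -1 is only evaluated at run time and the check no longer fires at
 * compile time.
 *
 * Bit-field form: the width of a bit-field must be an integer constant
 * expression and must not be negative, so both a non-constant condition
 * and a false condition are rejected by the compiler.
 */
#define EXAMPLE_CTASSERT(cond)                                  \
	typedef struct {                                        \
		unsigned int bit : (cond) ? 1 : -1;             \
	} EXAMPLE_PASTE(example_ctassert_, __LINE__)

/* These hold on every conforming implementation, so the file compiles. */
EXAMPLE_CTASSERT(sizeof(char) == 1);
EXAMPLE_CTASSERT(sizeof(long long) >= sizeof(int));

/* Trivial marker function: reaching it means the asserts above passed. */
int example_ctassert_compiled(void)
{
	return 1;
}
```

Changing one of the conditions to something false, such as sizeof(char) == 2, should turn into a compile-time error about a negative bit-field width.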

I have submitted a proposal to the C Working Group to add new methods for setting and getting the thread name.

Plan for the next milestone

Keep improving the reliability of the debugging interfaces and get the ATF and LLDB threading code to reliably pass tests. Cover more scenarios with ptrace(2) in the ATF regression test-suite.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

