Forking fixes in the context of debuggers
For the past month I've been mostly working on improving the kernel code in the ptrace(2) API. Additionally, I've prepared support for reading NetBSD/aarch64 core(5) files.
LLDB
I've rebased my old patches for handling NetBSD core(5) files to recent LLDB from HEAD (7.0svn). I had to implement the support for aarch64, as requested by Joerg Sonnenberger for debugging NetBSD/arm64 bugs. The NetBSD case is special here, as we set general purpose registers in 128bit containers, while other OSes prefer 64-bit ones. I had to add a local kludge to cast the interesting 64-bit part of the value.
I've generated a local prebuilt toolchain with lldb prepared for a NetbBSD/amd64 host and shared with developers.
Debug Registers
I've improved the security of Debug Registers, with the following changes:
- x86_dbregs_setup_initdbstate() stores the initial status of Debug Registers, I've explicitly disabled some of their bits for further operation. This register context is used when initializing Debug Registers for a userland thread. This is a paranoid measure used just in case a bootloader uses them and leaves the bits lit up.
- I've added a new sysctl(3) switch security.models.extensions.user_set_dbregs: "Whether unprivileged users may set CPU Debug Registers". The Debug Registers use privileged operations and we decided to add another check, disabling them by default. There is no regress in the elementary functionality and a user can still read the Debug Register context.
- This new sysctl(3) functionality has been covered by ATF checks in existing tests (i386 and amd64 for now).
fork(2) and vfork(2)
I've pushed various changes in the kernel subsystems and ATF userland tests for the ptrace(2) functionality:
- I've integrated all fork(2) and vfork(2) tests into the shared function body. They tests PTRACE_FORK, PTRACE_VFORK and PTRACE_VFORK_DONE. There are now eight fork1..fork8 and eight vfork1..vfork8 tests (total 16). I'm planning to cover several remaining corner case scenarios in the forking code.
- I've removed an unused variable from linux_sys_clone(), which calls fork1(9). This provides an opportunity to refactor the generic code.
- I've refactored start_init() setting inside it the initproc pointer. This eliminates the remaining user of rnewprocp in the fork1(9) API.
- I've removed the rnewprocp argument from fork1(9). This argument in the current implementation could cause use-after-free in corner cases.
- PTRACE_VFORK has been implemented and made functional.
Security hardening
I've prohibited calling PT_TRACE_ME for a process that is either initproc (PID1) or a direct child of PID1.
I've prohibited calling PT_ATTACH from initproc. This shouldn't happen in normal circumstances, but just in case this would lead to invalid branches in the kernel.
With the above alternations, I've removed a bug causing a creation of a process that is not debuggable. It's worth to note that this bug still exists in other popular kernels. A simple reproducer for pre-8.0 and other OSes using ptrace(2) (Linux, most BSDs ...):
$ cat antidebug.c #include <sys/types.h> #include <sys/ptrace.h> #include <stdlib.h> #include <unistd.h> #include <stdio.h> #include <errno.h> int main(int argc, char **argv) { pid_t child; int rv; int n = 0; child = fork(); if (child == 0) { while (getppid() != 1) continue; rv = ptrace(PT_TRACE_ME, 0, 0, 0); if (rv != 0) abort(); printf("Try to detach to me with a debugger!! "); printf("haha My PID is %d\n", getpid()); while (1) { printf("%d\n", n++); sleep(1); } } exit(0); }
Additionally it's no longer valid calling and returning success PT_TRACE_ME when a process is already traced.
These security changes are covered by new ATF ptrace(2) tests:
- traceme_pid1_parent - Assert that a process cannot mark its parent a debugger twice.
- traceme_twice - Verify that PT_TRACE_ME is not allowed when our parent is PID1.
Other ptrace(2) enhancements
A list of other changes:
- I've refactored part of the existing ATF ptrace(2) tests, focusing on code deduplication and covering more corner cases. I'm now especially testing PT_TRACE_ME existing scenarios with signals from various categories: SIGSTOP, SIGCONT, SIGKILL, SIGABRT, SIGHUP. Some signals are non-maskable ones, some generate core(5) files etc. They might use different kernel paths and I'm now covering them in a more uniform and hopefully more maintainable form.
-
In the kernel, I've removed unused branch in the proc_stop_done() and sigswitch() functions.
This functionality was used in deprecated filesystem tracing feature (/proc).
It prevented emitting child signals to a debugging program, namely with the SIGCHLD signal.
The modern solution to perform tracing without signals in a debugger is to spawn a debugging server and outsource the tracing functionality to it. This is done in software like gdb-server, lldb-server etc.
-
I've removed a stray XXX comment from I and D read/write operations, and added a proper explanation:
+ /* + * The I and D separate address space has been inherited from PDP-11. + * The 16-bit UNIX started with a single address space per program, + * but was extended to two 16-bit (2 x 64kb) address spaces. + * + * We no longer maintain this feature in maintained architectures, but + * we keep the API for backward compatiblity. Currently the I and D + * operations are exactly the same and not distinguished in debuggers. + */
- I've improved the WCOREDUMP() checker in the ATF tests, casting it to either 1 or 0. This helps to handler correctly the tests generating core(5) files.
- Improve of the proc_stoptrace() function.
proc_stoptrace() is dedicated for emitting a syscall trap for a debugger,
either on entry or exit of the system function routine.
Changes:
- Change an if() branch of an invalid condition of being traced by initproc (PID1) to KASSERT(9).
- Assert that the current process has set appropriate flags (PSL_TRACED and PSL_SYSCALL).
- Use ktrpoint(KTR_PSIG) and ktrpsig()/e_ktrpsig() in order to register the emitted signal for the ktrace(1) event debugging.
Summary
A critical Problem Report kern/51630 regarding lack of PTRACE_VFORK implementation has been fixed. This means that there are no other unimplemented API calls, but there are still bugs in the existing ones.
With fixes and addition of new test cases, as of today we are passing 961 ptrace(2) tests and skipping 1 (out of 1018 total).
Plan for the next milestone
Cover the remaining forking corner-cases in the context of debuggers with new ATF tests and fix the remaining bugs.
The first step is to implement proper support for handling PT_TRACE_ME-traced scenarios from a vfork(2)ed child. Next I plan to keep covering the corner cases of the forking code and finish this by removal of subtle bugs that are still left in the code since the SMP'ification.
This work was sponsored by The NetBSD Foundation.
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:
http://netbsd.org/donations/#how-to-donate [0 comments]