NetBSD: the first BSD introducing a modern process plugin framework in LLDB
A feature set for debugging NetBSD applications (without threads) has been merged into upstream LLDB! The number of passing tests increased this month from 267/1235 to 622/1247. That is +133% within one month, and approximately 50% of all tests now pass! As usual, regular housekeeping of the ptrace(2) interfaces has been done on the NetBSD side.
During this month I finished the needed Native Process Plugin with breakpoints fully supported. To achieve this I had to fix bugs, add missing features, and debug the debugger itself by sniffing the traffic on the GDB Remote Protocol line. Since NetBSD-8 is approaching, I have also performed the necessary housekeeping on the base system distribution side.
What has been done in NetBSD
I've managed to achieve the following goals:
Clean up in ptrace(2) ATF tests
The current ptrace(2) regression tests carry some maintenance burden. The main issues are code duplication and the split between generic (Machine Independent) and port-specific (Machine Dependent) test files. I've eliminated some of the ballast and merged the tests into the appropriate directory, tests/lib/libc/sys/. The old location (tests/kernel) violated the tests/README recommendation:
    When adding new tests, please try to follow the following conventions.

    1. For library routines, including system calls, the directory structure of
       the tests should follow the directory structure of the real source tree.
       For instance, interfaces available via the C library should follow:

            src/lib/libc/gen -> src/tests/lib/libc/gen
            src/lib/libc/sys -> src/tests/lib/libc/sys
            ...
The ptrace(2) interface lives in src/lib/libc/sys, so there is no reason not to move its tests to their proper home.
PTRACE_FORK on !x86 ports
Prompted by Martin Husemann, we investigated the failing PTRACE_FORK ATF regression tests. These tests turned out not to work on evbarm, alpha, shark, sparc and sparc64, and likely on other non-x86 ports. We found that the SIGTRAP that the child should emit during the fork(2) handshake was missing. The proper order of operations is as follows:
- parent emits SIGTRAP with si_code=TRAP_CHLD and pe_other_pid=pid of forkee
- child emits SIGTRAP with si_code=TRAP_CHLD and pe_other_pid=pid of forker
Only the x86 ports were emitting the second SIGTRAP signal.
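The handshake described above can be modelled as a small check over the stops a tracer observes. This is only a self-contained sketch, not the code of the ATF tests: `struct fork_stop`, `fork_handshake_ok()` and the `MODEL_TRAP_CHLD` placeholder are hypothetical names, and no ptrace(2) call is made. A real tracer would fill in the fields from what the kernel reports after each waitpid(2).

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Placeholder value; the real TRAP_CHLD comes from the NetBSD headers. */
#define MODEL_TRAP_CHLD 4

/* One stop as seen by a tracer (hypothetical type for this sketch). */
struct fork_stop {
	pid_t stopped_pid;	/* process that stopped */
	int   signo;		/* signal that caused the stop */
	int   si_code;		/* si_code attached to that signal */
	pid_t other_pid;	/* peer pid reported with the fork event */
};

/* True if `s` is a fork report emitted by `who` that names `peer`. */
static bool
is_fork_stop(const struct fork_stop *s, pid_t who, pid_t peer)
{
	return s->stopped_pid == who && s->signo == SIGTRAP &&
	    s->si_code == MODEL_TRAP_CHLD && s->other_pid == peer;
}

/*
 * True only when both halves of the handshake were observed:
 * the parent reporting the child, and the child reporting the parent.
 */
bool
fork_handshake_ok(const struct fork_stop *stops, size_t n,
    pid_t parent, pid_t child)
{
	bool from_parent = false, from_child = false;

	for (size_t i = 0; i < n; i++) {
		if (is_fork_stop(&stops[i], parent, child))
			from_parent = true;
		else if (is_fork_stop(&stops[i], child, parent))
			from_child = true;
	}
	return from_parent && from_child;
}
```

In this model, a port that never delivers the second report fails the check no matter how many stops of the parent are recorded, which is the symptom seen on the non-x86 ports.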
The cause has been investigated and narrowed down to the child_return() function in src/sys/arch/x86/x86/syscall.c:
    void
    child_return(void *arg)
    {
            struct lwp *l = arg;
            struct trapframe *tf = l->l_md.md_regs;
            struct proc *p = l->l_proc;

            if (p->p_slflag & PSL_TRACED) {
                    ksiginfo_t ksi;

                    mutex_enter(proc_lock);
                    KSI_INIT_EMPTY(&ksi);
                    ksi.ksi_signo = SIGTRAP;
                    ksi.ksi_lid = l->l_lid;
                    kpsignal(p, &ksi, NULL);
                    mutex_exit(proc_lock);
            }

            X86_TF_RAX(tf) = 0;
            X86_TF_RFLAGS(tf) &= ~PSL_C;

            userret(l);
            ktrsysret(SYS_fork, 0, 0);
    }
This child_return() function was the only one among the implementations for all platforms that contained the code needed to emit SIGTRAP. Martin committed the appropriate fix: we taught our signal routing subsystem to handle early signals from fork(2) calls.
PT_SYSCALL and PT_SYSCALLEMU
Christos Zoulas fixed the misbehavior in tracing syscall entry and syscall exit. A debugger again receives an event when a debuggee is about to enter a syscall, and another when it returns from it. This means that a debugger can read the registers before the kernel code is executed and read them again after the syscall has run, which makes it possible to monitor the exact syscall arguments and the returned values. A debugger can also overwrite the trap frame with new values, which is sometimes useful for debugging.
With the addition of PT_SYSCALLEMU we can implement a virtual kernel syscall monitor, which means that we can fake syscalls from within a debugger. To achieve this, we need to use the PT_SYSCALL operation, catch SIGTRAP with si_code=TRAP_SCE (syscall entry), call PT_SYSCALLEMU, perform in userspace the emulated syscall that the kernel would otherwise have executed, and then call PT_SYSCALL again (si_code=TRAP_SCX).
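The order of requests in such a monitor can be written down as a tiny decision function. This sketch only models the sequence described above and does not call ptrace(2); the enums and `plan_requests()` are hypothetical names standing in for the real TRAP_SCE/TRAP_SCX codes and the PT_SYSCALL/PT_SYSCALLEMU requests.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the stop reasons and requests named above. */
enum stop_reason { STOP_SCE, STOP_SCX };	/* TRAP_SCE, TRAP_SCX */
enum request { REQ_PT_SYSCALL, REQ_PT_SYSCALLEMU };

/*
 * Decide which requests the monitor issues after a stop.
 * Writes them to `out` in order and returns how many were written (1 or 2).
 */
size_t
plan_requests(enum stop_reason why, int emulate, enum request out[2])
{
	size_t n = 0;

	if (why == STOP_SCE && emulate) {
		/*
		 * Ask the kernel to skip the call. The monitor performs the
		 * emulated work itself before resuming the tracee.
		 */
		out[n++] = REQ_PT_SYSCALLEMU;
	}

	/* In every case tracing continues with another PT_SYSCALL. */
	out[n++] = REQ_PT_SYSCALL;
	return n;
}
```

A plain syscall tracer in the style of truss(1) or strace(1) would always pass `emulate = 0` and only inspect the registers at each stop.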
This interface makes it possible to bring truss(1) from FreeBSD and strace(1) from Linux to NetBSD. At least strace(1) was ported in the past, but it's time to refresh that code. Another immediate consumer is of course DTrace/libproc, as this library has facilities to trace system call entry and exit in order to catch fork(2) events. Why catch fork(2)? It can be useful to remove software breakpoints before the address space of the forker is cloned for the forkee, and to reapply them after the operation.
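The remove-and-reapply step can be illustrated on a plain byte buffer. This is a minimal sketch under stated assumptions: the x86 one-byte breakpoint instruction (int3, 0xCC) is used, the buffer stands in for tracee memory that a real debugger would read and write through ptrace(2), and `struct sw_breakpoint` and the `bp_*`/`bps_*` helpers are hypothetical names rather than LLDB or libproc API.

```c
#include <stddef.h>
#include <stdint.h>

#define X86_INT3 0xCC	/* one-byte x86 breakpoint instruction */

struct sw_breakpoint {
	size_t  offset;		/* location inside the memory image */
	uint8_t saved;		/* original byte displaced by the trap */
	int     inserted;	/* non-zero while the trap byte is in place */
};

/* Save the original byte and patch in the trap instruction. */
void
bp_insert(uint8_t *mem, struct sw_breakpoint *bp)
{
	if (bp->inserted)
		return;
	bp->saved = mem[bp->offset];
	mem[bp->offset] = X86_INT3;
	bp->inserted = 1;
}

/* Put the original byte back. */
void
bp_remove(uint8_t *mem, struct sw_breakpoint *bp)
{
	if (!bp->inserted)
		return;
	mem[bp->offset] = bp->saved;
	bp->inserted = 0;
}

/* Called before the fork so that the child's copy carries no traps. */
void
bps_remove_all(uint8_t *mem, struct sw_breakpoint *bps, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bp_remove(mem, &bps[i]);
}

/* Called after the fork to restore the breakpoints in the parent. */
void
bps_insert_all(uint8_t *mem, struct sw_breakpoint *bps, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bp_insert(mem, &bps[i]);
}
```

If the traps were left in place during the fork, the untraced child would inherit the 0xCC bytes and hit a trap instruction with no debugger attached to handle it.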
What has been done in LLDB
A lot of work went into getting breakpoints functional. Pursuing this goal exposed bugs in the existing local patches and revealed missing features that had to be added. My initial test was tracing a dummy hello-world application in C. I sniffed the GDB Remote Protocol packets and compared them between Linux and NetBSD. This helped to streamline both versions and bring the NetBSD support up to the level of Linux.
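When reading such packet dumps it helps to know the framing: a GDB Remote Protocol packet is sent as `$`, the payload, `#`, and a two-digit hexadecimal checksum, where the checksum is the sum of the payload bytes modulo 256. The sketch below frames a payload this way. `rsp_checksum()` and `rsp_frame()` are hypothetical helper names, not LLDB functions, and the sketch does not escape special characters inside the payload.

```c
#include <stddef.h>
#include <stdio.h>

/* Sum of all payload bytes, modulo 256. */
unsigned
rsp_checksum(const char *payload)
{
	unsigned sum = 0;

	for (const unsigned char *p = (const unsigned char *)payload;
	    *p != '\0'; p++)
		sum += *p;
	return sum & 0xff;
}

/*
 * Write "$<payload>#<xx>" into `out`.
 * Returns the packet length, or -1 if `out` is too small.
 */
int
rsp_frame(const char *payload, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "$%s#%02x", payload,
	    rsp_checksum(payload));

	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

For example, the reply `OK` is framed as `$OK#9a`, and the kill request `k` as `$k#6b`.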
As a bonus, the initial code for OpenBSD support was also added to the LLDB tree. At the moment OpenBSD only supports opening core(5) files with a single thread. The same capability was also added for NetBSD.
By the end of March all local patches for LLDB were merged upstream! This made NetBSD one of the first operating systems to use the Native Process Plugin framework with the debugserver capability, alongside Linux and Android. The current FreeBSD support in LLDB is dated, lagging behind, and limited to local debugging. Much of the work to bring FreeBSD on par could well be a case of s/NetBSD/FreeBSD/.
Among the features in the upstreamed NetBSD Process Plugin:
- handling software breakpoints,
- correctly attaching to a tracee,
- supporting NetBSD specific ptrace(2),
- monitoring process termination,
- monitoring SIGTRAP events,
- monitoring SIGSTOP events,
- monitoring other signal events,
- resuming the whole process,
- getting memory region info perms,
- reading memory from tracee,
- writing memory to tracee,
- reading ELF AUXV,
- x86_64 GPR reading and writing,
- detecting debuginfo of the base system programs located in /usr/libdata/debug,
- adding single step support,
- adding execve(2) trap support,
- placeholder for Floating Point Registers code,
- initial code for the NetBSD specific core(5) files,
- enabling ELF Aux Vector reading on the lldb client side,
- enabling QPassSignals feature for NetBSD on the lldb client side,
- enabling ProcessPOSIXLog on NetBSD,
- minor tweaking.
Demo
It's getting rather difficult to present all the features of the NetBSD Process Plugin without making the example overly long. This is why I will restrict it to a very basic debugging session hosted on NetBSD.
    $ cat crashme.c
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {
            int i = argv;

            while (i-- != 0)
                    printf("argv[%d]=%s\n", i, argv[i]);

            return 0;
    }
    $ gcc -w -g -o crashme crashme.c # -w disable all warnings
    $ ./crashme
    Memory fault (core dumped)
    chieftec$ lldb ./crashme
    (lldb) target create "./crashme"
    Current executable set to './crashme' (x86_64).
    (lldb) r
    Process 612 launched: './crashme' (x86_64)
    Process 612 stopped
    * thread #1, stop reason = signal SIGSEGV: address access protected (fault address: 0x7f7ffec88518)
        frame #0: 0x000000000040089c crashme`main(argc=1, argv=0x00007f7fffdd6420) at crashme.c:9
       6            int i = argv;
       7
       8            while (i-- != 0)
    -> 9                    printf("argv[%d]=%s\n", i, argv[i]);
       10
       11           return 0;
       12   }
    (lldb) frame var
    (int) argc = 1
    (char **) argv = 0x00007f7fffdd6420
    (int) i = -2268129
    (lldb) # i looks wrong, checking argv...
    (lldb) p *argv
    (char *) $0 = 0x00007f7fffdd6940 "./crashme"
    (lldb) # set a breakpoint and restart
    (lldb) b main
    Breakpoint 1: where = crashme`main + 15 at crashme.c:6, address = 0x000000000040087f
    (lldb) r
    There is a running process, kill it and restart?: [Y/n] y
    Process 612 exited with status = 6 (0x00000006) got unexpected response to k packet: Sff
    Process 80 launched: './crashme' (x86_64)
    Process 80 stopped
    * thread #1, stop reason = breakpoint 1.1
        frame #0: 0x000000000040087f crashme`main(argc=1, argv=0x00007f7fff3d17c8) at crashme.c:6
       3    int
       4    main(int argc, char **argv)
       5    {
    -> 6            int i = argv;
       7
       8            while (i-- != 0)
       9                    printf("argv[%d]=%s\n", i, argv[i]);
    (lldb) frame var
    (int) argc = 1
    (char **) argv = 0x00007f7fff3d17c8
    (int) i = 0
    (lldb) n
    Process 80 stopped
    * thread #1, stop reason = step over
        frame #0: 0x0000000000400886 crashme`main(argc=1, argv=0x00007f7fff3d17c8) at crashme.c:8
       5    {
       6            int i = argv;
       7
    -> 8            while (i-- != 0)
       9                    printf("argv[%d]=%s\n", i, argv[i]);
       10
       11           return 0;
    (lldb) p i
    (int) $1 = -12773432
    (lldb) # gotcha i = argv, instead of argc!
    (lldb) bt
    * thread #1, stop reason = step over
      * frame #0: 0x0000000000400886 crashme`main(argc=1, argv=0x00007f7fff3d17c8) at crashme.c:8
        frame #1: 0x000000000040078b crashme`___start + 229
    (lldb) disassemble --frame --mixed
       4    main(int argc, char **argv)
    ** 5    {

    crashme`main:
        0x400870 <+0>:  pushq  %rbp
        0x400871 <+1>:  movq   %rsp, %rbp
        0x400874 <+4>:  subq   $0x20, %rsp
        0x400878 <+8>:  movl   %edi, -0x14(%rbp)
        0x40087b <+11>: movq   %rsi, -0x20(%rbp)

    ** 6            int i = argv;
       7

        0x40087f <+15>: movq   -0x20(%rbp), %rax
        0x400883 <+19>: movl   %eax, -0x4(%rbp)

    -> 8            while (i-- != 0)

    ->  0x400886 <+22>: jmp    0x4008b3                  ; <+67> at crashme.c:8

    ** 9                    printf("argv[%d]=%s\n", i, argv[i]);
       10

        0x400888 <+24>: movl   -0x4(%rbp), %eax
        0x40088b <+27>: cltq
        0x40088d <+29>: leaq   (,%rax,8), %rdx
        0x400895 <+37>: movq   -0x20(%rbp), %rax
        0x400899 <+41>: addq   %rdx, %rax
        0x40089c <+44>: movq   (%rax), %rdx
        0x40089f <+47>: movl   -0x4(%rbp), %eax
        0x4008a2 <+50>: movl   %eax, %esi
        0x4008a4 <+52>: movl   $0x400968, %edi           ; imm = 0x400968
        0x4008a9 <+57>: movl   $0x0, %eax
        0x4008ae <+62>: callq  0x400630                  ; symbol stub for: printf

    ** 8            while (i-- != 0)

        0x4008b3 <+67>: movl   -0x4(%rbp), %eax
        0x4008b6 <+70>: leal   -0x1(%rax), %edx
        0x4008b9 <+73>: movl   %edx, -0x4(%rbp)
        0x4008bc <+76>: testl  %eax, %eax
        0x4008be <+78>: jne    0x400888                  ; <+24> at crashme.c:9

    ** 11           return 0;

        0x4008c0 <+80>: movl   $0x0, %eax

    ** 12   }

        0x4008c5 <+85>: leave
        0x4008c6 <+86>: retq
    (lldb) version
    lldb version 5.0.0 (http://llvm.org/svn/llvm-project/lldb/trunk revision 299109)
    (lldb) platform status
      Platform: host
        Triple: x86_64-unknown-netbsd7.99
    OS Version: 7.99.67 (0799006700)
        Kernel: NetBSD 7.99.67 (GENERIC) #1: Mon Apr 3 08:09:29 CEST 2017 root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC
      Hostname: 127.0.0.1
    WorkingDir: /public/lldb_devel
        Kernel: NetBSD
       Release: 7.99.67
       Version: NetBSD 7.99.67 (GENERIC) #1: Mon Apr 3 08:09:29 CEST 2017 root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC
    (lldb) # thank you!
    (lldb) q
    Quitting LLDB will kill one or more processes. Do you really want to proceed: [Y/n] y
Plan for the next milestone
I've listed the following goals for the next milestone.
- watchpoint support,
- floating point register support,
- enhance core(5) and make it work for multiple threads,
- introduce PT_SETSTEP and PT_CLEARSTEP in ptrace(2),
- support threads in the NetBSD Process Plugin,
- research F_GETPATH in fcntl(2).
Beyond the next milestone is x86 32-bit support.
This work was sponsored by The NetBSD Foundation.
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:
http://netbsd.org/donations/#how-to-donate