NetBSD Blog

Google Summer of Code 2025 Reports: Asynchronous I/O Framework

August 30, 2025 posted by Leonardo Taccari

This report was written by Ethan Miller as part of Google Summer of Code 2025.

Introduction

The goal of this project is to improve asynchronous IO in NetBSD. The original design pinned a single worker thread to each process; that thread iterated over pending jobs and completed the blocking IO. The logical next step was to support an arbitrary number of worker threads. Each process now has a pool of workers recycled from a freelist, and jobs are grouped per file so that multiple threads do not contend for the same vnode, where they would inevitably serialise on its lock. This grouping also opens the door to future concurrency optimisations. The guiding principle is to keep submission cheap, coalesce work sensibly, and only spawn threads when the kernel would otherwise block.

Project Details and Status

Each process has what is referred to as a service pool, and each service pool can spawn and manage service threads. When a job is enqueued it is handed to its respective service thread. For regular files, jobs that act on the same vnode are coalesced onto one thread. The synchronous IO path within the kernel would lock on that vnode anyway, so spreading these jobs across threads would gain nothing today. The grouping is still prudent: if more advanced concurrency optimisations such as VFS bypass are implemented later, this is precisely the model they would require. At present, since no such optimisation is in place, all IO falls back to the synchronous pipeline. Even so, there are performance gains when working with different files, since synchronous IO can still run on separate vnodes at the same time.
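The per-file grouping described above can be sketched in plain userland C. This is only an illustration under stated assumptions, not the project's kernel code: `struct job`, `struct service_thread`, `struct service_pool`, `pool_enqueue` and the `POOL_MAX_THREADS` cap are all hypothetical names, `file_id` stands in for the vnode, and there are no real threads or locks here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_MAX_THREADS 4	/* illustrative cap, not taken from the project */

/* Hypothetical stand-in for a queued AIO job. */
struct job {
	uintptr_t	 file_id;	/* stands in for the vnode identity */
	struct job	*next;
};

/* Hypothetical service thread: owns every pending job for one file. */
struct service_thread {
	int		 in_use;
	uintptr_t	 file_id;
	size_t		 njobs;
	struct job	*head;
	struct job	*tail;
};

/* Hypothetical per-process service pool. */
struct service_pool {
	struct service_thread threads[POOL_MAX_THREADS];
};

/* Append a job to the tail of a service thread's FIFO queue. */
static void
thread_append(struct service_thread *st, struct job *j)
{
	j->next = NULL;
	if (st->tail == NULL)
		st->head = j;
	else
		st->tail->next = j;
	st->tail = j;
	st->njobs++;
}

/*
 * Route a job to the service thread that already handles its file, or
 * claim an idle slot if there is none. Returns NULL when the pool is full.
 */
static struct service_thread *
pool_enqueue(struct service_pool *sp, struct job *j)
{
	struct service_thread *idle = NULL;

	assert(sp != NULL && j != NULL);
	for (size_t i = 0; i < POOL_MAX_THREADS; i++) {
		struct service_thread *st = &sp->threads[i];

		if (st->in_use && st->file_id == j->file_id) {
			/* Same file: coalesce onto the existing thread. */
			thread_append(st, j);
			return st;
		}
		if (!st->in_use && idle == NULL)
			idle = st;
	}
	if (idle == NULL)
		return NULL;

	idle->in_use = 1;
	idle->file_id = j->file_id;
	idle->njobs = 0;
	idle->head = idle->tail = NULL;
	thread_append(idle, j);
	return idle;
}
```

With a zero-initialised pool, two jobs carrying the same `file_id` land on the same slot in submission order, while a job for a different file claims its own slot. That is the property that lets IO on separate vnodes proceed in parallel.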

Through the traditional VFS read/write path, requests eventually reach bread/bwrite and, on a cache miss, block until the IO completes. This kills concurrency. I considered a solution that bypasses the normal vnode read/write path: translate file offsets to device LBAs with VOP_BMAP, construct block IO at the buffer and device layer, submit it with B_ASYNC, and defer the wait to the AIO layer through biodone bookkeeping instead of calling biowait at submission. This keeps submission short and releases higher-level locks before any device wait. The approach rests on three assumptions: filesystem metadata is frequently accessed and therefore cached, so VOP_BMAP usually does not block; block pointers for an inode mostly remain stable for existing data; and truncation does not rewrite past data. In the average case this would provide concurrency on the same file. In practice, however, it was exceptionally difficult to implement because the block layer lacks the necessary abstractions.
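To make the offset translation step concrete, here is a minimal userland sketch of how a byte range in a file might be cut into per-block device requests. It does not call VOP_BMAP or any kernel interface. `toy_bmap`, `plan_block_io`, `struct blkio`, the flat lookup table and the 4096-byte `FS_BSIZE` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FS_BSIZE 4096u	/* assumed filesystem block size */

/* Hypothetical description of one block-level request. */
struct blkio {
	uint64_t lba;	/* device address of the filesystem block */
	uint32_t skip;	/* byte offset of the data within that block */
	uint32_t len;	/* number of bytes wanted from that block */
};

/*
 * Toy replacement for a block-map lookup: map[] holds the device address
 * of each logical file block. Returns 0 on success, -1 past the map's end.
 */
static int
toy_bmap(const uint64_t *map, size_t nmap, uint64_t lbn, uint64_t *lba)
{
	if (lbn >= nmap)
		return -1;
	*lba = map[lbn];
	return 0;
}

/*
 * Split the byte range [off, off + len) into one request per filesystem
 * block. Returns the number of requests written to out[], or -1 if the
 * range is unmapped or out[] is too small.
 */
static int
plan_block_io(const uint64_t *map, size_t nmap, uint64_t off, size_t len,
    struct blkio *out, size_t nout)
{
	int n = 0;

	while (len > 0) {
		uint64_t lbn = off / FS_BSIZE;
		uint32_t skip = (uint32_t)(off % FS_BSIZE);
		uint32_t chunk = FS_BSIZE - skip;
		uint64_t lba;

		if (chunk > len)
			chunk = (uint32_t)len;
		if ((size_t)n == nout || toy_bmap(map, nmap, lbn, &lba) != 0)
			return -1;

		out[n].lba = lba;
		out[n].skip = skip;
		out[n].len = chunk;
		n++;

		off += chunk;
		len -= chunk;
	}
	return n;
}
```

A read that straddles block boundaries yields one entry per block touched. In the design discussed above, each entry would then be submitted asynchronously and waited for later, rather than blocking at submission.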

This is, however, exactly the approach FreeBSD takes, and it works well there because struct bio is an IO token independent of the page and buffer cache. GEOM can split or clone a bio, queue the pieces to devices, collect child completions, and run the parent callback. Record locks are treated as advisory, so once a bio is in flight the block layer completes it even if the advisory state changes. NetBSD has no equivalent token: struct buf is both a cache object and an IO token, tied to UBC and drivers through biodone and biowait. For now, the implementation of service pools and service threads lays the groundwork for asynchronous IO. Once the BIO layer reaches adequate maturity, integrating a bio-like abstraction will be straightforward and yield immediate improvements for concurrency on the same vnode. The logical next step is to design and port something comparable to FreeBSD's struct bio, which would map very cleanly onto the current POSIX AIO framework.
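The parent and child completion pattern described for struct bio can be sketched as follows. This is not FreeBSD's struct bio or any NetBSD interface. `struct io_token`, `token_attach`, `token_complete` and `note_done` are hypothetical names, and the sketch is single-threaded, so a real implementation would need locking or atomics around the counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IO token that is independent of any cache object. */
struct io_token {
	struct io_token	*parent;	/* NULL for a top-level request */
	unsigned	 pending;	/* children still in flight */
	int		 error;		/* first error reported, 0 if none */
	void		(*done)(struct io_token *);	/* completion callback */
	void		*arg;		/* opaque data for the callback */
};

/* Register child as one in-flight piece of parent. */
static void
token_attach(struct io_token *parent, struct io_token *child)
{
	child->parent = parent;
	child->pending = 0;
	child->error = 0;
	parent->pending++;
}

/*
 * Complete a token. A child records the first error in its parent and,
 * if it was the last outstanding child, fires the parent's callback.
 */
static void
token_complete(struct io_token *t, int error)
{
	struct io_token *p = t->parent;

	if (error != 0 && t->error == 0)
		t->error = error;
	if (t->done != NULL)
		t->done(t);
	if (p == NULL)
		return;

	if (t->error != 0 && p->error == 0)
		p->error = t->error;
	assert(p->pending > 0);
	if (--p->pending == 0 && p->done != NULL)
		p->done(p);
}

/* Example callback: count how many times completion was signalled. */
static void
note_done(struct io_token *t)
{
	int *fired = t->arg;

	(*fired)++;
}
```

The parent callback runs exactly once, after the last child finishes, and sees the first error any child reported. An AIO layer built on such a token could mark the user's request complete from that callback instead of sleeping in the submission path.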

My Development Environment

My development setup is optimised for building and testing quickly. I use scripts to cross-build the kernel and boot it under QEMU with a small FFS root. The kernel boots directly with the QEMU option -kernel, without any supporting bootloader. Early on I tested against a custom init dropped onto an FFS image. Now I do the same, except that init simply launches a shell, which lets me run ATF tests without a full distribution. This makes it possible to compile a new kernel and run tests within seconds.

Lessons

One lesson I have taken away is that progress never happens overnight. It takes enormous effort to get even a few thousand lines of highly multi-threaded, race-prone code to behave consistently under all conditions. Precision in implementation is absolutely required. My impression of NetBSD is that it is a fascinating project with an abundance of seemingly low-hanging fruit. In reality none of it is truly low-hanging or simple, but compared to Linux there remains a great deal of work to be done. It is not easy work, but the problems are visible and the path forward is clearer.

I also intend to provide long-term support for this code should any issues arise.

The code written as part of this project can be found here.

Comments:

A very interesting update. Great work!

Posted by bbartlomiej on August 30, 2025 at 09:44 PM UTC