NetBSD Blog

Annual General Meeting 2026

2026-06-06T16:01:52+00:00

Today, the NetBSD Foundation had an open annual general meeting in a public IRC channel. It began with presentations, and was followed by a Q&A session where we took questions from the public. Here's the full log.

<leot> OK, we are about to start... sorry for the delay!
-!- mode/#netbsd-agm [+m] by leot
-!- mode/#netbsd-agm [+o Cryo] by leot
-!- mode/#netbsd-agm [+v Cryo] by leot
 * Cryo turns the lights down
<leot> Hello everyone!
<Cryo> I'll start off by thanking you for coming.
<Cryo> and handing it to leot!
<leot> Thanks Cryo and thanks for coming!
<leot> .
<leot> Welcome to The NetBSD Foundation Annual General Meeting 2026!
<leot> .
<leot> In the agenda we will have reports from:
<leot> .
<leot> - board (<billc>)
<leot> - core (<kre>)
<leot> - admins (<spz>)
<leot> - finance-exec (<riastradh>)
<leot> - membership-exec (<martin>)
<leot> - releng (<martin>)
<leot> - security-team (<martin>)
<leot> - pkgsrc-pmc (<wiz>)
<leot> - pkgsrc-security (<tm>)
<leot> .
<leot> If there are any last-minute additions please /msg me!
<leot> .
<leot> The Q&A session will be at the end of all the presentations.
<leot> .
<leot> When Q&A begins please /msg me with "I have a question for <team>" or
<leot> "I have a question for <nick>" and I will give you voice when it is
<leot> your turn.
<leot> .

<leot> Next presentation is prepared by <billc> and board@ for board!
<leot> I will present on <billc>'s behalf
<leot> .

<leot> Welcome to the 24th Annual General Meeting of The NetBSD Foundation.
<leot> .
<leot> 2025 progress:
<leot> - Recent stable releases include NetBSD 10.1 (Dec 2024), 9.4, and 9.3
<leot> - Development is currently focused on the imminent transition to NetBSD-11 [RC4]
<leot> .
<leot> We are preparing for:
<leot> - BSDcan in Ottawa, Canada
<leot> - The ISF Common Good Cyber Fund (CGCF) application window, which runs
<leot>   from June 23 to August 4, 2026.
<leot> .
<leot> We recognize that different avenues may well be available to us regarding grants
<leot> and funding, and we are looking for volunteers to help us investigate, apply,
<leot> and deliver for the programs available. This includes, but is not limited to,
<leot> potential opportunities from the Internet Society, the Linux Foundation through
<leot> Alpha-Omega, Germany's Sovereign Tech Agency and Prototype Fund, or grants from
<leot> the European Union through NLnet.
<leot> .
<leot> The NetBSD Foundation Board of Directors presents a consolidated list
<leot> of the relevant and major actions that occurred since last AGM.
<leot> Quite a few discussions, actions, and follow-ups crossed multiple meetings.
<leot> Very few meetings resulted in not reaching quorum.
<leot> During this period, new director(s) were elected by the members and
<leot> officers were renewed or installed.
<leot> We continued with our Bronze level sponsorship support of BSDcan,
<leot> AsiaBSDcon, and EuroBSDcon to improve our representation at conferences
<leot> and developer summits.
<leot> .
<leot> We participated in the Google Summer of Code for 2025 and we attended
<leot> the Google Summer of Code Mentor Summit in Munich, Germany.
<leot> We are currently participating in GSoC this year with 5 students!
<leot> .
<leot> For 2025, these are the projects that passed:
<leot> - Asynchronous I/O Framework
<leot> - Using bubblewrap to add sandboxing to NetBSD
<leot> - Enhancing Support for NAT64 Protocol Translation in NetBSD
<leot> .
<leot> For 2026, these projects have been chosen:
<leot> - Improving and Stabilizing the racoon2 IKE Daemon in NetBSD
<leot> - Port the Enlightenment desktop environment to NetBSD
<leot> - improving RAIDframe
<leot> - Testing Compat Linux: Syscall testing
<leot> - Convert a Wi-Fi driver to the new Wi-Fi stack
<leot> .
<leot> We continued to improve our interaction and relationships with
<leot> vendors, as well as participating in industry PSIRT/CSIRT
<leot> with commercial vendors and other open-source projects.
<leot> .
<leot> We successfully completed the large-scale migration of our repository
<leot> infrastructure from CVS to a Git/Mercurial ecosystem, including the
<leot> launch of live hgweb and gitweb test environments.
<leot> .
<leot> We also advanced our security and compliance posture by initiating CNA (CVE Numbering
<leot> Authority) onboarding with MITRE and ensuring readiness for the EU Cyber
<leot> Resilience Act (CRA).
<leot> .
<leot> We also implemented "Anti-Slop" protocols to protect codebase integrity
<leot> against code not written by humans.
<leot> .
<leot> The funded contracts continued for:
<leot> - improvements in release engineering
<leot> .
<leot> We are 12% through a fundraising campaign. *Please* consider
<leot> donating, as we are a US IRS 501(c)(3) charitable organization.
<leot> .
-!- mode/#netbsd-agm [+v krelz] by leot
<leot> EOF

<leot> Next in the agenda we have... core@ presentation by <kre>! krelz, please go ahead!

<krelz> Hi everyone, before I begin, any other core members who want to add something
<krelz> to what I am about to present, msg leot and I'm sure you can be snuck in when I
<krelz> am done, which won't take long...
<krelz> .
<krelz> Report from core for 2026 NetBSD AGM
<krelz> .
<krelz> core is tasked with technical management of the NetBSD project.
<krelz> .
<krelz> The current members of core are:
<krelz> .
<krelz>         Christos Zoulas         christos@
<krelz>         Chuck Silvers           chs@
<krelz>         Robert Elz              kre@
<krelz>         Martin Husemann         martin@
<krelz>         Matthew Green           mrg@
<krelz>         Taylor R Campbell       riastradh@
<krelz>         Rin Okuyama             rin@
<krelz> .
<krelz> Actual technical management is difficult in a volunteer project,
<krelz> as developers work on whatever interests them.   One aspect
<krelz> which is sometimes important is in settling disputes between
<krelz> developers.   Fortunately there was only one such dispute in
<krelz> the past year, which was easily and amicably settled.
<krelz> .
<krelz> Core doesn't hold regular formal meetings, issues are discussed
<krelz> when they arise, otherwise we're mostly fairly dormant.
<krelz> .
<krelz> core, as a group, can be reached at core@netbsd.org
<krelz> .
<krelz> That's it for me, for this year's core report, I will be here for questions later
<krelz> .

-!- mode/#netbsd-agm [-v krelz] by leot
<leot> Thank you kre!
<leot> Next in the agenda... we have the admins@ presentation from spz! Please go ahead!
-!- mode/#netbsd-agm [+v spz] by leot

<spz> good localtime() all
<spz> ,
<spz> admins is the following people:
<spz> christos, dogcow, kim, mspo, phil, riastradh, riz, seb, soda, spz, tls
<spz> ,
<spz> Statistics:
<spz> - admins runs the following TNF systems:
<spz> @ TastyLime
<spz> + 8 hardware systems, 6 'regular' Xen guests and 3 repotest Xen guests
<spz> = 1 earmv7hf, the rest amd64
<spz> - public services, the repo(s), sundry
<spz> @ AOA
<spz> + 6 hardware systems
<spz> = all amd64
<spz> - the NetBSD build farm
<spz> @ Washington University
<spz> + 7 hardware systems
<spz> = 2 aarch64 and the rest amd64
<spz> - two pkg builders, the repo conversion and a CI system, sundry
<spz> @ Regensburg
<spz> + 2 hardware systems, one of them with 2 Xen guests
<spz> = all amd64 (+ a sparc64 serving consoles)
<spz> - the offsite backup, archive, wip.pkgsrc.org and a CI system
<spz> ,
<spz> - CDN services donated by Fastly
<spz> - Housing donated by TastyLime, Two Sigma, WWU, and spz
<spz> ,
<spz> NetBSD versions in use:
<spz> 6   10.0_STABLE (1 earmv7hf, 1 aarch64, 4 amd64)
<spz> 6   10.1 (5 amd64)
<spz> 13  10.1_STABLE (13 amd64)
<spz> 1   11.0_RC3 (amd64)
<spz> 2   11.0_RC4 (aarch64, amd64)
<spz> ,
<spz> Changes:
<spz> Riastradh spent even more time on the mail system so we can still send
<spz> mail to Google mail accounts.
<spz> Also Riastradh has been developing the future reposerver setup.
<spz> ,
<spz> Notable issues:
<spz> - spam suppression by technical means, which makes life hard(er) for
<spz>   legitimate mailing lists and hasn't stopped spammers (or phishing) yet.
<spz> - LLM scraping. Anti-social "all your resources are belong to us",
<spz>   total disregard for robots.txt, and backed by lots of money
<spz>   so they can buy all the shady "residential proxies" they want,
<spz>   so IP blacklists aren't feasible. Their capacity to scrape
<spz>   vastly outnumbers our capacity to serve, they are very aggressive,
<spz>   so the chance of a human getting to use the wip.pkgsrc.org website
<spz>   is slim. And to add insult to injury they could just download the repo
<spz>   instead of diffing every possible version of every file against every
<spz>   other version via the web interface:
<spz>   the point is they don't want to use resources carefully, because that
<spz>   would require thought and LLMs are all about not expending that.
<spz>   (if you detect some foaming at the mouth here: yes. aarrrggghhh!)
<spz> ,
<spz>   We are very sorry but we'll have to add countermeasures like for
<spz>   archive.NetBSD.org, if possibly not the same, or shut the
<spz>   web interface down like we did with the cvsweb access to wikisrc.
<spz> ,
<spz>   The other NetBSD web sites survive thanks to having a limited
<spz>   number of links (the scrapers only visit each one twice a day),
<spz>   and CDN caching.
<spz> - hardware aging at TastyLime, both the TNF servers and the network
<spz>   equipment. The latter is being dealt with; there will be downtime
<spz>   for it sometime soon. The former suffers from requiring half a week's time
<spz>   on-site and roughly two weeks from off-site to get anything working
<spz>   properly again, and the activation energy required to do that is a lot.
<spz> - the perennially full /pub/pkgsrc/packages on ftp.NetBSD.org.
<spz>   we have a plan, it "just" needs implementing.
<spz> ,
<spz> We often get asked:
<spz> - why don't you use a Cloud provider or rent servers instead:
<spz>         + did that with the offsite backup server, the provider ceased
<spz>         operations and just shut everything off: our data? sucks to be us.
<spz>         If we own the server it might get switched off, but we could get
<spz>         it (and thus our data) back
<spz>         + but you could have backups: having 50TB in total backed up and
<spz>         not paying an arm and a leg for retrieval and making certain the
<spz>         backup provider isn't going funny is either expensive or difficult
<spz>         + in the long run renting servers is not cheaper if they actually
<spz>         are busy all the time
<spz>         + we should always consider if this (whatever this) is a good use
<spz>         of TNF funds.
<spz> - we could sponsor you a server
<spz>         Thanks, kind of you to offer. However:
<spz>         If it's just one server we couldn't do OS updates. Having IPMI
<spz>         on the open Internet for console access is not a good security
<spz>         stance. Thus we are at a server and a console server, and having
<spz>         a console server and several servers just scales better.
<spz>         Plus we'd like to have at least one member of admins in viable
<spz>         site visit distance and an expectation of duration: site moves
<spz>         aren't much less work than hardware renewals.
<spz> ,
<spz> Thanks to riz, tls and phil for their resources, time
<spz> and blood sacrifices, too. :}
<spz> ,
<spz> Back to moderator.

-!- mode/#netbsd-agm [-v spz] by leot
<leot> Thank you very much spz!
<leot> Next in the agenda we have... Riastradh with the finance-exec@ presentation!
-!- mode/#netbsd-agm [+v Riastradh] by leot

<Riastradh> Hi, folks!
<Riastradh> Finance-exec hoards the cash, keeps the books, sends
<Riastradh> thank-you notes to donors, and pays out contracts and
<Riastradh> reimbursements.
<Riastradh> .
<Riastradh> We are:
<Riastradh> - christos (Christos Zoulas)
<Riastradh> - reed (Jeremy C Reed)
<Riastradh> - riastradh (Taylor R Campbell)
<Riastradh> .
<Riastradh> The NetBSD Foundation's public 2025 financial report is at:
<Riastradh> https://www.NetBSD.org/foundation/reports/financial/2025.html
<Riastradh> We produce this from an internal ledger maintained with
<Riastradh> ledger(1) <https://www.ledger-cli.org/>.
<Riastradh> .
<Riastradh> Highlights:
<Riastradh> - We have net assets of a little over 400k USD as of today
<Riastradh>   (we received a large donation in 2026).
<Riastradh> - In 2025, we received about 80k USD -- far surpassing our
<Riastradh>   usual donation target of 50k USD!
<Riastradh> - We spent 21k USD, mainly on:
<Riastradh>   o supporting conferences and sending developers to them
<Riastradh>   o release engineering
<Riastradh> .
<Riastradh> That was a lot more income and a lot less expenses than we
<Riastradh> usually have.  But forecasting:
<Riastradh> - We expect to purchase some more hardware replacements this
<Riastradh>   year, and components like RAM have gotten much more
<Riastradh>   expensive recently.
<Riastradh> - We have more funds for funded projects now, and while core
<Riastradh>   or pkgsrc-pmc directs the funds, they're really driven by
<Riastradh>   the developer proposals that are available -- so if you
<Riastradh>   want to work on a funded project, send a proposal!
<Riastradh> .
<Riastradh> Happy to answer any questions about what finance-exec does,
<Riastradh> or swap notes on using ledger(1)!
<Riastradh> Thanks,
<Riastradh> -Riastradh, on behalf of finance-exec

<leot> Thanks a lot Riastradh!
<leot> Next presentation is from <martin> with the membership-exec@ presentation!
-!- mode/#netbsd-agm [+v __martin] by leot

<__martin> thanks
<__martin> The current members of membership-exec are:
<__martin> - Christos Zoulas <christos>
<__martin> - Martin Husemann <martin>
<__martin> - Lex Wennmacher <wennmach>
<__martin> - Thomas Klausner <wiz>, and
<__martin> - Ken Hornstein <kenh> who is on sabbatical.
<__martin>  -
<__martin> Membership-exec is responsible for all aspects of
<__martin> "membership", but in practice the main task is to handle
<__martin> membership applications. The number of active developers
<__martin> (as of 2026-06-06) is 138. Note that this number is a
<__martin> bit outdated, as the membership activity validation process
<__martin> required for the board election has not yet happened.
<__martin>  -
<__martin> Since the last AGM on 2025-05-17 we gained only 5 new
<__martin> developers, which is (again) way too few. We need to invite
<__martin> more people, please help active users and encourage them to
<__martin> apply.
<__martin>  -
<__martin> The difference between developers and active developers
<__martin> is explained in the bylaws - an active developer has
<__martin> actually committed something in the last year, or contributed
<__martin> in an active way, like admins.
<__martin>  -
<__martin> We'd like to emphasize that we appreciate all your replies
<__martin> to our membership RFC e-mails, although we do not usually
<__martin> acknowledge them. Please keep on providing feedback to
<__martin> the RFC mails.
<__martin> thanks, back to moderator

<leot> Thank you Martin!
<leot> Next presentation... again from Martin but this time with the releng@ hat! :) Please go ahead __martin!

<__martin> hi again
<__martin> We are:
<__martin> abs agc bouyer he jdc martin msaitoh phil reed riz
<__martin> sborrill snj
<__martin>    -
<__martin> Since the last meeting, we have:
<__martin>  o Branched netbsd-11.
<__martin>  o Not released any formal release (only four release
<__martin>    candidates for 11.0).
<__martin>  o Processed hundreds of pullup requests.
<__martin>  o Streamlined the process of cutting a release.
<__martin>    -
<__martin> Currently we are about to release the fifth (and
<__martin> this time definitively last) release candidate for 11.0.
<__martin> 11.0 has had bad luck with security updates of 3rd party
<__martin> components last minute and slow progress on making
<__martin> these components updatable on release branches
<__martin> (like libssh moving to /usr/lib/private/).
<__martin>    -
<__martin> We have only two issues open for 11.0:
<__martin>  (1) the missing unbound import (catch-up to current)
<__martin>  (2) a new expat release that has not made it into
<__martin>      -current, but fixes a few security issues
<__martin>    -
<__martin> Volunteers are welcome to help with both - please
<__martin> contact me directly if you have some time to help.
<__martin>    -
<__martin> I hope to cut RC5 later this weekend or early next week,
<__martin> and then the final release maybe 10 days later. If one
<__martin> of the above items does not make it in, so be it.
<__martin>    -
<__martin> We have streamlined the process of actually cutting a
<__martin> release (or release candidate) and admins made it possible
<__martin> to completely stay out of this process now. Only one releng
<__martin> member and one security-office member are needed now.
<__martin>    -
<__martin> A release still takes realistically slightly less than 24h
<__martin> wall clock time, the biggest time consumers are all fully
<__martin> automated: 4h build time, 6h network transfer to ftp, 1h
<__martin> generating hashes. Plus various minor manual things like
<__martin> editing the web page and posting the release announcement.
<__martin>    -
<__martin> We are still processing a huge amount of pullups.
<__martin> This is only possible because developers take the time
<__martin> to test their changes on the branch and submit a
<__martin> pullup request. We have been pretty good with this,
<__martin> and pulled up lots of security and usability
<__martin> improvements, as well as bug fixes to the various
<__martin> active branches. This is good for our users, thank you
<__martin> to everyone who cared and made it possible.
<__martin>    -
<__martin> The following paragraph is (unfortunately) a verbatim
<__martin> copy from last year - and still valid.
<__martin>    -
<__martin> The biggest current issue is the over-aged netbsd-9 branch.
<__martin> We need to get the NetBSD 11 release out ASAP to be
<__martin> able to move NetBSD 9.x out of support.
<__martin>    -
<__martin> After the 11.0 release (and probably the repository switch)
<__martin> I plan to start a discussion about rules and processes,
<__martin> trying to make the time from branching to first release way
<__martin> smaller. A slow release cycle is not that bad overall (IMO)
<__martin> but a year long delay between branching and first release
<__martin> is clearly wrong.
<__martin>    -
<__martin> That is all from release engineering for this year, we are
<__martin> hoping to have a list of several formal releases in next
<__martin> year's report and also be close to the final release of
<__martin> 12.0.

<leot> Thank you Martin!
<leot> Next in the agenda we have... Again Martin, but with the security-team@ hat! Feel free to go ahead __martin!

<__martin> This is a brief report for security-team.
<__martin>  -
<__martin> We are: agc billc cherry christos chs cyber hgutch joerg js
<__martin> kre martin maya mrg riastradh rin shm spz
<__martin>  -
<__martin> Since last AGM we have not published any security
<__martin> advisories. We have fixed (and pulled up) one issue that
<__martin> has an SA pending, but it has not been finalized.
<__martin>  -
<__martin> There have been numerous bug fixes applied to the tree, and
<__martin> pulled up to NetBSD-9, NetBSD-10 and NetBSD-11 release
<__martin> branches. We also have updated lots of 3rd party components
<__martin> in the tree when they had new releases fixing security
<__martin> issues. Right now only the expat library needs an update in
<__martin> -current.
<__martin>  -
<__martin> Most security work goes on "behind the scenes" and we
<__martin> usually concur with requests from reporters for a specific
<__martin> publication date.
<__martin>  -
<__martin> Where needed we also involve NetBSD developers outside the
<__martin> team when special expertise is needed. While we try to
<__martin> assess all reported issues timely, we sometimes struggle
<__martin> with doing so. Currently we have (if I did not miscount)
<__martin> two open reports that need to be addressed.
<__martin>  -
<__martin> To improve our own process, becoming more reliable and more
<__martin> transparent we are currently applying to become a CNA (CVE
<__martin> Numbering Authority). This will allow us to assign and publish
<__martin> our own CVE records. The process forces us to have public
<__martin> statements of response times and processes for issues
<__martin> reported to us. We might need to introduce a ticket system
<__martin> to help with doing timely responses.
<__martin>  -
<__martin> NetBSD continues to be represented in a product security
<__martin> incident response working group with other operating system
<__martin> vendors, as well as a direct contact team with other BSD
<__martin> projects. This framework allows us to work better with
<__martin> vendors requiring an embargoed and/or coordinated release
<__martin> with other operating systems. We can begin working on
<__martin> issues that affect NetBSD much faster, instead of only
<__martin> being notified after an embargo is lifted. We are expanding
<__martin> the number of vendors as time goes on, as well as
<__martin> participating in FIRST.
<__martin>  -
<__martin> This is teaching us quite a bit of where we need to
<__martin> improve our process, which is currently on-going.
<__martin>  -
<__martin> Thanks to everyone helping with security issues!

-!- mode/#netbsd-agm [-v __martin] by leot
<leot> Thank you very much Martin!
<leot> Next in the agenda we have... pkgsrc-pmc presentation, written by <wiz>!
<leot> Unfortunately <wiz> could not attend the AGM so I will present it

<leot> The pkgsrc team kept thousands of packages in pkgsrc up to date and in
<leot> good working order, and delivered four -- the 87th through 90th --
<leot> stable branches. Great work, and thank you to bsiegert@ and maya@ for
<leot> handling the branches!
<leot> .
<leot> The pkgsrc team has welcomed one new developer, kikadf, who takes good
<leot> care of chromium and wayland.
<leot> .
<leot> The current roster is:
<leot> - agc (emeritus member)
<leot> - dholland (board representative)
<leot> - schmonz
<leot> - wiz
<leot> .
<leot> Thank you for working on pkgsrc!!
<leot> -- wiz, for pkgsrc-pmc
<leot> Thanks!

<leot> Next in the agenda... we have pkgsrc-security@ presentation, prepared by <tm>!
<leot> He's only online via a mobile, so I will present it!

<leot> The mission of the pkgsrc Security Team is to ensure that the ever-growing
<leot> ecosystem of third party software is either safe to use or, at least, that
<leot> people are aware of the known vulnerabilities.
<leot>         -
<leot> Our members monitor publicly available vulnerability feeds, mainly CVE.
<leot>         -
<leot> We aggregate received advisories believed to impact pkgsrc into the pkgsrc
<leot> vulnerability list. When time allows we try to notify individual package
<leot> MAINTAINERs and locate, commit patches to fix the vulnerabilities.
<leot>         -
<leot> Since 2021 our ticket handling crew has been only 2 people, so we are unfortunately
<leot> pretty understaffed. We are looking for and welcome people volunteering to join
<leot> us!
<leot>         -
<leot> Currently handling tickets are:
<leot>  - Leonardo Taccari <leot>
<leot>  - Thomas Merkel <tm>
<leot>         -
<leot> The other current members of the team are:
<leot>  - Thomas Klausner <wiz>
<leot>  - Tobias Nygren <tnn>
<leot>  - Tim Zingelman <tez>
<leot>         -
<leot> The year in numbers:
<leot> In 2024, the vulnerability list had 9482 lines added to it (8967 more than last
<leot> year) for a total of 30231 known vulnerabilities.
<leot> In 2025, the ticket queue received 50050 new advisories (9330 more than last
<leot> year). Of these 50050 new advisories:
<leot>  new:        302 ( 0.6%) (not able to handle in 2025)
<leot>  stalled:      0 ( 0.0%)
<leot>  resolved:  1697 ( 3.4%) (affecting pkgsrc packages)
<leot>  rejected: 48051 (96.0%) (no impact or duplicates)
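The percentages in the breakdown above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the counts are copied from the report, while the variable and function names are made up for the example.

```python
# Ticket counts as reported for the 2025 pkgsrc-security queue.
total = 50050
outcomes = {
    "new": 302,
    "stalled": 0,
    "resolved": 1697,
    "rejected": 48051,
}

# The four outcomes should account for every advisory received.
assert sum(outcomes.values()) == total


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# Hypothetical helper mapping: outcome name -> percentage of the queue.
shares = {name: share(count, total) for name, count in outcomes.items()}

for name, count in outcomes.items():
    print(f"{name:>9}: {count:5d} ({shares[name]:4.1f}%)")
```

Seen this way, roughly 1 in 30 incoming advisories ends up affecting a pkgsrc package, which is why most of the handling effort goes into triage.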
<leot>         -
<leot> Zafer Aydogan <zafer> also joined the pkgsrc-security rotation list for several
<leot> months in 2025 and helped us. Thanks Zafer!
<leot>         -
<leot> The current count of vulnerable packages in pkgsrc-current is 787 (138 more
<leot> than last year), in pkgsrc-stable is 809 (144 more than last year).
<leot> See the periodic email to packages@NetBSD.org for the list.
<leot> But we've 3548 vulnerabilities to review!
<leot> We can always use help locating and committing security patches, in particular
<leot> for the many of these that are maintained by pkgsrc-users.
<leot>         -
<leot> We encourage all developers to help us keep the vulnerability list up-to-date.
<leot> If you become aware of a security issue or perform a security update in pkgsrc
<leot> please edit the list. You don't need any special privilege for this.
<leot> You'll find the list in the pkgsrc CVS repository:
<leot>  pkgsrc/doc/pkg-vulnerabilities
<leot>         -
<leot> Please join the pkgsrc Security ticket handling crew, we're pretty understaffed
<leot> at the moment! Feel free to get in touch with us for additional details or an
<leot> introduction.
<leot>         -
<leot> EOF

<leot> Thank you very much <tm>!
<leot> We have another presentation that was not in the agenda!
<leot> There is a gnats@ presentation by <dholland>!
-!- mode/#netbsd-agm [+v nbdholland] by leot
<leot> Feel free to go ahead David!

<nbdholland> (This got held up by a schmozzle yesterday. Thanks to riastradh@ for running my dodgy scripts for me.)
<nbdholland>  
<nbdholland> Here's the bug database report since the last AGM (12 months):
<nbdholland>  
<nbdholland> GNATS statistics for 2025 (as of June  6 2026)
<nbdholland>  
<nbdholland> New PRs this year: 880, of which 578 are still open.
<nbdholland> Closed PRs this year: 445. Net change: +435. 
<nbdholland> Total PRs touched this year: 946.
<nbdholland> Oldest PR touched this year: 5514.
<nbdholland> Oldest open PR: 1677; PR ignored for the longest: 4691.
<nbdholland>  
<nbdholland> Total number open: 7313
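As a rough cross-check of the PR figures above, the net change and the split of closed PRs can be derived from the reported counts. This sketch assumes that a new PR not listed as "still open" was closed within the same period; GNATS has other states, so the split is approximate. All variable names are illustrative.

```python
# Figures quoted in the GNATS report above.
new_prs = 880          # PRs filed during the period
new_still_open = 578   # of those, still open at report time
closed_prs = 445       # PRs closed during the period (any age)

# Net growth of the backlog: everything filed minus everything closed.
net_change = new_prs - closed_prs

# Assumption: a new PR that is not "still open" was closed in the period.
new_closed = new_prs - new_still_open

# Whatever else was closed must have been filed before the period began.
older_closed = closed_prs - new_closed

print(f"net change:         {net_change:+d}")
print(f"new PRs closed:     {new_closed}")
print(f"older PRs closed:   {older_closed}")
```

Under that assumption, about a third of the PRs closed during the period were older than the period itself.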
<nbdholland>  
<nbdholland> (Recall that this isn't github: in NetBSD "PR" means "problem report",
<nbdholland> not "pull request".)
<nbdholland>  
<nbdholland> This is the weekly plot:
<nbdholland>  
<nbdholland>                                                        * 6900
<nbdholland>                                                   ******
<nbdholland>                                               **********
<nbdholland>                                             ************
<nbdholland>                                       ******************
<nbdholland>                                   **********************
<nbdholland>                 ******** *******************************
<nbdholland>              *******************************************
<nbdholland>       **************************************************
<nbdholland>    ***************************************************** 6360
<nbdholland>  
<nbdholland> If anyone was wondering, the oldest open PR (PR 1677) is about a
<nbdholland> panic in unionfs. This is unfortunately still current. The most
<nbdholland> untouched PR (PR 4691) is about ECC memory handling on sun3.
<nbdholland>  
<nbdholland> Unfortunately, we seem to have reverted to our old pattern of an ever-increasing backlog.
<nbdholland>  
<nbdholland> Anyhow, here are the people who've been fixing the most bugs, as
<nbdholland> counted by commit messages found in PRs closed during the year.
<nbdholland>  
<nbdholland>   10  skrll@netbsd.org
<nbdholland>   14  martin@netbsd.org
<nbdholland>   15  bsiegert@netbsd.org
<nbdholland>   18  gutteridge@netbsd.org
<nbdholland>   23  jkoshy@netbsd.org
<nbdholland>   27  kre@netbsd.org
<nbdholland>   29  dmcmahill@netbsd.org
<nbdholland>   50  nia@netbsd.org
<nbdholland>   53  wiz@netbsd.org
<nbdholland>  105  riastradh@netbsd.org
<nbdholland>  
<nbdholland> This list always has a very long tail, and the difference between
<nbdholland> being on it and not is only one commit. This year there were 55 people
<nbdholland> who fixed or helped fix at least one bug report, down a bit from last
<nbdholland> year. Thanks to one and all.
<nbdholland>  
<nbdholland> And here are those who've been processing pullups for bugs, according
<nbdholland> to the same analysis:
<nbdholland>  
<nbdholland>    1  snj@netbsd.org (releng)
<nbdholland>    2  bsiegert@netbsd.org (releng)
<nbdholland>    2  jdc@netbsd.org (releng)
<nbdholland>    4  bouyer@netbsd.org (releng)
<nbdholland>   16  maya@netbsd.org (releng)
<nbdholland>  127  martin@netbsd.org (releng)
<nbdholland>  
<nbdholland> Note that this reflects pullups specifically linked into gnats, not
<nbdholland> all releng work. Nonetheless, it remains heavily skewed. Many, many,
<nbdholland> many thanks, Martin.
<nbdholland>  
<nbdholland> <eot>
<leot> Thank you very much nbdholland!
-!- mode/#netbsd-agm [-v nbdholland] by leot

<leot> Now we can start the Q&A session.
<leot> I have at least 2 questions already in the queue
<leot> If you have more questions, feel free to /msg me with possible <team> / <nick> that may answer the question and I will voice you when it's your turn
-!- mode/#netbsd-agm [+v racoon] by leot
<leot> racoon has some questions for admins@! racoon, feel free to go ahead!
<leot> (admins@ feel free to /msg me if you can answer their questions!)

<racoon> hello netbsd, hello admins
<racoon> my first question is whether it's possible to whitelist netbsd ftp(1) so that it doesn't need to pass a challenge to download files from archive.netbsd.org. my own experience of AI scrapers would say that's an unusual user agent to scrape, but i don't know how heinous they are. i'd like to do e.g. automated fetches of old distfiles
<racoon> *unusual user agent to fake
<Riastradh> racoon: the captcha in there is a temporary workaround, we might deploy anubis or something in the near future
<spz> if ftp has a recognisable user agent that might actually be a great idea
<racoon> my second question is whether it's possible that more hardware might be moved to e.g. germany, japan in the future, so that we're less centralized in the US
<Riastradh> racoon: Some of the hardware is in Germany already!
<spz> it's easier for TNF to buy stuff in the US, typically cheaper too. We'll have to think about it.
<Riastradh> (we are already running anubis on https://hgweb.test.netbsd.org and https://gitweb.test.netbsd.org/, just haven't deployed it or anything comparable on other services yet)
<racoon> thank you
<Riastradh> racoon: We would need a rack to do it, with enough machines to make it worthwhile to maintain there.  If you have a rack to offer, we could arrange that!
<leot> Thanks racoon, spz and Riastradh!
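The allowlisting idea discussed above could look roughly like the sketch below. It is not how archive.NetBSD.org is configured. The "NetBSD-ftp/" prefix is an assumption about what ftp(1) sends, and the function and constant names are hypothetical. A User-Agent header is trivial to fake, so this only helps while scrapers do not bother to imitate it.

```python
# Assumed User-Agent prefix for NetBSD ftp(1); verify before relying on it.
ALLOWED_PREFIXES = ("NetBSD-ftp/",)


def needs_challenge(headers):
    """Return True if the request should be sent through the challenge.

    `headers` is a plain dict of request headers (hypothetical interface).
    Requests with no User-Agent, or one outside the allowlist, are challenged.
    """
    user_agent = headers.get("User-Agent", "")
    return not user_agent.startswith(ALLOWED_PREFIXES)


# Example requests: an allowlisted client, a browser-like client, no header.
for example in (
    {"User-Agent": "NetBSD-ftp/20230505"},
    {"User-Agent": "Mozilla/5.0"},
    {},
):
    print(example, "->", "challenge" if needs_challenge(example) else "pass")
```

Matching on a prefix rather than the full string keeps the rule working across ftp(1) versions, at the cost of being slightly easier to imitate.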

-!- mode/#netbsd-agm [-v racoon] by leot
-!- mode/#netbsd-agm [+v cagney_] by leot
<leot> We have a question from cagney_, probably for admins@ / gnats@!
<leot> cagney_, feel free to go ahead with your question(s)!

<cagney_> leot, tks, yes; and hello all
-!- mode/#netbsd-agm [+v nbdholland] by leot
<cagney_> I'm just wondering if, once NetBSD makes it off CVS, if the next big plan is the bug database? Any plans for that?
<Riastradh> heh
<Riastradh> We have had so many grandiose plans for bug database migration I lost count!
<spz> yes, but it's got goats feet and then some
<spz> since we do not want to lose old info
<nbdholland> This has a long and unfortunate history
<Riastradh> So, yes, it would be nice to migrate off gnats but we don't have a plan.
<spz> otherwise: gnats should die. die die die die already. :-P
<nbdholland> and what spz said.
<Riastradh> But maybe we can start planning after we're done with CVS.
<nbdholland> We've already done a lot of planning. The problem has always been getting any real work done on it
<Riastradh> (Actually we won't quite be _done_ with CVS because there'll still be a read-only CVS front end!)
<cagney_> My only experience is that while it matters to preserve old bugs, it matters less to migrate them to a new system.
<cagney_> anyway, looking forward to movement

<leot> Thanks cagney_, spz, Riastradh and nbdholland!
-!- mode/#netbsd-agm [-v cagney_] by leot
<leot> We have another question, from ktnb... probably for security-team@ / core@ I think!
-!- mode/#netbsd-agm [+v ktnb] by leot

<ktnb> Hello!
<ktnb> it seems like there are endless numbers of bugs and security bugs being around daily nowadays. I'm not sure if these bugs are found mostly by AI or not but is there any consideration on how or if we should do audits to find holes in NetBSD? in other words, how do we plan to 'keep up with the times' in this security bug world?
-!- mode/#netbsd-agm [+v __martin] by leot
<leot> If anyone would like to answer and has not been voiced, feel free to /msg me!
<__martin> I'll try to answer that
<__martin> we currently receive real bug reports at still moderate rates
-!- mode/#netbsd-agm [+v krelz] by leot
<__martin> we see a few "spamish" things that first ask for bug bounty programs and then never come back with real issues
-!- mode/#netbsd-agm [+v nbdholland] by leot
<__martin> so right now I'd say it is still handable w/o additional measures
<nbdholland> In the long run we would also like to use formal verification tools to get ahead of the game
<krelz> I didn't mention it in the core report, as I'm not a finance person and didn't
<krelz> want to commit TNF to spend money, but core can receive proposals for projects
<krelz> which can be funded if they seem worthwhile (at moderate rates)
<krelz> If there are any proposals for how we could do active audits of the code,
<krelz> rather than just waiting for someone else to find bugs and tell us about them,
<krelz> that seems like something which might be worthy of some expenditure
<krelz> .
<ktnb> That was kind of my concern: are we not getting a lot of bugs because of lack of usage or are we just _that_ good
<krelz> It is probably some of both of those, much of our codebase is old, and fairly
<krelz> stable, there aren't a lot of bugs (even less security issues) to find probably,
<Riastradh> Perhaps but we shouldn't get cocky...
<krelz> and most of what does exist, is relatively harmless (unlikely, and not catastrophic)
<krelz> But also, our user base isn't all that huge, compared to other systems, so
<krelz> stray bugs can take longer to be encountered.
<krelz> But that's also (in some respects) a good thing, as finding bugs in NetBSD
<krelz> isn't so profitable for hackers, that they are less likely to bother
<krelz> .
-!- mode/#netbsd-agm [+v khorben] by leot
<khorben> I'd like to add and emphasize on a few things: in NetBSD we rely on third-party components
<khorben> some of these components have a security impact and subject to scrutiny (and CVEs)
<khorben> so regardless of our relevance, we are targets too and should fund efforts ourselves
<khorben> as mentioned earlier in board's summary, we are looking for volunteers to help us do that
<khorben> thanks!
<khorben> .
<krelz> Agreed.   Send proposals to core@
<ktnb> Thank folks!

<leot> Thanks ktnb, __martin, nbdholland, krelz, Riastradh and khorben!
-!- mode/#netbsd-agm [-v ktnb] by leot
-!- mode/#netbsd-agm [-v __martin] by leot
-!- mode/#netbsd-agm [-v nbdholland] by leot
-!- mode/#netbsd-agm [-v krelz] by leot
-!- mode/#netbsd-agm [-v khorben] by leot
-!- mode/#netbsd-agm [-v Riastradh] by leot
<leot> We have another question! From Ltning... probably for some pkgsrc folks! (maybe I can answer it, but if you can answer it, feel free to request voice via /msg me)
-!- mode/#netbsd-agm [+v Ltning] by leot

<Ltning> Hey all, I am a first-ish time pkgsrc patch submitted, specifically 60114
<Ltning> It's a simple patch, of which I'd like to contribute more from time to time, but it seems "stuck"
-!- mode/#netbsd-agm [+v racoon] by leot
<Ltning> I guess my questions are 1) What can I do differently to get it unstuck, and 2) is there documentation on not just how to submit patches but how to "chase" them?
<racoon> Ltning: since mail is a non-live medium, and irc is, my main suggestion would be to poke us on irc
<Ltning> (I realise the pkgsrc team, like all others, are understaffed and overworked, so this is not meant to be criticism :)
<racoon> it's also always helpful to say which platforms you've tested on
<racoon> just because it shows confidence in the patch
<Riastradh> Ltning: One thing that would be helpful is to make sure the `make test' target works, in addition to saying what platforms you've tested it on.
<Ltning> Yeah - I have tried a couple times, but I guess not insistently enough. So perhaps the documentation could mention how-to-poke and also these things.
-!- mode/#netbsd-agm [+v nbdholland] by leot
<nbdholland> Another thing is, as per the discussion above, we all dislike gnats, and one of the reasons is that it's very difficult to find things in it
<Ltning> Roger that - thanks. Will follow up with that.
-!- mode/#netbsd-agm [+v krelz] by leot
<nbdholland> So if you file a patch in a gnats PR, and it doesn't get attention quickly, chances are you need to poke someone about it
<krelz> Also remember that everything in netbsd (incl pkgsrc)
<krelz> is done by volunteers - the best way to get someone to look
<Riastradh> .oO(pokage source)
<krelz> at a patch, is to find a developer with similar interests
<krelz> and convince them to take a look - any developer will do,
<Ltning> Yea. I guess the last comment from me then is - this is useful information, and I wish I didn't have to waste your time in this "call" to get it. 
<krelz> the "pkgsrc team" (I believe) are generally more interested in
<Ltning> Don't forget the impostor syndrome - I may be brave enough to poke randomly on IRC, but not everyone will be ..
<krelz> the workings of pkgsrc itself, rather than individual packages
<krelz> Finally, for an upgrade, which your PR is, it really helps to include
<krelz> info on what has changed, why someone would want the upgrade
<krelz> .
<nbdholland> Stuff like this about prodding people about patches being forgotten does appear on the lists at times
<nbdholland> and it's ok to ask procedural questions there

<leot> Thanks Ltning, racoon, nbdholland Riastradh and krelz!
<leot> We have another question, probably for finance-exec@
-!- mode/#netbsd-agm [-v Ltning] by leot

-!- mode/#netbsd-agm [+v Uilebheist] by leot
<Uilebheist> Hi all, Hi NetBSD
<Uilebheist> You mentioned that you are a US IRS 501(c)3 charitable organization - which is great for US people wanting to make a donation, but do you have or plan anything for people elsewhere?
<Riastradh> We have discussed forming a potential nonprofit organization in Europe.
-!- mode/#netbsd-agm [+v khorben] by leot
<Riastradh> The main question is: How much administrative overhead does this bring on us (recall we're pretty much all volunteers, plus some part-time contracts with TNF)?
<Riastradh> And, is that administrative burden worth the additional fundraising it would bring in?
<spz> specifically, there are EU-wide nonprofits, but that's a lot of red tape
<khorben> I can add to that, we have tried to revive an existing NetBSD structure in Germany to help with this
<khorben> unfortunately it hasn't brought fruition as of now
<Riastradh> And the administrative burden is likely to be more than just the sum of the administrative burden of two organizations separately, because they would have to be notionally independent, and we would have to come up with a reasonable governance structure for managing the assets.
<khorben> and indeed there are already broader OSS structures in Europe and elsewhere
<khorben> (and what Riastradh says)
<Uilebheist> Thank you.  I guess for now we might just make a slightly smaller donation and not get tax back!
<Riastradh> For example, you may be familiar with the FSF (Free Software Foundation) and FSFE (Free Software Foundation Europe) -- although they are mostly aligned in goals, they are independent organizations with independent governance structure, and sometimes disagree.
<Uilebheist> Ah yes, noticed these.

<leot> Thanks Uilebheist, spz, Riastradh, nbdholland and khorben!
-!- mode/#netbsd-agm [-v Uilebheist] by leot
<leot> I think the questions queue via my /query is currently empty!
<leot> Any other questions?
<leot> (And/or if I've missed any questions, please /msg them again!)
<Cryo> Alright, thanks everyone for coming.
<leot> Cryo: wait!
<leot> We have another question! :)
-!- mode/#netbsd-agm [+v wiedi] by leot

<wiedi> Hi, is there a status update on the repo migration? (Thanks to everyone working on it!)
<Riastradh> We have infrastructure in place, just requiring tying up some loose ends for deployment, and we need to prepare a clean final conversion.
<krelz> Also, there is the test infrastructure, that not enough developers have been using
<Riastradh> The infrastructure has taken a while because we're doing it a little differently from before, so we can reproducibly generate fresh images to test and deploy, rather than manually tinkering with a long-term server installation, and it took some engineering to get the software in shape for that.
<wiedi> Thank you, looking forward to using it :)
<wiedi> does the test infra also have a pkgsrc repo? I forgot... will have a look
<Riastradh> yes, it does
<wiedi> amazing, thanks for your work and answers :)
<leot> Thanks wiedi, Riastradh and krelz!
-!- mode/#netbsd-agm [-v wiedi] by leot
<leot> Any other questions? :)
<Riastradh> There are currently two test deployments, not all aligned on repository data (will change that soon), which you can test as a developer and anonymously.
<Riastradh> Developer access is at hg.test.n.o or git.test.n.o over ssh, and anonymous access is at anonhg.test.n.o or anongit.test.n.o (or https://hgweb.test.netbsd.org/ or https://gitweb.test.netbsd.org).
<krelz> Please, developers, use that, so you can be familiar with
<Riastradh> and there's a test repository called testsrc which is small to mess around with
<krelz> how things will work.  Less issues after the real change happens.
<Riastradh> Notes on usage: https://www.netbsd.org/developers/mercurial/ https://www.netbsd.org/developers/git/
<krelz> Nothing can be :bad: in the tests, you can play safely

<Cryo> Alright, again, thanks for coming. We are excited about the roadmap ahead and look forward to achieving these milestones together. Thank you for your time and your dedication to NetBSD.
<Cryo> See you next year!
<leot> Thank you!
<Riastradh> There's also still read-only CVS access via anoncvs.test.n.o (testsrc only for now, will be everything once deployed) for access on small machines where git and hg have trouble running.
<khorben> thanks @all!
 * Cryo turns up the lights
-!- mode/#netbsd-agm [-m] by leot
<Cryo> o/ have a great rest of your day
<leot> You too! Thanks everyone for attending!
-!- spz changed the topic of #netbsd-agm to: The NetBSD Foundation Annual General Meeting - Next Meeting in 2027
<racoon> thanks everyone, especially Riastradh for working on the repo conversion
<d-ra> thanks @all
<Cryo> Thanks to leot and everyone behind the scenes
<Cryo> Thanks to all of the presenters and people who worked on the presentation

GSoC Reports: Make system(3), popen(3) and popenve(3) use posix_spawn(3) internally (Final report)

2021-03-30T11:11:15+00:00

This report was prepared by Nikita Ronja Gillmann as a part of Google Summer of Code 2020

This is my second and final report for the Google Summer of Code project I am working on for NetBSD.

My code can be found at github.com//src in the gsoc2020 branch; at the time of writing some of it is still missing. The test facilities and logs can be found in github.com/nikicoon/gsoc2020. A diff can be found at github; it will later be split into several patches before it is sent to QA for merging.

The initial and defined goal of this project was to make system(3) and popen(3) use posix_spawn(3) internally, which was completed in June. For the second part I was given the task of replacing fork+exec calls in our standard shell (sh) in one scenario. As with the previous goal, we wanted to determine through implementation whether the initial motivation, to get performance improvements, holds; if it does not, we collect metrics showing why posix_spawn() should be avoided in this case. In practice, this second part meant that I had to add and change code in the kernel, add a new public libc function, and understand shell internals.

Summary of part 1

Prior work: In GSoC 2012 Charles Zhang added the posix_spawn syscall. According to its SF repository at the time, it is an in-kernel implementation of posix_spawn which provides performance benefits compared to FreeBSD and other systems which had a userspace implementation in 2012. (This may still be true; I have not looked very much into comparing all other systems and their libcs and kernels.)

After 1 week of reading POSIX and writing code, 2 weeks of coding and another 1.5 weeks of bugfixes, I successfully implemented the use of posix_spawn inside system(3) and popen(3).

The biggest challenge for me was to read and understand the POSIX standard. I am used to reading more formal books, but I can't remember working with the POSIX standard directly before.

system(3)

system(3) was changed to use posix_spawnattr_ (where we used sigaction before) and posix_spawn (which replaced execve + vfork calls).
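The same idea can be sketched in Python, whose os.posix_spawn() wraps the C call. This is only an illustration of a system(3)-like wrapper, not the libc change itself: the name spawn_system is hypothetical, and the child's SIGINT/SIGQUIT are simply reset to their defaults, whereas a real system(3) hands the caller's original dispositions to the child.

```python
import os
import signal


def spawn_system(command):
    """Run `command` through /bin/sh and return the raw wait status.

    Hypothetical helper: mimics the signal handling of system(3) with
    spawn attributes instead of sigaction() calls in a forked child.
    """
    # The parent ignores SIGINT/SIGQUIT and blocks SIGCHLD while it waits.
    old_int = signal.signal(signal.SIGINT, signal.SIG_IGN)
    old_quit = signal.signal(signal.SIGQUIT, signal.SIG_IGN)
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.posix_spawn(
            "/bin/sh", ["sh", "-c", command], os.environ,
            # Spawn attributes: the child gets the caller's old signal
            # mask, and default actions for the signals ignored above.
            setsigmask=old_mask,
            setsigdef=[signal.SIGINT, signal.SIGQUIT],
        )
        _, status = os.waitpid(pid, 0)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        # signal.signal() returns None for handlers not installed from Python.
        if old_int is not None:
            signal.signal(signal.SIGINT, old_int)
        if old_quit is not None:
            signal.signal(signal.SIGQUIT, old_quit)
    return status


status = spawn_system("exit 3")
print(os.WEXITSTATUS(status))
```

The point of the sketch is that no code runs in the child between process creation and exec: everything the child needs is described up front as attributes.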

popen(3) and popenve(3)

Since the popen and popenve implementations in NetBSD's libc use a couple of shared helper functions, I was able to change both functions while keeping the majority of the changes focused on (some of) the helper functions (pdes_child).

pdes_child, an internal function in popen.c, now takes one more argument (const char *cmd) for the command to pass to posix_spawn which is called in pdes_child.

On a high level what happens in pdes_child() and popen is that we first lock the pidlist_mutex. Then we create a file action list that closes the descriptors of all concurrent popen() / popenve() instances and the side of the pipe that is not needed, and moves the remaining side to stdin/stdout. We unlock the pidlist_mutex. Finally we return the result and destroy the list.

In the new version of this helper function, which now handles the majority of what popen/popenve did, we have to initialize a file_actions object which by default contains no file actions for posix_spawn() to perform. Since we need error handling, a common return value for the functions calling pdes_child(), and cleanup of the file_actions object, we make use of goto in some parts of this function.

The close() and dup2() calls are replaced by the corresponding posix_spawn_file_actions_addclose() and posix_spawn_file_actions_adddup2() calls, which specify a series of actions to be performed by a posix_spawn operation.

After this series of actions, we call _readlockenv(), and call posix_spawn with the file_action object and the other arguments to be executed. If it succeeds, we return the pid of the child to popen; otherwise we return -1. In both cases we destroy the file_action object before we proceed.

In popen and popenve our code has been reduced to the pid == -1 branch, everything else happens in pdes_child() now.

After readlockenv we call pdes_child and pass it the command to execute in the posix_spawn'd child process; if pdes_child returns -1 we run the old error handling code. Likewise for popenve.
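As a rough illustration of the pdes_child() flow, here is a minimal Python sketch of the read side of popen built on os.posix_spawn() file actions. The helper name popen_read is hypothetical, and the bookkeeping for concurrent popen() instances and the pidlist_mutex is left out.

```python
import os


def popen_read(cmd):
    """Spawn `cmd` via /bin/sh and return (pid, read_fd), like popen(cmd, "r").

    Hypothetical helper; it only shows how close()/dup2() calls in a
    forked child turn into a list of file actions.
    """
    read_fd, write_fd = os.pipe()
    file_actions = [
        # The child does not need the read side of the pipe.
        (os.POSIX_SPAWN_CLOSE, read_fd),
        # Move the write side of the pipe to stdout ...
        (os.POSIX_SPAWN_DUP2, write_fd, 1),
        # ... and drop the original descriptor afterwards.
        (os.POSIX_SPAWN_CLOSE, write_fd),
    ]
    try:
        pid = os.posix_spawn("/bin/sh", ["sh", "-c", cmd], os.environ,
                             file_actions=file_actions)
    except OSError:
        # Common error path: release both pipe ends before reporting failure.
        os.close(read_fd)
        os.close(write_fd)
        raise
    os.close(write_fd)  # the parent keeps only the read side
    return pid, read_fd


pid, fd = popen_read("echo hello")
with os.fdopen(fd, "rb") as stream:
    output = stream.read()
_, status = os.waitpid(pid, 0)
print(output)
```

In C the list would be a posix_spawn_file_actions_t that has to be destroyed on both the success and the error path, which is where the goto-based cleanup mentioned above comes in.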

The outcome of the first part is that, thanks to how posix_spawn is implemented in NetBSD, we reduced the number of syscalls made for popen and system. A full test with proper timing should confirm this; my reading is based on comparing old and new logs with ktrace and kdump.

sh, posix_spawn actions, libc and kernel - Part 2

Motivation

The main goal of part 2 of this project was to examine sh(1) to determine which simple cases of (v)fork + exec I could replace, and to replace them with posix_spawn where it makes sense.

fork needs to create a new address space by cloning the address space, or in the case of vfork update at least some reference counts. posix_spawn can avoid most of this as it creates the new address space from scratch.

Issues

The current posix_spawn as defined in POSIX has no good way to do tcsetpgrp, and we found that fish just avoids posix_spawn for foreground processes.

Implementation

Since, roughly speaking, modern BSDs handle "#!" execution in the kernel (probably since before the 1990s; systems which didn't handle this most likely started to disappear in the mid to late 90s), our main concern so far was the default case ('NORMALCMD') of the cmd switch in the evalcmd function.

After adjusting the function to use posix_spawn, I hit an issue in the execution of the curses application htop where htop would run but input would not be accepted properly (keysequences pressed are visible). In pre-posix_spawn sh, every subprocess that sh (v)forked runs forkchild() to set up the subprocess's environment. With posix_spawn, we need to arrange posix_spawn actions to do the same thing.

The intermediate resolution was to switch FORK_FG processes to fork+exec again. For foreground processes with job control we're in an interactive shell, so the performance benefit is small enough in this case to be negligible. It's really only for shell scripts that it matters.

Next I implemented a posix_spawn file_action, with the prototype

int posix_spawn_file_actions_addtcsetpgrp(posix_spawn_file_actions_t *fa, int fildes)

The kernel part of this was implemented inline in sys/kern/kern_exec.c, in the function handle_posix_spawn_file_actions() for the new case 'FAE_TCSETPGRP'.

The new version of the code is still in the testing and debugging phase and at the time of writing is not included in my repository (it will be published after Google Summer of Code when I'm done moving).

Future steps

posix_spawnp kernel implementation

According to a conversation with kre@, the posix_spawnp() implementation we have is just iterating over $PATH, calling posix_spawn until it succeeds. For some changes we might want a kernel implementation of posix_spawnp(), as the path search is supposed to happen in the kernel so the file actions are only ever run once:


some of the file actions may be "execute once only",
they can't be repeated (eg: handling "set -C; cat foo >file" - file
can only be created once, that has to happen before the exec (as the fd
needs to be made stdout), and then the exec part of posix_spawn is
attempted - if that fails, when it can't find "cat" in $HOME/bin (or
whatever is first in $PATH) and we move along to the next entry (maybe /bin
doesn't really matter) then the repeated file action fails, as file now
exists, and "set -C" demands that we cannot open an already existing file
(noclobber mode).   It would be nice for this if there were "clean up on
failure" actions, but that is likely to be very difficult to get right,
and each would need to be attached to a file action, so only those which
had been performed would result in cleanup attempts.
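The problem described in the quote can be reproduced without spawning anything. The following Python sketch is a simulation, not the libc code: spawnp_in_userland is a hypothetical stand-in for a posix_spawnp() that retries once per $PATH entry, and the "file action" is an exclusive create, as "set -C" would demand.

```python
import os
import tempfile


def spawnp_in_userland(name, path_dirs, out_file):
    """Simulate a userland posix_spawnp() that retries per $PATH entry."""
    for directory in path_dirs:
        # File action, repeated on every attempt: create the redirect
        # target in noclobber mode (O_EXCL fails if the file exists).
        try:
            fd = os.open(out_file, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            return "file action failed while trying " + directory
        os.close(fd)
        # Exec step: succeeds only if this directory holds the program.
        if os.access(os.path.join(directory, name), os.X_OK):
            return "spawned from " + directory
    return "not found"


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "home-bin")   # first $PATH entry, no "cat" here
    second = os.path.join(tmp, "bin")       # second entry, has "cat"
    os.mkdir(first)
    os.mkdir(second)
    tool = os.path.join(second, "cat")
    with open(tool, "w") as script:
        script.write("#!/bin/sh\n")
    os.chmod(tool, 0o755)
    result = spawnp_in_userland("cat", [first, second],
                                os.path.join(tmp, "file"))

print(result)
```

The first attempt creates the file and then fails to find the program; the second attempt never reaches the program because the repeated file action already fails. Doing the path search in one place before running the file actions once avoids this.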

Replacing all of fork+exec in sh

Ideally we could replace all of (v)fork + exec with posix_spawn. According to my mentors, (v)fork incurs pmap synchronisation, which constructing the vm space from scratch avoids. Fewer IPIs (inter-processor interrupts) matter for small processes too.

posix_spawn_file_action_ioctl

Future directions could involve a posix_spawn action for an arbitrary ioctl.

Thanks

My thanks go to fellow NetBSD developers for answering questions, most recently kre@ for sharing invaluable sh knowledge, Riastradh and Jörg as the mentors I've interacted with most of the time and for their often in-depth explanations as well as allowing me to ask questions I sometimes felt were too obvious. My friends, for putting up with my "weird" working schedule. Lastly, I would like to thank the Google Summer of Code program for continuing through the ongoing pandemic and giving students the chance to work on projects full-time.

Hitting donation milestone, financial report for 2020

2021-03-29T09:17:29+00:00

We nearly hit our 2020 donation milestone of $50,000, set after the release of 9.0. These donations have enabled us to fund significant work on NetBSD in 2020 such as:

an aarch64 package build server, victory.netbsd.org. Thanks to Western Washington University for hosting this machine.
Modernizing wi-fi network stack and release engineering work by Martin Husemann
LLDB support by Michał Górny
ptrace and GDB improvements by Kamil Rytarowski

If you are interested in seeing more work like this, please consider donating via PayPal or GitHub sponsors.

The financial report for 2020 is also available.

Note: We realize that this data is inconsistent with the website indicator of donations. This is due to the fact the website is updated manually in an error-prone process as the donations are processed. The financial report (just completed) is prepared separately using ledger.

Google Summer of Code 2020: [Final Report] Enhancing Syzkaller support for NetBSD

2020-10-19T13:20:28+00:00

This report was written by Ayushu Sharma as part of Google Summer of Code 2020.

This post is a follow-up to the first report and second report. It summarizes the work done during the third and final coding period for the Google Summer of Code (GSoC’20) project - Enhance Syzkaller support for NetBSD

Sys2syz

Sys2syz would give an extra edge to Syzkaller for NetBSD. It has the potential to efficiently automate the conversion of syscall definitions to syzkaller’s grammar. This can significantly increase the number of syscalls covered by Syzkaller while keeping the possibility of manual errors to a minimum. Let’s delve into its internals.

A peek into Sys2syz Internals

This tool converts the C source code of device drivers into a format compatible with the grammar customized for syzkaller. We extract the details of the target device by compiling it, and then collate those details with our Python code. For further details about the proposed design for the tool, refer to the previous post.

The Python code follows 4 major steps:

Extractor.py - Extraction of all ioctl commands of a given device driver along with their arguments from the header files.
Bear.py - Preprocessing of the device driver's files using compile_commands.json generated during the setup of tool using Bear.
C2xml.py - XML files are generated by running c2xml on preprocessed device files. This eases the process of fetching the information related to arguments of commands
Description.py - Generates descriptions for the ioctl commands and their arguments (builtin-types, arrays, pointers, structures and unions) using the XML files.

Extraction:

This step involves fetching the possible ioctl commands for the target device driver and getting the files which have to be included in our dev_target.txt file. We have already seen that all the commands for device drivers are defined in a specific way. These commands, defined in the header files, need to be grepped along with their major details; regular expressions come to the rescue here:


	# Raw strings keep regex escapes such as \s and \( intact.
	io = re.compile(r"#define\s+(.*)\s+_IO\((.*)\).*")
	iow = re.compile(r"#define\s+(.*)\s+_IOW\((.*),\s+(.*),\s+(.*)\).*")
	ior = re.compile(r"#define\s+(.*)\s+_IOR\((.*),\s+(.*),\s+(.*)\).*")
	iowr = re.compile(r"#define\s+(.*)\s+_IOWR\((.*),\s+(.*),\s+(.*)\).*")

The code scans through all the header files present in the target device folder and extracts all the commands along with their details using the compiled regular expressions. Details include the direction of the buffer (null, in, out, inout), based on the type of ioctl call (_IO, _IOR, _IOW, _IOWR), and the argument of the call. These are stored in a file named ioctl_commands.txt at location out/<target_name>. Example output:


out, I2C_IOCTL_EXEC, i2c_ioctl_exec_t
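A self-contained sketch of this extraction step might look like the following. It reuses the four patterns shown above and the direction mapping stated in the text (_IO: null, _IOR: in, _IOW: out, _IOWR: inout). The function name extract_ioctls, the DEMO_* macros and the sample header text are made up for illustration and are not taken from Extractor.py.

```python
import re

# (direction, pattern) pairs, following the mapping given in the text.
PATTERNS = [
    ("null", re.compile(r"#define\s+(.*)\s+_IO\((.*)\).*")),
    ("in", re.compile(r"#define\s+(.*)\s+_IOR\((.*),\s+(.*),\s+(.*)\).*")),
    ("out", re.compile(r"#define\s+(.*)\s+_IOW\((.*),\s+(.*),\s+(.*)\).*")),
    ("inout", re.compile(r"#define\s+(.*)\s+_IOWR\((.*),\s+(.*),\s+(.*)\).*")),
]


def extract_ioctls(header_text):
    """Return (direction, command, argument) tuples found in header text."""
    commands = []
    for line in header_text.splitlines():
        for direction, pattern in PATTERNS:
            match = pattern.match(line.strip())
            if match:
                name = match.group(1).strip()
                # _IO commands carry no argument type.
                arg = "null" if direction == "null" else match.group(4).strip()
                commands.append((direction, name, arg))
                break
    return commands


# Sample header text, for illustration only.
SAMPLE_HEADER = """
#define DEMO_RESET _IO('D', 0)
#define I2C_IOCTL_EXEC _IOW('I', 0, i2c_ioctl_exec_t)
#define DEMO_GETSET _IOWR('D', 2, struct demo_args)
"""

commands = extract_ioctls(SAMPLE_HEADER)
for entry in commands:
    print(", ".join(entry))
```

Each printed line has the same "direction, command, argument" shape as the ioctl_commands.txt example above.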

Preprocessing:

Preprocessing is required to obtain the XML files, which we will look at in the next step. Bear plays a major role when it comes to preprocessing C files. It records the commands executed for building the target device driver. This step is performed when the setup.sh script is executed.

The extracted commands are modified by the parse_commands() function to include the ‘-E’ and ‘-fdirectives’ flags and to set a new output location. The commands produced by this function are then used by the compile_target function, which filters out unnecessary flags and generates preprocessed files in our output directory.
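To make the rewriting step concrete, here is a minimal sketch of turning one compile_commands.json entry into a preprocessing command. It is not the tool's parse_commands(): the function name to_preprocess_command and the sample entry are hypothetical, the entry is assumed to use the "arguments" list form, and the flag is spelled as GCC's -fdirectives-only, which is an assumption about what ‘-fdirectives’ refers to.

```python
import os


def to_preprocess_command(entry, out_dir):
    """Rewrite a compile command so that it only preprocesses the file."""
    kept = []
    skip_next = False
    for arg in entry["arguments"]:
        if skip_next:          # this is the old output file, drop it
            skip_next = False
            continue
        if arg == "-o":        # drop the flag and remember to drop its value
            skip_next = True
            continue
        if arg == "-c":        # compiling to an object file is not wanted
            continue
        kept.append(arg)
    base = os.path.splitext(os.path.basename(entry["file"]))[0]
    output = os.path.join(out_dir, base + ".i")
    # kept[0] is the compiler; the preprocessing flags go right after it.
    return [kept[0], "-E", "-fdirectives-only"] + kept[1:] + ["-o", output]


# Hypothetical entry in the shape Bear records.
entry = {
    "directory": "/usr/src/sys/dev/i2c",
    "file": "am2315.c",
    "arguments": ["cc", "-c", "-O2", "-o", "am2315.o", "am2315.c"],
}
command = to_preprocess_command(entry, "out")
print(" ".join(command))
```

The resulting argument list could then be run from the entry's "directory" to produce the preprocessed .i file.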

Generating XML files

Run C2xml on the preprocessed files to obtain XML files, which store the source code in a tree-like structure, making it easier to collect all the information related to each element of structures, unions etc. For example:


	<symbol type="struct" id="_5970" file="am2315.i" start-line="13240" start-col="16" end-line="13244" end-col="11" bit-size="96" alignment="4" offset="0">
		<symbol type="node" id="_5971" ident="ipending" file="am2315.i" start-line="13241" start-col="33" end-line="13241" end-col="41" bit-size="32" alignment="4" offset="0" base-type-builtin="unsigned int"/>
		<symbol type="node" id="_5972" ident="ilevel" file="am2315.i" start-line="13242" start-col="33" end-line="13242" end-col="39" bit-size="32" alignment="4" offset="4" base-type-builtin="int"/>
		<symbol type="node" id="_5973" ident="imasked" file="am2315.i" start-line="13243" start-col="33" end-line="13243" end-col="40" bit-size="32" alignment="4" offset="8" base-type-builtin="unsigned int"/>
	</symbol>
	<symbol type="pointer" id="_5976" file="am2315.i" start-line="13249" start-col="14" end-line="13249" end-col="25" bit-size="64" alignment="8" offset="0" base-type-builtin="void"/>
	<symbol type="array" id="_5978" file="am2315.i" start-line="13250" start-col="33" end-line="13250" end-col="39" bit-size="288" alignment="4" offset="0" base-type-builtin="unsigned int" array-size="9"/>

We will see further on how attributes such as ident, id, type and base-type-builtin help us analyze the code and generate descriptions.

Descriptions.py

The final part produces a txt file storing all the required descriptions as its output. Here, information from the XML files and ioctl_commands.txt is combined to generate descriptions of ioctl commands and their arguments.

XML files for the given target device are parsed to form trees:


# Parse every XML file in the target directory into an ElementTree.
for file in os.listdir(self.target):
	tree = ET.parse(os.path.join(self.target, file))
	self.trees.append(tree)

We then traverse through these trees to search for the arguments of a particular ioctl command (particularly _IOR, _IOW, _IOWR commands) by the name of the argument. Once an element with the same value for ident attribute is found, attributes of the element are further examined to get its type. Possible types for these arguments are - struct, union, enum, function, array, pointer, macro and node. Using the type information we determine the way to define the element in accordance with syzkaller’s grammar syntax.
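The lookup by ident can be sketched on its own with xml.etree.ElementTree. The XML below is a hand-written sample in the style of the c2xml output shown earlier, and find_by_ident is a hypothetical helper rather than a function from the tool.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the idents are invented for illustration.
SAMPLE_XML = """
<root>
  <symbol type="struct" id="_1" ident="demo_args" bit-size="64">
    <symbol type="node" id="_2" ident="level" bit-size="32" base-type-builtin="int"/>
    <symbol type="node" id="_3" ident="mask" bit-size="32" base-type-builtin="unsigned int"/>
  </symbol>
</root>
"""


def find_by_ident(trees, ident):
    """Return the first element whose ident attribute matches, else None."""
    for tree in trees:
        for element in tree.iter("symbol"):
            if element.get("ident") == ident:
                return element
    return None


trees = [ET.ElementTree(ET.fromstring(SAMPLE_XML))]
found = find_by_ident(trees, "demo_args")
print(found.get("type"), [child.get("ident") for child in found])
```

Once the element is found, its type attribute decides which kind of description has to be generated for it.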

Building structs and unions involves defining their elements too; XML makes this easier. The program analyses every element which is a child of the root (struct/union) and generates its definition. A dictionary helps in tracking the structs/unions which have already been built. Later, the dictionary is used to pretty-print all the structs and unions in the output file. Here is a code snippet which depicts the approach:


            name = child.get("ident")
            # Build each struct/union only once; the dictionary acts as a cache.
            if name not in self.structs_and_unions:
                elements = {}
                for element in child:
                    elem_type = self.get_type(element)
                    if elem_type is None:
                        elem_type = element.get("type")
                    elements[element.get("ident")] = elem_type

                element_str = ""
                for ident, elem_type in elements.items():
                    element_str += ident + "\t" + elem_type + "\n"
                self.structs_and_unions[name] = " {\n" + element_str + "}\n"
            return str(name)

The task of creating descriptions for arrays is made simpler by the `array-size` attribute. When it comes to dealing with pointers, syzkaller needs the user to fill in the direction of the pointer. This has already been taken care of while analyzing the ioctl commands in Extractor.py. The direction argument, with in/out/inout as its possible values, depends on the ‘fun’ macros _IOR, _IOW and _IOWR respectively.

There is another category, named nodes, which can be distinguished using the base-type-builtin and base-type attributes.
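Putting these pieces together, a much simplified version of the struct description step could look like this. The type mapping is an assumption made for the sketch (every node becomes intN from its bit-size, and array elements are sized by dividing bit-size by array-size); the real tool's get_type() handles more cases. The sample XML and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of the c2xml output shown earlier.
SAMPLE = """
<symbol type="struct" ident="demo_args" bit-size="320">
  <symbol type="node" ident="level" bit-size="32" base-type-builtin="int"/>
  <symbol type="array" ident="values" bit-size="288" base-type-builtin="unsigned int" array-size="9"/>
</symbol>
"""


def describe_member(element):
    """Map one struct member to a syzkaller-style type string (simplified)."""
    kind = element.get("type")
    bits = int(element.get("bit-size"))
    if kind == "node":
        return "int%d" % bits
    if kind == "array":
        count = int(element.get("array-size"))
        # Element width is the total size divided by the element count.
        return "array[int%d, %d]" % (bits // count, count)
    raise ValueError("unhandled member type: %r" % kind)


def describe_struct(struct):
    """Render a struct element as 'name { member<TAB>type ... }'."""
    members = [m.get("ident") + "\t" + describe_member(m) for m in struct]
    return struct.get("ident") + " {\n" + "\n".join(members) + "\n}"


description = describe_struct(ET.fromstring(SAMPLE))
print(description)
```

Pointers, unions, enums and nested structs are deliberately left out; they would need the direction information and the dictionary of already-built structs described above.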

Result

Once the setup script for sys2syz is executed, sys2syz can be used for a certain target_device file by executing the python wrapper script (sys2syz.py) with:

#!/bin/sh
python sys2syz.py -t <absolute_path_to_device_driver_source> -c compile_commands.json -v

This would generate a dev_<device_driver>.txt file in the out directory. Here is an example description file autogenerated by sys2syz for the i2c device driver:


#Autogenerated by sys2syz
include 

resource fd_i2c[fd]

syz_open_dev$I2C(dev ptr[in, string["/dev/i2c"]], id intptr, flags flags[open_flags]) fd_i2c

ioctl$I2C_IOCTL_EXEC(fd fd_i2c, cmd const[I2C_IOCTL_EXEC], arg ptr[out, i2c_ioctl_exec])

i2c_ioctl_exec {
iie_op	flags[i2c_op_t_flags]
iie_addr	int16
iie_buflen	len[iie_buf, intptr]
iie_buf	buffer[out]
iie_cmdlen	len[iie_cmd, intptr]
iie_cmd	buffer[out]
}

Future Work

Though we have a basic working structure for this tool, a lot remains to be done to make the best of it. The ideal would be to need as little manual labor as possible. Sys2syz still needs to automate the detection of macros used by the flag types in syzkaller. The list of to-dos also includes extending syzkaller’s support for generating descriptions of syscalls.

Some other yet-to-be-done tasks include:

Generating descriptions for function type
Calculating attributes for structs and unions

Summary

We have certainly moved closer to our goals, but the project needs active involvement and incremental updates to scale it up to its full potential. I am looking forward to learning much more and contributing more to the NetBSD community.

Finally, a word of thanks to my mentors William Coldwell, Siddharth Muralee, Santhosh Raju and Kamil Rytarowski as well as the NetBSD organization for being extremely supportive. Also, I owe a big thanks to Google for giving me such a great opportunity to work on this project.

The GNU GDB Debugger and NetBSD (Part 5)

2020-10-07T17:16:53+00:00

The NetBSD developers maintain two copies of GDB:

One in the base-system that includes a significant set of local patches.
Another one in pkgsrc whose patching is limited to mostly build fixes.

The base-system version of GDB (GPLv3) still relies on local patching to work. I have set a goal to reduce the number of custom patches to a bare minimum, ideally achieving the state of GDB working without any local modifications at all.

GDB changes

Last month, the NetBSD/amd64 support was merged into gdbserver. This month, the gdbserver target support was extended to NetBSD/i386 and NetBSD/aarch64. The gdbserver and gdb code was cleaned up, refactored and made capable of introducing even more NetBSD targets.

Meanwhile, the NetBSD/i386 build of GDB was fixed. The missing include of x86-bsd-nat.h as a common header was added to i386-bsd-nat.h. The i386 GDB code for BSD contained a runtime assert that verified whether the locally hardcoded struct sigcontext is compatible with the system headers. In reality, the system headers have not used this structure since 2003, after the switch to ucontext_t, and the validating code was no longer effective. After the switch to a newer GCC, this was reported as an unused local variable by the compiler. I have decided to remove the check on NetBSD entirely. This was followed up by a small build fix.

The NetBSD team has noticed that the GDB's agent.cc code contains a portability bug and prepared a local fix. The traditional behavior of the BSD kernel is that passing random values of sun_len (part of sockaddr_un) can cause failures. In order to prevent the problems, the sockaddr_un structure is now zeroed before use. I've reimplemented the fix and successfully upstreamed it.

To make problems caused by the environment hardening enforced by PaX MPROTECT easier to resolve, I've introduced a runtime warning that is shown whenever byte transfers between the debuggee and the debugger fail with the EACCES errno code.

binutils changes

I've added support for NetBSD/aarch64 upstream, in GNU BFD and GNU GAS. NetBSD still carries local patches for the GNU binutils components, and GNU ld does not build out of the box on NetBSD/aarch64.

Summary

The NetBSD support in GNU binutils and GDB is improving quickly, and the most popular platforms (amd64, i386 and aarch64) are getting proper support out of the box, without downstream patches. The remaining work for these CPUs includes: streamlining kgdb support, adding native GDB support for aarch64, upstreaming local modifications from the GNU binutils components (especially BFD and ld) and introducing portability enhancements in dependent projects like libiberty and gnulib. After that, the remaining work is to streamline support for the other CPUs (Alpha, VAX, MIPS, HPPA, IA64, SH3, PPC, etc.), to develop the missing generic features (such as listing open file descriptors for a specified process) and to fix failures in the regression test-suite.

Google Summer of Code 2020: [Final Report] RumpKernel Syscall Fuzzing

2020-09-25T21:53:00+00:00

This report was prepared by Aditya Vardhan Padala as a part of Google Summer of Code 2020

This post is the third update to the project RumpKernel Syscall Fuzzing.

Part1 - https://blog.netbsd.org/tnf/entry/gsoc_reports_fuzzing_rumpkernel_syscalls1

Part2 - https://blog.netbsd.org/tnf/entry/gsoc_reports_fuzzing_rumpkernel_syscalls

The first and second coding periods were entirely dedicated to fuzzing rumpkernel syscalls using honggfuzz. Initially a dumb fuzzer was developed to start fuzzing, but it soon reached its limits.

During the second coding period we concentrated on crash reproduction and on adding grammar to the fuzzer, which yielded better results when we tested it on a bug in ioctl. Although this works for now, crash reproduction needs to be improved so that it generates a working C reproducer.

For the last coding period I looked into the internals of syzkaller to understand how it pregenerates input and how it mutates data. I also continued to work on integrating buildrump.sh with build.sh. buildrump.sh eases the task of building the rumpkernel on any host for any target.

buildrump.sh is a wrapper around build.sh that builds the tools and the rumpkernel from the parts of the source tree relevant to the rumpkernel. So I worked on getting buildrump.sh to work with netbsd-src. Building the toolchain from netbsd-src was successful, so binaries like rumpmake work just fine for continuing to build the rumpkernel.

However, the rumpkernel itself failed to build because of warnings and errors similar to the following. I faced a lot of build issues, possibly because buildrump.sh has been dormant recently.

nbmake[2]: nbmake[2]: don't know how to make /root/buildrump.sh/obj/dest.stage/usr/lib/crti.o. Stop

nbmake[2]: stopped in /root/buildrump.sh/src/lib/librumpuser
>> ERROR:
>> make /root/buildrump.sh/obj/Makefile.first dependall

A few similar errors were easily fixed, but I couldn't complete the integration within the coding period.

To Do

Research more on grammar definition and look into the existing grammar fuzzers for a better understanding of generating grammar.
Integrate syz2sys with the existing fuzzer to include grammar generation for better results.

GSoC with NetBSD has been an amazing journey throughout, in which I had a chance to learn from awesome people and work on amazing projects. I will continue to work on the project to achieve the goal of integrating my fuzzer with OSS Fuzz. I thank my mentors Siddharth Muralee, Maciej Grochowski and Christos Zoulas for their support, and Kamil for his continuous guidance.

Google Summer of Code 2020: [Final Report] Curses Library Automated Testing

2020-09-25T21:50:01+00:00

This report was prepared by Naman Jain as a part of Google Summer of Code 2020

My GSoC project under NetBSD involves the development of the test framework of curses. This is the final blog report in a series of blog reports; you can look at the first report and second report of the series.

The first report gives a brief introduction of the project and some insights into the curses testframe through its architecture and language. To someone who wants to contribute to the test suite, this blog can act as the quick guide of how things work internally. Meanwhile, the second report discusses some of the concepts that were quite challenging for me to understand. I wanted to share them with those who may face such a challenge. Both of these reports also cover the progress made in various phases of the Summer of Code.

This being the final report in the series, I would love to share my experience throughout the project. I will share some of the lessons I learned as well as the caveats I ran into.

Challenges and Caveats

By the time my application for GSoC was submitted, I had gained some knowledge about the curses library and the testing framework. Combined with compiler design and library testing experience, that knowledge proved useful but not sufficient as I progressed through the project. There were times when, while writing a test case, I had to look into documentation from various sources, be it NetBSD, FreeBSD, Linux, Solaris, etc. One may find oneself questioning one's understanding of the framework, the documentation, or even curses itself. This leads to the conclusion that to be a tester, one has to become a user first. That made me write minimal programs to understand the behavior. The experience was excellent, and I felt amazed by the capability and complexity of curses.

Learnings

The foremost lesson came from interacting with the open-source community and gaining confidence in my ability to contribute. Understanding the workflows and following best practices, such as considering the maintainability, readability and simplicity of the code, were significant lessons too.

The project-specific learning was not limited to the test framework; I also gained a deeper understanding of curses, as I had to browse through the code of the functions being tested. As this blog says, getting the TTY demystified was a long-time desire, which got fulfilled to some extent.

Some tests from test suite

In this section, I will discuss a couple of tests from the test suite that were written during the third phase of GSoC. The curses input model provides a variety of ways to obtain input from the keyboard. We will consider two tests, keypad and halfdelay, that both belong to the input processing category but are otherwise unrelated.

Keypad Processing

An application can enable or disable the translation of keypad input using the keypad() function. When translation is enabled, curses attempts to translate an input sequence into a single key code. If it is disabled, curses passes the input through as is, and any interpretation has to be made by the application.

include window
call $FALSE is_keypad $win1
input "\eOA"
call 0x1b wgetch $win1
call OK keypad $win1 $TRUE
input "\eOA"
call $KEY_UP wgetch $win1

# disable assembly of KEY_UP
call OK keyok $KEY_UP $FALSE
input "\eOA"
call 0x1b wgetch $win1

Because keypad translation is disabled by default, the input sequence '\eOA' is passed through as is, and wgetch() returns only '\e' (0x1b in hexadecimal). If we enable translation, the same input is translated to KEY_UP. In curses, one can disable the assembly of specific key symbols using keyok(). See the related man page.
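The behaviour exercised by the keypad test can be modelled in a few lines of Python. This is only a toy model for illustration, not the NetBSD curses implementation: the Window class, the KEY_TABLE mapping and the string "KEY_UP" standing in for the real key code are all assumptions, and each call to feed() is modelled as replacing any unread bytes, which is a simplification.

```python
ESC = 0x1B

# Hypothetical table of escape sequences that get assembled into key symbols.
KEY_TABLE = {b"\x1bOA": "KEY_UP"}


class Window:
    """Toy stand-in for a curses window; not the real library."""

    def __init__(self):
        self.keypad = False     # translation is off by default
        self.disabled = set()   # key symbols switched off, as keyok() would do
        self.pending = b""      # bytes waiting to be read

    def feed(self, data: bytes) -> None:
        # Simplification: new input replaces whatever was left unread.
        self.pending = data

    def wgetch(self):
        """Return a key symbol, a single raw byte, or -1 if nothing is pending."""
        if not self.pending:
            return -1
        if self.keypad:
            for seq, key in KEY_TABLE.items():
                if self.pending.startswith(seq) and key not in self.disabled:
                    self.pending = self.pending[len(seq):]
                    return key
        byte, self.pending = self.pending[0], self.pending[1:]
        return byte
```

With keypad left off, feeding b"\x1bOA" makes wgetch() return 0x1b. After setting keypad to True the same input yields "KEY_UP", and adding "KEY_UP" to disabled brings back the raw 0x1b, mirroring the three steps of the test.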

Input Mode

Curses lets the application control the effect of input using four input modes: cooked, cbreak, half-delay and raw. They specify the effect of input in terms of echoing and delay. We will discuss the half-delay mode, which specifies how quickly certain curses functions return to the application when no terminal input has arrived since the function was called.

include start
delay 1000
# input delay 1000 equals to 10 tenths of seconds
# getch must fail for halfdelay(5) and pass for halfdelay(15)
input "a"
call OK halfdelay 15
call 0x61 getch
call OK halfdelay 5
input "a"
call -1 getch

We have set the delay for feeding input to the terminal to 1 s (10 tenths of a second). If the application sets halfdelay to 15 and calls getch(), it receives the input. With halfdelay set to 5, however, getch() fails to get the input. See the related man page.
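The timing arithmetic behind this test can be written down as a small Python sketch. It is a model of the reasoning only, not of the curses internals: the function name and the ERR constant are made up here, and treating an input that arrives exactly at the timeout as received is an arbitrary choice.

```python
ERR = -1  # value the test expects from getch() when it times out


def getch_with_halfdelay(input_delay_ms: int, halfdelay_tenths: int, char: str) -> int:
    """Model getch() in half-delay mode for one character arriving after input_delay_ms."""
    timeout_ms = halfdelay_tenths * 100  # halfdelay() counts in tenths of a second
    if input_delay_ms <= timeout_ms:
        return ord(char)   # the input arrived before the timeout expired
    return ERR             # the timeout expired first
```

For the values used in the test, getch_with_halfdelay(1000, 15, "a") gives 0x61 because 1000 ms is within the 1500 ms timeout, while getch_with_halfdelay(1000, 5, "a") gives -1 because the 500 ms timeout expires first.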

Project Work

The work can be merged into organisation repository https://github.com/NetBSD/src under tests/lib/libcurses.

This project involved:

Improvement in testframework:
- Automation of the checkfile generation.
- Enhancement of support for complex characters.
- Addition of small features and code refactoring
Testing and bug reports:
- Tests for a family of routines like wide character, complex character, line drawing, box drawing, pad, window operations, cursor manipulations, soft label keys, input-output stream, and the ones involving their interactions.
- Raising a number of Problem Reports (PRs) under the lib category, some of which have been fixed. The list of PRs raised can be found here

Future Work

The current testframe supports complex characters, but the support needs to be extended to complex character strings. This will enable testing of the [mv][w]add_wch[n]str and [mv][w]in_wchstr families of routines.
Some of the tests for terminal manipulation routines like intrflush, def_prog_mode, typeahead, raw, etc. are not yet in the test suite.
Not specifically related to the framework, but documentation for the wide character and complex character routines needs to be added.

Acknowledgements

I want to extend my heartfelt gratitude to my mentor Mr. Brett Lymn, who helped me through all the technical difficulties and challenges I faced. I also thank my mentor Martin Husemann for valuable suggestions and guidance at various junctures of the project. A special thanks to Kamil Rytarowski for getting my blog posts published on the NetBSD site.

The GNU GDB Debugger and NetBSD (Part 4)

2020-09-10T21:13:01+00:00

The NetBSD team of developers maintains two copies of GDB:

One in the base-system with a stack of local patches.
One in pkgsrc with mostly build fix patches.

The base-system version of GDB (GPLv3) still relies on a set of local patches. I set a goal to reduce the local patches to a bare minimum, ideally reaching no local modifications at all.

GDB changes

Over the past month I worked on gdbserver for NetBSD/amd64 and finally upstreamed it to the GDB mainline, just in time for GDB 10.

What is gdbserver? Let's quote the official GDB documentation:

gdbserver is a control program for Unix-like systems, which allows you to connect your program with a remote GDB via target remote or target extended-remote, but without linking in the usual debugging stub.

gdbserver is not a complete replacement for the debugging stubs, because it requires essentially the same operating-system facilities that GDB itself does. In fact, a system that can run gdbserver to connect to a remote GDB could also run GDB locally! gdbserver is sometimes useful nevertheless, because it is a much smaller program than GDB itself. It is also easier to port than all of GDB, so you may be able to get started more quickly on a new system by using gdbserver. Finally, if you develop code for real-time systems, you may find that the tradeoffs involved in real-time operation make it more convenient to do as much development work as possible on another system, for example by cross-compiling. You can use gdbserver to make a similar choice for debugging.

GDB and gdbserver communicate via either a serial line or a TCP connection, using the standard GDB remote serial protocol.

This illustrates that gdbserver is especially useful for debugging applications on embedded and thin devices that are connected to a controlling computer equipped with the full distribution sources, toolchain, debugging information etc. Eventually, this combination of gdb and gdbserver can replace the native gdb plugin entirely and run all debugging sessions over this protocol. This design decision was already made in LLDB, where the remote process plugin is the only supported one on Linux and NetBSD and is highly recommended for other kernels.
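For readers unfamiliar with the remote serial protocol mentioned in the quoted documentation, here is a minimal Python sketch of its basic packet framing: a packet is sent as "$", the payload, "#", and a two-digit hexadecimal checksum that is the sum of the payload bytes modulo 256. The function names are made up for illustration and are not part of GDB or gdbserver; escaping of special characters, run-length encoding and acknowledgements are left out.

```python
def rsp_checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<two hex digits>."""
    return b"$" + payload + b"#" + b"%02x" % rsp_checksum(payload)


def rsp_unframe(packet: bytes) -> bytes:
    """Return the payload of a framed packet after verifying its checksum."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    received = int(packet[-2:], 16)   # the two trailing hex digits
    if rsp_checksum(payload) != received:
        raise ValueError("checksum mismatch")
    return payload
```

As a usage example, rsp_frame(b"g") produces b"$g#67", since the byte "g" is 0x67, and rsp_unframe() on that packet returns b"g" again.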

I've picked amd64 as the first target as it's the easiest to develop and test.

An example debugging session looks like this:

$ uname -rms
NetBSD 9.99.72 amd64
$ LC_ALL=C date
Thu Sep 10 22:43:10 CEST 2020
$ ./gdbserver/gdbserver --version                
GNU gdbserver (GDB) 10.0.50.20200910-git
Copyright (C) 2020 Free Software Foundation, Inc.
gdbserver is free software, covered by the GNU General Public License.
This gdbserver was configured as "x86_64-unknown-netbsd9.99"
$ ./gdbserver/gdbserver localhost:1234 /usr/bin/nslookup
Process /usr/bin/nslookup created; pid = 26383
Listening on port 1234

Then on the other terminal:

$ ./gdb/gdb
GNU gdb (GDB) 10.0.50.20200910-git
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-netbsd9.99".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
--Type <RET> for more, q to quit, c to continue without paging--
.

For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
Reading /usr/bin/nslookup from remote target...
warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
Reading /usr/bin/nslookup from remote target...
Reading symbols from target:/usr/bin/nslookup...
Reading /usr/bin/nslookup.debug from remote target...
Reading /usr/bin/.debug/nslookup.debug from remote target...
Reading /usr/libdata/debug//usr/bin/nslookup.debug from remote target...
Reading /usr/libdata/debug//usr/bin/nslookup.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/bin/nslookup.debug...
process 28353 is executing new program: /usr/bin/nslookup
Reading /usr/bin/nslookup from remote target...
Reading /usr/bin/nslookup from remote target...
Reading /usr/bin/nslookup.debug from remote target...
Reading /usr/bin/.debug/nslookup.debug from remote target...
Reading /usr/libdata/debug//usr/bin/nslookup.debug from remote target...
Reading /usr/libdata/debug//usr/bin/nslookup.debug from remote target...
Reading /usr/libexec/ld.elf_so from remote target...
Reading /usr/libexec/ld.elf_so from remote target...
Reading /usr/libexec/ld.elf_so.debug from remote target...
Reading /usr/libexec/.debug/ld.elf_so.debug from remote target...
Reading /usr/libdata/debug//usr/libexec/ld.elf_so.debug from remote target...
Reading /usr/libdata/debug//usr/libexec/ld.elf_so.debug from remote target...
warning: Invalid remote reply: timeout [kamil: repeated multiple times...]
Reading /usr/lib/libbind9.so.15 from remote target...
Reading /usr/lib/libisccfg.so.15 from remote target...
Reading /usr/lib/libdns.so.15 from remote target...
Reading /usr/lib/libns.so.15 from remote target...
Reading /usr/lib/libirs.so.15 from remote target...
Reading /usr/lib/libisccc.so.15 from remote target...
Reading /usr/lib/libisc.so.15 from remote target...
Reading /usr/lib/libkvm.so.6 from remote target...
Reading /usr/lib/libz.so.1 from remote target...
Reading /usr/lib/libblocklist.so.0 from remote target...
Reading /usr/lib/libpthread.so.1 from remote target...
Reading /usr/lib/libpthread.so.1.4.debug from remote target...
Reading /usr/lib/.debug/libpthread.so.1.4.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libpthread.so.1.4.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libpthread.so.1.4.debug from remote target...
Reading /usr/lib/libgssapi.so.11 from remote target...
Reading /usr/lib/libheimntlm.so.5 from remote target...
Reading /usr/lib/libkrb5.so.27 from remote target...
Reading /usr/lib/libcom_err.so.8 from remote target...
Reading /usr/lib/libhx509.so.6 from remote target...
Reading /usr/lib/libcrypto.so.14 from remote target...
Reading /usr/lib/libasn1.so.10 from remote target...
Reading /usr/lib/libwind.so.1 from remote target...
Reading /usr/lib/libheimbase.so.2 from remote target...
Reading /usr/lib/libroken.so.20 from remote target...
Reading /usr/lib/libsqlite3.so.1 from remote target...
Reading /usr/lib/libcrypt.so.1 from remote target...
Reading /usr/lib/libutil.so.7 from remote target...
Reading /usr/lib/libedit.so.3 from remote target...
Reading /usr/lib/libterminfo.so.2 from remote target...
Reading /usr/lib/libc.so.12 from remote target...
Reading /usr/lib/libgcc_s.so.1 from remote target...
Reading symbols from target:/usr/lib/libbind9.so.15...
Reading /usr/lib/libbind9.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libbind9.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libbind9.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libbind9.so.15.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libbind9.so.15.0.debug...
Reading symbols from target:/usr/lib/libisccfg.so.15...
Reading /usr/lib/libisccfg.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libisccfg.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libisccfg.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libisccfg.so.15.0.debug from remote target...
--Type <RET> for more, q to quit, c to continue without paging--
Reading symbols from target:/usr/libdata/debug//usr/lib/libisccfg.so.15.0.debug...
Reading symbols from target:/usr/lib/libdns.so.15...
Reading /usr/lib/libdns.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libdns.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libdns.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libdns.so.15.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libdns.so.15.0.debug...
Reading symbols from target:/usr/lib/libns.so.15...
Reading /usr/lib/libns.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libns.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libns.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libns.so.15.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libns.so.15.0.debug...
Reading symbols from target:/usr/lib/libirs.so.15...
Reading /usr/lib/libirs.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libirs.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libirs.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libirs.so.15.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libirs.so.15.0.debug...
Reading symbols from target:/usr/lib/libisccc.so.15...
Reading /usr/lib/libisccc.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libisccc.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libisccc.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libisccc.so.15.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libisccc.so.15.0.debug...
Reading symbols from target:/usr/lib/libisc.so.15...
Reading /usr/lib/libisc.so.15.0.debug from remote target...
Reading /usr/lib/.debug/libisc.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libisc.so.15.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libisc.so.15.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libisc.so.15.0.debug...
Reading symbols from target:/usr/lib/libkvm.so.6...
Reading /usr/lib/libkvm.so.6.0.debug from remote target...
Reading /usr/lib/.debug/libkvm.so.6.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libkvm.so.6.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libkvm.so.6.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libkvm.so.6.0.debug...
Reading symbols from target:/usr/lib/libz.so.1...
Reading /usr/lib/libz.so.1.0.debug from remote target...
Reading /usr/lib/.debug/libz.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libz.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libz.so.1.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libz.so.1.0.debug...
Reading symbols from target:/usr/lib/libblocklist.so.0...
Reading /usr/lib/libblocklist.so.0.0.debug from remote target...
Reading /usr/lib/.debug/libblocklist.so.0.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libblocklist.so.0.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libblocklist.so.0.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libblocklist.so.0.0.debug...
Reading symbols from target:/usr/lib/libgssapi.so.11...
Reading /usr/lib/libgssapi.so.11.0.debug from remote target...
Reading /usr/lib/.debug/libgssapi.so.11.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libgssapi.so.11.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libgssapi.so.11.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libgssapi.so.11.0.debug...
Reading symbols from target:/usr/lib/libheimntlm.so.5...
Reading /usr/lib/libheimntlm.so.5.0.debug from remote target...
Reading /usr/lib/.debug/libheimntlm.so.5.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libheimntlm.so.5.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libheimntlm.so.5.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libheimntlm.so.5.0.debug...
Reading symbols from target:/usr/lib/libkrb5.so.27...
Reading /usr/lib/libkrb5.so.27.0.debug from remote target...
Reading /usr/lib/.debug/libkrb5.so.27.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libkrb5.so.27.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libkrb5.so.27.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libkrb5.so.27.0.debug...
Reading symbols from target:/usr/lib/libcom_err.so.8...
Reading /usr/lib/libcom_err.so.8.0.debug from remote target...
Reading /usr/lib/.debug/libcom_err.so.8.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libcom_err.so.8.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libcom_err.so.8.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libcom_err.so.8.0.debug...
Reading symbols from target:/usr/lib/libhx509.so.6...
Reading /usr/lib/libhx509.so.6.0.debug from remote target...
Reading /usr/lib/.debug/libhx509.so.6.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libhx509.so.6.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libhx509.so.6.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libhx509.so.6.0.debug...
Reading symbols from target:/usr/lib/libcrypto.so.14...
Reading /usr/lib/libcrypto.so.14.0.debug from remote target...
Reading /usr/lib/.debug/libcrypto.so.14.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libcrypto.so.14.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libcrypto.so.14.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libcrypto.so.14.0.debug...
Reading symbols from target:/usr/lib/libasn1.so.10...
Reading /usr/lib/libasn1.so.10.0.debug from remote target...
Reading /usr/lib/.debug/libasn1.so.10.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libasn1.so.10.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libasn1.so.10.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libasn1.so.10.0.debug...
Reading symbols from target:/usr/lib/libwind.so.1...
Reading /usr/lib/libwind.so.1.0.debug from remote target...
Reading /usr/lib/.debug/libwind.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libwind.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libwind.so.1.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libwind.so.1.0.debug...
Reading symbols from target:/usr/lib/libheimbase.so.2...
Reading /usr/lib/libheimbase.so.2.0.debug from remote target...
Reading /usr/lib/.debug/libheimbase.so.2.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libheimbase.so.2.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libheimbase.so.2.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libheimbase.so.2.0.debug...
Reading symbols from target:/usr/lib/libroken.so.20...
Reading /usr/lib/libroken.so.20.0.debug from remote target...
Reading /usr/lib/.debug/libroken.so.20.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libroken.so.20.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libroken.so.20.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libroken.so.20.0.debug...
Reading symbols from target:/usr/lib/libsqlite3.so.1...
Reading /usr/lib/libsqlite3.so.1.4.debug from remote target...
Reading /usr/lib/.debug/libsqlite3.so.1.4.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libsqlite3.so.1.4.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libsqlite3.so.1.4.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libsqlite3.so.1.4.debug...
Reading symbols from target:/usr/lib/libcrypt.so.1...
Reading /usr/lib/libcrypt.so.1.0.debug from remote target...
Reading /usr/lib/.debug/libcrypt.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libcrypt.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libcrypt.so.1.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libcrypt.so.1.0.debug...
Reading symbols from target:/usr/lib/libutil.so.7...
Reading /usr/lib/libutil.so.7.24.debug from remote target...
Reading /usr/lib/.debug/libutil.so.7.24.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libutil.so.7.24.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libutil.so.7.24.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libutil.so.7.24.debug...
Reading symbols from target:/usr/lib/libedit.so.3...
Reading /usr/lib/libedit.so.3.1.debug from remote target...
Reading /usr/lib/.debug/libedit.so.3.1.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libedit.so.3.1.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libedit.so.3.1.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libedit.so.3.1.debug...
Reading symbols from target:/usr/lib/libterminfo.so.2...
Reading /usr/lib/libterminfo.so.2.0.debug from remote target...
Reading /usr/lib/.debug/libterminfo.so.2.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libterminfo.so.2.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libterminfo.so.2.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libterminfo.so.2.0.debug...
Reading symbols from target:/usr/lib/libc.so.12...
Reading /usr/lib/libc.so.12.217.debug from remote target...
Reading /usr/lib/.debug/libc.so.12.217.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libc.so.12.217.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libc.so.12.217.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libc.so.12.217.debug...
Reading symbols from target:/usr/lib/libgcc_s.so.1...
Reading /usr/lib/libgcc_s.so.1.0.debug from remote target...
Reading /usr/lib/.debug/libgcc_s.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libgcc_s.so.1.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/libgcc_s.so.1.0.debug from remote target...
Reading symbols from target:/usr/libdata/debug//usr/lib/libgcc_s.so.1.0.debug...
Reading /usr/libexec/ld.elf_so from remote target...
_rtld_debug_state () at /usr/src/libexec/ld.elf_so/rtld.c:1577
1577 __insn_barrier();
(gdb) b main
Breakpoint 1 at 0x211c00: file /usr/src/external/mpl/bind/bin/nslookup/../../dist/bin/dig/nslookup.c, line 990.
(gdb) c
Continuing.

Breakpoint 1, main (argc=1, argv=0x7f7fffffe768)
at /usr/src/external/mpl/bind/bin/nslookup/../../dist/bin/dig/nslookup.c:990
990 main(int argc, char **argv) {
(gdb) bt
#0 main (argc=1, argv=0x7f7fffffe768)
at /usr/src/external/mpl/bind/bin/nslookup/../../dist/bin/dig/nslookup.c:990
(gdb) info threads
Id Target Id Frame
* 1 Thread 28353.28353 main (argc=1, argv=0x7f7fffffe768)
at /usr/src/external/mpl/bind/bin/nslookup/../../dist/bin/dig/nslookup.c:990
(gdb) b pthread_setname_np
Breakpoint 2 at 0x7f7ff4e0c9e4: file /usr/src/lib/libpthread/pthread.c, line 792.
(gdb) c
Continuing.
[New Thread 28353.27773]

Thread 1 hit Breakpoint 2, pthread_setname_np (thread=0x7f7ff7e41000,
name=name@entry=0x7f7fffffe610 "work-0", arg=arg@entry=0x0)
at /usr/src/lib/libpthread/pthread.c:792
792 {
(gdb) info threads
Id Target Id Frame
* 1 Thread 28353.28353 pthread_setname_np (thread=0x7f7ff7e41000,
name=name@entry=0x7f7fffffe610 "work-0", arg=arg@entry=0x0)
at /usr/src/lib/libpthread/pthread.c:792
2 Thread 28353.27773 0x00007f7ff0aa623a in ___lwp_park60 () from target:/usr/lib/libc.so.12
(gdb) n
796 pthread__error(EINVAL, "Invalid thread",
(gdb) n
799 if (pthread__find(thread) != 0)
(gdb)
802 namelen = snprintf(newname, sizeof(newname), name, arg);
(gdb)
803 if (namelen >= PTHREAD_MAX_NAMELEN_NP)
(gdb)
806 cp = strdup(newname);
(gdb)
807 if (cp == NULL)
(gdb)
810 pthread_mutex_lock(&thread->pt_lock);
(gdb)
811 oldname = thread->pt_name;
(gdb)
812 thread->pt_name = cp;
(gdb)
813 (void)_lwp_setname(thread->pt_lid, cp);
(gdb)
814 pthread_mutex_unlock(&thread->pt_lock);
(gdb) n
816 if (oldname != NULL)
(gdb) n
isc_taskmgr_create (mctx=, workers=workers@entry=1, default_quantum=,
default_quantum@entry=0, nm=nm@entry=0x0, managerp=managerp@entry=0x418638 )
at /usr/src/external/mpl/bind/lib/libisc/../../dist/lib/isc/task.c:1431
1431 for (i = 0; i < workers; i++) {
(gdb) info threads
Id Target Id Frame
* 1 Thread 28353.28353 isc_taskmgr_create (mctx=, workers=workers@entry=1,
default_quantum=, default_quantum@entry=0, nm=nm@entry=0x0,
managerp=managerp@entry=0x418638 )
at /usr/src/external/mpl/bind/lib/libisc/../../dist/lib/isc/task.c:1431
2 Thread 28353.27773 "work-0" 0x00007f7ff0aa623a in ___lwp_park60 ()
from target:/usr/lib/libc.so.12
(gdb) dis 1
(gdb) b exit
Breakpoint 3 at 0x7f7ff0b530e0: exit. (2 locations)
(gdb) c
Continuing.

Thread 1 hit Breakpoint 2, pthread_setname_np (thread=0x7f7ff7e42c00,
name=name@entry=0x7f7ff5e6324e "isc-timer", arg=arg@entry=0x0)
at /usr/src/lib/libpthread/pthread.c:792
792 {
(gdb) dis 2
(gdb) c
Continuing.
Reading /usr/lib/i18n/libUTF8.so.5.0 from remote target...
Reading /usr/lib/i18n/libUTF8.so.5.0.debug from remote target...
Reading /usr/lib/i18n/.debug/libUTF8.so.5.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/i18n/libUTF8.so.5.0.debug from remote target...
Reading /usr/libdata/debug//usr/lib/i18n/libUTF8.so.5.0.debug from remote target...

Then, back to the first terminal:

> netbsd.org
Server:         62.179.1.62
Address:        62.179.1.62#53

Non-authoritative answer:
Name:   netbsd.org
Address: 199.233.217.205
Name:   netbsd.org
Address: 2001:470:a085:999::80
> exit

Thread 1 hit Breakpoint 3, exit (status=1) at /usr/src/lib/libc/stdlib/exit.c:55
55      {
(gdb) info threads 
  Id   Target Id          Frame 
* 1    Thread 28353.28353 exit (status=1) at /usr/src/lib/libc/stdlib/exit.c:55
(gdb) bt
#0  exit (status=1) at /usr/src/lib/libc/stdlib/exit.c:55
#1  0x0000000000206122 in ___start ()
#2  0x00007f7ff7c0c840 in ?? () from target:/usr/libexec/ld.elf_so
#3  0x0000000000000001 in ?? ()
#4  0x00007f7fffffed20 in ?? ()
#5  0x0000000000000000 in ?? ()
(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 28353) killed]

It worked!

In order to get this functionality operational I had to implement multiple GDB functions, in particular: create_inferior, post_create_inferior, attach, kill, detach, mourn, join, thread_alive, resume, wait, fetch_registers, store_registers, read_memory, write_memory, request_interrupt, supports_read_auxv, read_auxv, supports_hardware_single_step, sw_breakpoint_from_kind, supports_z_point_type, insert_point, remove_point, stopped_by_sw_breakpoint, supports_qxfer_siginfo, qxfer_siginfo, supports_stopped_by_sw_breakpoint, supports_non_stop, supports_multi_process, supports_fork_events, supports_vfork_events, supports_exec_events, supports_disable_randomization, supports_qxfer_libraries_svr4, qxfer_libraries_svr4, supports_pid_to_exec_file, pid_to_exec_file, thread_name, supports_catch_syscall.

NetBSD is the first BSD and actually the first Open Source UNIX-like OS besides Linux to grow support for gdbserver.

Plan for the next milestone

Introduce AArch64 support for GDB/NetBSD.

GSoC 2020: Report-2: Fuzzing the NetBSD Network Stack in a Rumpkernel Environment

2020-08-30T13:21:57+00:00

This report was written by Nisarg S. Joshi as part of Google Summer of Code 2020.

The objective of this project is to fuzz the various protocols and layers of the NetBSD network stack using rumpkernel. This project is being carried out as a part of GSoC 2020. This blog post covers the project, the concepts and tools involved, the objectives, the current progress and the next steps.

You can read the previous post/report here.

Overview of the work done:

Most of the time in phases 1 and 2 was spent analyzing the input and output paths of the protocols being fuzzed. During that time, five major protocols of the internet stack were taken up:

IPv4 (Phase 1)
UDP (Phase 1)
IPv6 (Phase 2)
ICMP (Phase 2)
Ethernet (Phase 2)

A good amount of time was spent understanding the input and output processing functions of each protocol; the information gathered was then applied in the packet creation code for that protocol. This matters because we need to know which parts of the packet can be left to the fuzzer's random input and which parts must be set to proper fixed values. Fixing some values in the packet is important so that it does not get rejected for trivial reasons such as a wrong IP protocol version or a bad internet checksum. (The procedure for arriving at these decisions, along with the code design and flow, is explained in the IPv4 section as an example.)

For each protocol, mainly two things needed to be implemented:

The Network Config: the topology for sending and receiving packets, for example using a TUN device or a TAP device, the socket used and so on. Configuring these devices was the first step in being able to send or receive packets.
Packet Creation: Using the information gathered in the code walkthrough of the protocol functions, we decide which parts of the packet are kept fixed and which are taken as-is from the fuzzer input. Doing so, we try to gain maximum code coverage. One thing to note here: we should not change the fuzzer input randomly, but deterministically, following the same rules for each packet; otherwise the evolutionary fuzzer cannot easily grow the corpus.

In the next section, a few of the protocols will be explained in detail.

Protocols

In this section we describe the protocols implemented for fuzzing and the approach taken to create a packet for each of them.

IPv4:

IPv4 stands for Internet Protocol version 4. It is one of the most widely used protocols in the internet family of protocols. It is used as a network layer protocol for routing and host-to-host packet delivery using an addressing scheme called the IP address (a 32-bit address scheme). IP also handles other functions such as fragmentation and reassembly of packets, to accommodate transmission over physical channels of varying capacity. It also supports the concept of multicasting and broadcasting (via IP options).

In order to come up with a strategy for fuzzing, the first step was to carry out a code walkthrough of relevant functions/APIs and data structures involved in the IPv4 protocol. For that the major files and components studied were:

ip_input() => Which carries out the processing of an incoming packet at the network layer for IPv4 (src here)
ip_output() => Which carries out the processing of an outgoing packet at the network layer for IPv4 (src here)
struct ip => Represents the IP header (src here)

These sections of code represent the working of the input and output processing paths of IPv4 protocol and the struct ip is the main IPv4 header. On top of that other APIs related to mbuf (The NetBSD packet), ip_forward(), IP assembly and fragmentation etc. were also studied in order to determine information about packet structure that could be followed.

In order to be able to reach these various aspects of the protocol and be able to fuzz it, we went forward with packet creation that took care of basic fields of the IP Header so that it would not get rejected in trivial cases as mentioned before. Hence we went ahead and fixed these fields:

IP Version: Set it to 0x4 which is a 4 bit value.
IP Header Len: Which is set to a value greater than or equal to sizeof(struct ip). Setting this to greater than that allows for IP Options processing.
IP Len: Set it to the size of the random buffer passed by fuzzer.
IP Checksum: We calculate the correct checksum for the packet using the internet checksum algorithm.
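
The fixed fields listed above can be sketched as follows. This is a minimal, self-contained sketch and not the project's pkt_create.c: the names fix_ipv4_header and inet_cksum are hypothetical, raw byte offsets are used instead of struct ip, and keeping a valid fuzzer-chosen header length (falling back to 5 words otherwise) is an assumption about one reasonable policy.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR_LEN 20u /* header without options (hypothetical name) */

/* Internet checksum (RFC 1071 style) over len bytes, read as big-endian
 * 16-bit words. A header with a correct checksum sums to 0. */
static uint16_t inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)(p[0] << 8);
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Overwrite only the fields that must be valid; every other byte stays
 * exactly as the fuzzer produced it. Returns 0 on success, -1 if the
 * buffer cannot hold an IPv4 packet. */
static int fix_ipv4_header(uint8_t *pkt, size_t len)
{
    unsigned ihl;
    uint16_t ck;

    if (len < IPV4_MIN_HDR_LEN || len > 0xffffu)
        return -1;

    /* Version 4 in the high nibble. Keep the fuzzer's header length when
     * it is at least 5 words and fits in the buffer, so that IP options
     * can still be exercised. */
    ihl = pkt[0] & 0x0fu;
    if (ihl < 5 || (size_t)ihl * 4 > len)
        ihl = 5;
    pkt[0] = (uint8_t)(0x40u | ihl);

    /* Total length = size of the buffer handed over by the fuzzer. */
    pkt[2] = (uint8_t)(len >> 8);
    pkt[3] = (uint8_t)(len & 0xffu);

    /* The checksum covers the header only, with the field zeroed first. */
    pkt[10] = 0;
    pkt[11] = 0;
    ck = inet_cksum(pkt, (size_t)ihl * 4);
    pkt[10] = (uint8_t)(ck >> 8);
    pkt[11] = (uint8_t)(ck & 0xffu);
    return 0;
}
```

Because the same rules are applied to every input, two identical fuzzer buffers always produce the same packet, which is what a coverage-guided fuzzer needs.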

Other fields were allowed to be populated randomly by fuzzer input. Here is an illustration of the IPv4 header with the fields marked in red as fixed.

The packet creation code lies in the following section inside [pkt_create.c]. Another important component is the network configuration [located here net_config] where the code related to configuring a TUN/TAP device is present. All the code uses the rumpkernel exposed APIs and syscalls (prepended with rump_sys_) so as to utilize the rumpkernel while executing the application binary. After packet creation and network config is handled the main fuzzing function is written where a series of steps are followed:

We call rump_init() to initialize the rumpkernel linked via libraries
We set up the client and server IP addresses
We set up the TUN device by calling the network config functions described above
We create the packet using the packet creation function, utilizing the random buffer passed by the fuzzer and transforming that into a semi-random buffer.
We pass this forged packet into the network stack of the rumpkernel linked with the application binary by calling rump_sys_write on the TUN device that was set up.

IPv6:

IPv6 stands for Internet Protocol version 6. It is the successor of the IPv4 protocol. It came into existence in order to overcome the addressing requirements that could not fit in a 32-bit IPv4 address. It is used as a network layer protocol for routing and host-to-host packet delivery using an addressing scheme called the IPv6 address (a 128-bit address scheme). It supports mostly the same functions as IPv4, with some differences, for example in how fragmentation is handled and in the absence of broadcast (multicast is used instead).

In IPv6, we fix the following fields:

IP Version: Set it to 0x6 which is a 4 bit value.
IP Hop Limit: This is the IPv6 counterpart of the IPv4 TTL. Set it to the maximum possible value of 255 (8 bits).
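
As a small sketch of these two fixed fields, assuming the packet is handled as a raw byte buffer (the function name fix_ipv6_header is hypothetical and not taken from the project):

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN 40u /* size of the fixed IPv6 header */

/* Force the version nibble to 6 and the hop limit to 255, leaving all
 * other bytes (traffic class, flow label, payload length, next header,
 * addresses) as provided by the fuzzer. Returns -1 if the buffer is too
 * short to hold the fixed header. */
static int fix_ipv6_header(uint8_t *pkt, size_t len)
{
    if (len < IPV6_HDR_LEN)
        return -1;

    /* Byte 0: version in the high nibble; the low nibble belongs to the
     * traffic class and is kept. */
    pkt[0] = (uint8_t)(0x60u | (pkt[0] & 0x0fu));

    /* Byte 7: hop limit. */
    pkt[7] = 255;
    return 0;
}
```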

Other fields were allowed to be populated randomly by fuzzer input. Allowing the payload len value to be randomly populated allowed processing of various “next headers” or “extension headers”. Extension headers carry optional Internet Layer information, and are placed between the fixed header and the upper-layer protocol header. The headers form a chain, using the Next Header fields. The Next Header field in the fixed header indicates the type of the first extension header; the Next Header field of the last extension header indicates the type of the upper-layer protocol header in the payload of the packet. Further work could set the values of the next header chain explicitly and form packets for multiple scenarios with combinations of various next headers.

UDP:

UDP stands for User Datagram Protocol. It is one of the simplest protocols, designed to carry a payload with minimal overhead. It does not have many options apart from checksum information and the ports used to demultiplex packets to processes.

Since UDP runs at the transport layer, it is wrapped in an IP header. Since we do not want to fuzz the IP code here, we build a well-formed IP header so that the packet does not get rejected during IP processing, and we randomize only the UDP header using the fuzzer input. We reuse the previously built IP packet creation utilities to form the IP header and then use the fuzzer input for the UDP header.

In UDP, we fix the following fields:

UDP Checksum: Set it to zero, which for UDP over IPv4 means that no checksum was computed, so the packet is not rejected by checksum verification.

ICMP:

ICMP stands for Internet Control Message Protocol. This protocol is sometimes called a sister protocol of IP and is used as a troubleshooting protocol at the network layer. It is used for two main purposes:

Error messages
Request-Reply Queries.

ICMP has a lot of options and is quite generic, in the sense that it handles many error messages and queries. Although ICMP is generally considered part of the network layer, it is actually wrapped inside an IP header and has its own protocol number (1). As with UDP, we wrap the ICMP headers inside IP headers: we do not randomize the IP header and randomize only the ICMP headers using fuzzer input.

In order to test various ICMP messages and queries, we could not fix the values of the type and code fields in the ICMP header, since they decide the ICMP message type. On the other hand, if we allowed fully random input, most packets would get rejected, since the set of valid type and code values is limited and most other values cause the packet to be discarded during processing. Hence we came up with a solution where we deterministically modify the input bits from the fuzzer corresponding to the code and type fields. For the type field we simply take the value modulo the number of types (the ICMP_NTYPES macro is used here). For the code field, we restrict the value to a certain range based on the type already chosen. This technique allows us to cover all the different ICMP message types via the fuzzer input. We also ensured that the input buffer is not modified randomly, since that is bad practice for a feedback-driven fuzzer like ours. Apart from this, we also fixed the ICMP checksum field by calculating the checksum using the internet checksum algorithm.
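
The type/code mapping described above could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: EX_ICMP_NTYPES is a hypothetical stand-in for the real ICMP_NTYPES macro, and the per-type code ranges cover only a few well-known message types.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_ICMP_NTYPES 41u /* hypothetical stand-in for ICMP_NTYPES */

/* Number of valid codes for a given ICMP type (illustrative subset). */
static unsigned ex_icmp_ncodes(uint8_t type)
{
    switch (type) {
    case 3:  return 16; /* destination unreachable: codes 0..15 */
    case 5:  return 4;  /* redirect: codes 0..3 */
    case 11: return 2;  /* time exceeded: codes 0..1 */
    case 12: return 3;  /* parameter problem: codes 0..2 */
    default: return 1;  /* everything else: code 0 only */
    }
}

/* Deterministically fold the fuzzer-provided type and code bytes into
 * valid ranges. The same input always yields the same output, so the
 * fuzzer's feedback loop stays meaningful. */
static int fix_icmp_type_code(uint8_t *icmp, size_t len)
{
    if (len < 2)
        return -1;
    icmp[0] = (uint8_t)(icmp[0] % EX_ICMP_NTYPES);          /* type */
    icmp[1] = (uint8_t)(icmp[1] % ex_icmp_ncodes(icmp[0])); /* code */
    return 0;
}
```

The checksum would then be computed over the whole ICMP message after these two bytes have been rewritten.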

Ethernet:

The Ethernet protocol, defined by the IEEE 802.3 standard, is a widely used data link layer protocol. The Ethernet packet, called a frame, carries an IP (or other network layer protocol) datagram. The header is simple: it holds the link layer addresses, called MAC addresses (used for switching at the data link layer), for source and destination, each 6 octets (48 bits) long, followed by an optional 4-octet 802.1Q tag and a 2-octet Ethertype field. This is followed by the payload and finally the FCS (frame check sequence), a four-octet cyclic redundancy check (CRC) that allows the receiver to detect corrupted data within the entire frame.

In case of Ethernet protocol fuzzing, we had to use a TAP device instead of a TUN device, since the TUN device supports passing an IP packet to the network stack, whereas a TAP device accepts an ethernet frame.

For packet creation, we set the source and destination MAC address and let the payload and ethertype be randomly populated by the fuzzer.
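
A minimal sketch of that step, assuming the frame written to the TAP device starts directly with the destination MAC address (the function name fix_ether_addrs is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_ETHER_ADDR_LEN 6u  /* octets in a MAC address */
#define EX_ETHER_HDR_LEN  14u /* dst + src + ethertype, untagged frame */

/* Overwrite only the two MAC addresses; the ethertype and the payload
 * stay as the fuzzer produced them. Returns -1 if the buffer cannot hold
 * an Ethernet header. */
static int fix_ether_addrs(uint8_t *frame, size_t len,
    const uint8_t dst[EX_ETHER_ADDR_LEN], const uint8_t src[EX_ETHER_ADDR_LEN])
{
    if (len < EX_ETHER_HDR_LEN)
        return -1;
    memcpy(frame, dst, EX_ETHER_ADDR_LEN);                     /* octets 0..5  */
    memcpy(frame + EX_ETHER_ADDR_LEN, src, EX_ETHER_ADDR_LEN); /* octets 6..11 */
    return 0;
}
```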

Current Progress and Next steps

The project has currently reached a stage where many major internet family protocols have been covered for fuzzing. As described above, a structured approach to fuzzing them has been taken by forming packets based on the internal workings of the protocols. Also, as mentioned in the previous post, the rumpkernel environment is being used for fuzzing all these protocols. We have taken these steps in order to get better results than raw fuzzing. In the next report we shall compare the coverage of raw fuzzing with that of our approach.

For the next phase of GSoC, the major focus would be to validate this process of fuzzing by various methods to check the penetration of packets into the network stack as well as the code coverage. Also the code would be made more streamlined and standardized so that it can be extended for adding more protocols even beyond the scope of the GSoC project.

GSoC 2020 Second Evaluation Report: Curses Library Automated Testing

2020-08-07T11:21:00+00:00

This report was prepared by Naman Jain as a part of Google Summer of Code 2020

My GSoC project under NetBSD involves the development of a test framework for the curses library. This blog report is the second in a series of blog reports; you can have a look at the first report. This report covers the progress made in the second coding phase and provides some insights into libcurses.

Complex characters

A complex character is a set of associated characters, which may include a spacing character and non-spacing characters associated with it. Typical effects of a non-spacing character on the associated complex character c include modifying the appearance of c (like adding diacritical marks) or bridging c with the following character. The cchar_t data type represents a complex character and its rendition. In NetBSD, this data type has the following structure:

struct cchar_t {
	attr_t attributes; /* character attributes */
	unsigned elements; /* number of wide char in vals*/
	wchar_t vals[CURSES_CCHAR_MAX]; /* wide chars including non-spacing */
};

The vals array contains the spacing character and the associated non-spacing characters. Note that NetBSD supports wchar_t (wide characters), due to which multi-byte characters are supported. To use complex characters one has to set the locale correctly. In this coding period, I wrote tests for routines involving complex characters.

Alternate character set

When you print "BSD", you send the hex codes 42, 53, 44 to the terminal. The capability of graphics-capable printers was limited by the 8-bit ASCII code. To solve this, additional character sets were introduced, and one can switch between them using escape sequences. One such character set, Special Graphics, is used by curses for line drawing. In a shell you can type

echo -e "\e(0j\e(B"

to get a lower-right corner glyph. This enables the alternate character set (\e(0), prints a character (j) and switches back to the normal character set (\e(B). One might wonder where this mapping from 'j' to the lower-right corner glyph comes from. You can see that mapping (acsc=``aaffggiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,) via

infocmp -1 $TERM | grep acsc

These characters are used in box_set(), border_set(), etc. functions which I tested in the second coding period.
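
The acsc string is a sequence of character pairs: the first character of each pair is the VT100 name of a glyph and the second is the character that this terminal expects while the alternate character set is active. A hedged sketch of a lookup over such a string (the helper name acsc_lookup is hypothetical and not part of libcurses):

```c
#include <stddef.h>

/* Return the character the terminal expects for the glyph named by
 * vt100_name, or 0 if the acsc string has no entry for it. The string is
 * walked two characters at a time: (name, terminal character). */
static char acsc_lookup(const char *acsc, char vt100_name)
{
    size_t i;

    for (i = 0; acsc[i] != '\0' && acsc[i + 1] != '\0'; i += 2) {
        if (acsc[i] == vt100_name)
            return acsc[i + 1];
    }
    return 0;
}
```

For the mapping shown above every pair maps a name to itself, so looking up 'j' yields 'j'; a terminal with a different line-drawing set would list another character in the second position.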

Progress in the second coding phase

Improvements in the framework:

Added support for testing of functions to be called before initscr()
Updated the unsupported function definitions with some minor bug fixes.

Testing and bug reports

Added tests for following families of functions:
- Complex character routines.
- Line/box drawing routines.
- Pad routines.
- Window and sub-window operations.
- Cursor manipulation routines
Reported bugs (and possible fixes if I know):
- lib/55454 wredrawln() in libcurses does not follow the sensible behaviour [fixed]
- lib/55460 copy error in libcurses [fixed]
- lib/55474 wattroff() unsets all attributes if passed STANDOUT as argument [standard is not clear, so decided to have as it is]
- lib/55482 slk_restore() does not restore the slk screen
- lib/55484 newwin() results into seg fault [fixed]
- lib/55496 bkgrnd() doesn't work as expected
- lib/55517 wresize() function incorrectly resizes the subwindows

I would like to thank my mentors Brett and Martin, as well as the NetBSD community for their support whenever I faced some issues.

GSoC Reports: Fuzzing Rumpkernel Syscalls, Part 2

2020-08-05T08:42:37+00:00

This report was prepared by Aditya Vardhan Padala as a part of Google Summer of Code 2020

I have been working on Fuzzing Rumpkernel Syscalls. This blogpost details the work I have done during my second coding period.

Reproducing crash found in ioctl()

Kamil has worked on reproducing the following crash

Thread 1 "" received signal SIGSEGV, Segmentation fault.
pipe_ioctl (fp=<optimized out>, cmd=<optimized out>, data=0x7f7fffccd700)
    at /usr/src/lib/librump/../../sys/rump/../kern/sys_pipe.c:1108
warning: Source file is more recent than executable.
1108                            *(int *)data = pipe->pipe_buffer.cnt;
(gdb) bt
#0  pipe_ioctl (fp=<optimized out>, cmd=<optimized out>, data=0x7f7fffccd700)
    at /usr/src/lib/librump/../../sys/rump/../kern/sys_pipe.c:1108
#1  0x000075b0de65083f in sys_ioctl (l=<optimized out>, uap=0x7f7fffccd820, retval=<optimized out>)
    at /usr/src/lib/librump/../../sys/rump/../kern/sys_generic.c:671
#2  0x000075b0de6b8957 in sy_call (rval=0x7f7fffccd810, uap=0x7f7fffccd820, l=0x75b0de126500, 
    sy=<optimized out>) at /usr/src/lib/librump/../../sys/rump/../sys/syscallvar.h:65
#3  sy_invoke (code=54, rval=0x7f7fffccd810, uap=0x7f7fffccd820, l=0x75b0de126500, sy=<optimized out>)
    at /usr/src/lib/librump/../../sys/rump/../sys/syscallvar.h:94
#4  rump_syscall (num=num@entry=54, data=data@entry=0x7f7fffccd820, dlen=dlen@entry=24, 
    retval=retval@entry=0x7f7fffccd810)
    at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:769
#5  0x000075b0de6ad2ca in rump___sysimpl_ioctl (fd=<optimized out>, com=<optimized out>, 
    data=<optimized out>) at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:979
#6  0x0000000000400bf7 in main (argc=1, argv=0x7f7fffccd8c8) at test.c:15

in rump, using a fuzzer that uses the pipe2, dup2 and ioctl syscalls with specific arguments that are known to cause a crash. My work builds on this.

https://github.com/adityavardhanpadala/rumpsyscallfuzz/blob/master/honggfuzz/ioctl/ioctl_fuzz2.c

Since rump is a multithreaded process, the crash can occur in any of its threads. By using a core dump we can quickly investigate the crash and fetch the backtrace from gdb for verification. However, this is not viable in the long run, as the working directory fills up with core dumps that consume a lot of space. So we need a better way to reproduce crashes.

Crash Reproducers

Getting crash reproducers working took quite a while. If we look at the HF_ITER() function in honggfuzz, it is a simple wrapper around HonggfuzzFetchData() that fetches a buffer of fixed size from the fuzzer.

void HonggfuzzFetchData(const uint8_t** buf_ptr, size_t* len_ptr) {
.
.
.
.
    *buf_ptr = inputFile; 
    *len_ptr = (size_t)rcvLen;
.
.
}

And if we look at how inputFile is set up, we notice that it is mmapped.

//libhfuzz/fetch.c:26
    if ((inputFile = mmap(NULL, _HF_INPUT_MAX_SIZE, PROT_READ, MAP_SHARED, _HF_INPUT_FD, 0)) ==
        MAP_FAILED) {
        PLOG_F("mmap(fd=%d, size=%zu) of the input file failed", _HF_INPUT_FD,
            (size_t)_HF_INPUT_MAX_SIZE);
    }

Following a similar approach, HF_ITER() can be modified to read its input from an mmapped file, so that we can reuse the reproducers generated by honggfuzz.

Attempts were made to use getchar(3) to fetch the buffer via STDIN, but for some unknown reason that failed, so we switched to mmap(2).
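
As an illustration of the mmap(2) route, a reproducer file can be mapped read-only roughly as in the sketch below. The helper name map_reproducer is hypothetical, and sizing the mapping with fstat(2) is an assumption made for this example rather than what honggfuzz or the project does.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the reproducer at path read-only. On success, returns a pointer to
 * its contents and stores the size in *len; returns NULL on any error or
 * if the file is empty. */
static const uint8_t *map_reproducer(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;
    if (fstat(fd, &st) == -1 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}
```

The returned buffer and length are the pair that a replacement HF_ITER() would hand back to the harness.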

So we override the HF_ITER() function whenever we need to reproduce a crash. I chose the following approach to use the reproducers: whenever we need to reproduce a crash, we just define CRASH_REPR.


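/* repro_path is the reproducer file to replay, e.g. argv[1] passed down from main(). */
static
void Initialize(const char *repro_path)
{
#ifdef CRASH_REPR
    /* Load the honggfuzz reproducer into memory instead of asking the fuzzer. */
    FILE *fp = fopen(repro_path, "r");
    if (fp == NULL)
        __builtin_trap();
    data = malloc(max_size);
    fread(data, max_size, 1, fp);
    fclose(fp);
#else
    (void)repro_path;
#endif
    // Initialise the rumpkernel only once.
    if(rump_init() != 0)
        __builtin_trap();
}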

#ifdef CRASH_REPR
void HF_ITER(uint8_t **buf, size_t *len) {
        *buf = (uint8_t *)data;
        *len = max_size;
        return;
}
#else
EXTERN void HF_ITER(uint8_t **buf, size_t *len);
#endif

This way we can easily reproduce crashes that we get and get the backtraces.

Generating C reproducers

Now the main goal is to create a C file which can reproduce the same crash that the reproducer triggers. This is done by writing all the syscall executions, with their arguments, to a file so they can be compiled and used directly.

#ifdef CRASH_REPR
        FILE *fp = fopen("/tmp/repro.c","a+");
        fprintf(fp, "rump_sys_ioctl(%" PRIu8 ", %" PRIu64 ");\n",get_u8(),get_ioctl_request());
        fclose(fp);
#else    
        rump_sys_ioctl(get_u8(), get_ioctl_request());
#endif

I followed the same method for all the syscalls that are executed. This way I get the syscalls, in the order they were executed, as native C code that I can simply reuse.

Obstacles

The number of times each syscall is executed before getting to a crash is quite high, so performing a write to a file or STDOUT for every call creates a lot of overhead. This method is good enough, but a bit of optimization would make it even better.

To-Do

./build.sh building rump on linux+netbsd
pregenerating fuzzer input using the implementation similar to that used in syzkaller.

Finally I thank my mentors Siddharth Muralee, Maciej Grochowski, Christos Zoulas for their guidance and Kamil Rytarowski for his constant support whenever I needed it.

GSoC Reports: Enhancing Syzkaller support for NetBSD, Part 2

2020-08-05T08:10:10+00:00

This report was prepared by Ayushi Sharma as a part of Google Summer of Code 2020

As a part of Google Summer of Code 2020, I have been working on the project Enhance the Syzkaller support for NetBSD. This post summarises the work done in the past month.

For work done in the first coding period, you can take a look at the previous post.

Automation for enhancement

With an aim of increasing the number of syscalls fuzzed, we have decided to automate the addition of descriptions for syscalls as well as ioctl device drivers in a customised way for NetBSD.

Design

All the ioctl commands for a device driver in NetBSD are stored inside the /src/sys/dev/<driver_name>/ folder. The idea is to get information related to a particular ioctl command by extracting the required information from the source code of the drivers. To achieve this, we have broken down our project into three major phases.

Generating preprocessed files
Extracting information required for generating descriptions
Conversion to syzkaller’s grammar

Generating Preprocessed files

For a given preprocessed file, the c2xml tool outputs the preprocessed C code in XML format. The intermediate XML descriptions help us to smoothly transform the C code into syzkaller-specific descriptions in the last stage of this tool. We have used Bear as an aid for fetching the commands needed to preprocess the files of a particular driver. Bear generates a file called compile_commands.json, which stores the commands used for compiling each file in JSON format. We then run these commands with the gcc ‘-E’ flag to obtain the preprocessed files. These preprocessed files then serve as input to the c2xml program.

Extractor

The definition of an ioctl call in the header files of a NetBSD device driver can be broken down to:

When we see it from syzkaller’s perspective, there are basically three significant parts we need to extract for adding description to syzkaller.

Description of a particular ioctl command according to syzkaller’s grammar:

ioctl$FOOIOCTL(fd <fd_driver>, cmd const[FOOIOCTL], pt ptr[DIR, <ptr_type>])

These definitions can be grepped from a device’s header files. The type information or description for pointer can then be extracted from the output files generated by c2xml. If the third argument is a struct, the direction of the pointer is determined with the help of fun() macros.
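
Once the command name, the file descriptor resource, the pointer direction and the pointed-to type are known, emitting a description in the form shown above is mostly string formatting. The sketch below is a hypothetical illustration of that last step: struct ioctl_desc and format_ioctl_desc are invented names, and the resource and type names used with it are placeholders rather than real syzkaller descriptions.

```c
#include <stdio.h>

/* Pointer directions as written in syzkaller descriptions. */
enum ptr_dir { DIR_IN, DIR_OUT, DIR_INOUT };

/* Information extracted for one ioctl command (hypothetical layout). */
struct ioctl_desc {
    const char *name;     /* ioctl command, e.g. "FOOIOCTL" */
    const char *fd_type;  /* resource describing the driver's fd */
    enum ptr_dir dir;     /* direction of the third argument */
    const char *ptr_type; /* type the third argument points to */
};

/* Format one description line into out. Returns what snprintf returns:
 * the length the full line needs, excluding the terminating NUL. */
static int format_ioctl_desc(char *out, size_t outlen,
    const struct ioctl_desc *d)
{
    static const char *const dirs[] = { "in", "out", "inout" };

    return snprintf(out, outlen,
        "ioctl$%s(fd %s, cmd const[%s], pt ptr[%s, %s])",
        d->name, d->fd_type, d->name, dirs[d->dir], d->ptr_type);
}
```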

To-Do

The extracted descriptions have to be converted into syzkaller-friendly grammar. We plan to add support for syscalls too, which would ease the addition of complex compat syscalls. This would help us to increase syzkaller’s coverage significantly.

Stats

Along with this, we have continued to add support for a few more syscalls. These include:

ksem(2) family
mount(2) family

Syscalls related to sockets have also been added. This has increased syscall coverage percentage to 50.35.

Finally, I would like to thank my mentors Cryo, Siddharth Muralee and Santhosh, along with Kamil, for their guidance and support. I am also thankful to the NetBSD community and to Google for giving me such an amazing opportunity.

The GNU GDB Debugger and NetBSD (Part 3)

2020-08-04T16:45:37+00:00

The NetBSD team of developers maintains two copies of GDB:

One in the base-system with a stack of local patches.
One in pkgsrc with mostly build fix patches.

The base-system version of GDB (GPLv3) still relies on a set of local patches. I set a goal to reduce the local patches to a bare minimum, ideally reaching no local modifications at all.

GDB changes

I've written an integration of GDB with fork(2) and vfork(2) events. Unfortunately, this support (present in a local copy of GDB in the base-system) has not been merged so far, because there is a generic kernel regression with the pg_jobc variable. This variable can be described as a reference counter of the number of processes within a process group that have a parent with control over a terminal. The semantics of this variable are not very well defined and, as a result, the number can become negative. This unexpected state of pg_jobc resulted in spurious crashes during kernel fuzzing. New kernel assertions checking for non-negative pg_jobc values were therefore introduced in order to catch the anomalies quickly. GDB, as a ptrace(2)-based application, happened to reproduce negative pg_jobc values quickly and reliably, and this stopped the further adoption of the fork(2) and vfork(2) patch in GDB until the pg_jobc behavior is fixed. I was planning to include support for posix_spawn(3) events as well, as they are implemented as a first-class operation through a syscall; however, this is also blocked by the pg_jobc issue.

A local patch for GDB is stored here for the time being.

I've enabled multi-process mode in the NetBSD native target. This enabled proper support for multiple inferiors and ptrace(2)-assisted management of the inferior processes and their threads.

    
    (gdb) info inferior
      Num  Description       Connection           Executable
    * 1    process 14952     1 (native)           /usr/bin/dig
      2    <null>            1 (native)
      3    process 25684     1 (native)           /bin/ls
      4    <null>            1 (native)           /bin/ls

Without this change, additional inferiors could already be added, but not properly controlled.

I've implemented the xfer_partial TARGET_OBJECT_SIGNAL_INFO support for NetBSD. NetBSD implements reading and overwriting siginfo_t received by the tracee. With TARGET_OBJECT_SIGNAL_INFO signal information can be examined and modified through the special variable $_siginfo. Currently NetBSD uses an identical siginfo type on all architectures, so there is no support for architecture-specific fields.

(gdb) b main
Breakpoint 1 at 0x71a0
(gdb) r
Starting program: /bin/ps 

Breakpoint 1, 0x00000000002071a0 in main ()
(gdb) p $_siginfo
$1 = {si_pad = {5, 0, 0, 0, 1, 0 , 1, 0 }, _info = {_signo = 5, 
    _code = 1, _errno = 0, _pad = 0, _reason = {_rt = {_pid = 0, _uid = 0, _value = {sival_int = 1, 
          sival_ptr = 0x1}}, _child = {_pid = 0, _uid = 0, _status = 1, _utime = 0, _stime = 0}, 
      _fault = {_addr = 0x0, _trap = 1, _trap2 = 0, _trap3 = 0}, _poll = {_band = 0, _fd = 1}, 
      _syscall = {_sysnum = 0, _retval = {0, 1}, _error = 0, _args = {0, 0, 0, 0, 0, 0, 0, 0}}, 
      _ptrace_state = {_pe_report_event = 0, _option = {_pe_other_pid = 0, _pe_lwp = 0}}}}}

NetBSD, contrary to Linux and other BSDs, supports a ptrace(2) operation to generate a core(5) file from a running process. This operation is used in the base-system gcore(1) program. The gcore functionality is also delivered by GDB, and I have prepared new code for GDB to wire PT_DUMPCORE into the GDB code for NetBSD, and thus support GDB's gcore functionality. This patch is still waiting in upstream review. A local copy of the patch is here.

(gdb) r
Starting program: /bin/ps 

Breakpoint 1, 0x00000000002071a0 in main ()
(gdb) gcore
Saved corefile core.4378
(gdb) !file core.4378
core.4378: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), NetBSD-style, from 'ps', pid=4378, uid=1000, gid=100, nlwps=1, lwp=4378 (signal 5/code 1)

Plan for the next milestone

Rewrite the gdbserver support and submit upstream.

GSoC Reports: Enhancing Syzkaller support for NetBSD, Part 1

2020-07-13T22:18:21+00:00

This report was prepared by Ayushi Sharma as a part of Google Summer of Code 2020

I have been working on the project Enhance the Syzkaller support for NetBSD as a part of GSoC’20. The past two months have given me quite an enriching experience, pushing me to learn more about fuzzers. This post gives a peek into the work which has been done so far.

Syzkaller

Syzkaller is a coverage-guided fuzzer developed by Google to fuzz system calls, mainly with Linux in mind. However, support has been extended to fuzz the system calls of other operating systems as well, e.g. Akaros, FreeBSD, NetBSD, OpenBSD, Windows etc.

An automated system, Syzbot, continuously runs the syzkaller fuzzer on the NetBSD kernel and reports the crashes.

Increasing syscall support

Initially, the syscall support for Linux as well as FreeBSD was analysed by an automated script. Also coverage of NetBSD syscalls was looked over. This helped us to easily port a few syscalls descriptions for NetBSD. The necessary tweaks were made according to the documentation which describes rules for writing syscall descriptions.

Major groups of syscalls which have been added:

statfs
__getlogin
getsid
mknod
utimes
wait4
seek
setitimer
setpriority
getrusage
clock_settime
nanosleep
getdents
acct
dup

Bugs Found

A few bugs were reported as a result of adding the descriptions for syscalls of the mentioned syscall families. A few of them are yet to be fixed.

Stats

Syscall coverage percent for NetBSD has now increased from nearly 30 to 44.96. Addition of compat syscalls resulted in getting a few new bugs.

Percentage of syscalls covered in a few of the other operating systems:

Linux: 82
FreeBSD: 37
OpenBSD: 61

Conclusion

In the next phase I will be working on generating the syscall descriptions using automation and adding ioctl device drivers with its help.

Finally, I would like to thank my mentors Cryo, Siddharth and Santhosh for their constant support and guidance. I am also thankful to the NetBSD community for kindly giving me this opportunity to have an amazing summer this year.

GSoC Reports: Extending the functionality of NetPGP, Part 1

2020-07-13T22:05:18+00:00

This report was prepared by Jason High as a part of Google Summer of Code 2020

NetPGP is a library and suite of tools implementing OpenPGP under a BSD license. As part of Google Summer of Code 2020, we are working to extend its functionality and work towards greater parity with similar tools. During the first phase, we have made the following contributions:

Added the Blowfish block cipher
ECDSA key creation
ECDSA signature and verification
Symmetric file encryption/decryption
S2K Iterated+Salt for symmetric encryption

ECDSA key generation is done using the '--ecdsa' flag with netpgpkeys or the 'ecdsa' property if using libnetpgp

[jhigh@gsoc2020nb gsoc]$ netpgpkeys --generate-key --ecdsa --homedir=/tmp
signature  secp521r1/ECDSA a0cdb04e3e8c5e34 2020-06-25 
fingerprint d9e0 2ae5 1d2f a9ae eb96 ebd4 a0cd b04e 3e8c 5e34 
uid              jhigh@localhost
Enter passphrase for a0cdb04e3e8c5e34: 
Repeat passphrase for a0cdb04e3e8c5e34: 
generated keys in directory /tmp/a0cdb04e3e8c5e34
[jhigh@gsoc2020nb gsoc]$ 

[jhigh@gsoc2020nb gsoc]$ ls -l /tmp/a0cdb04e3e8c5e34
total 16
-rw-------  1 jhigh  wheel  331 Jun 25 16:03 pubring.gpg
-rw-------  1 jhigh  wheel  440 Jun 25 16:03 secring.gpg
[jhigh@gsoc2020nb gsoc]$

Signing with ECDSA does not require any changes

[jhigh@gsoc2020nb gsoc]$ netpgp --sign --homedir=/tmp/a0cdb04e3e8c5e34 --detach --armor testfile.txt 
signature  secp521r1/ECDSA a0cdb04e3e8c5e34 2020-06-25 
fingerprint d9e0 2ae5 1d2f a9ae eb96 ebd4 a0cd b04e 3e8c 5e34 
uid              jhigh@localhost
netpgp passphrase: 
[jhigh@gsoc2020nb gsoc]$ 

[jhigh@gsoc2020nb gsoc]$ cat testfile.txt.asc 
-----BEGIN PGP MESSAGE-----

wqcEABMCABYFAl71EYwFAwAAAAAJEKDNsE4+jF40AAAVPgIJASyzuZgyS13FHHF/9qk6E3pYra2H
tDdkqxYzNIqKnWHaB+a4J+/R7FkZItbC/EyXH5YA68AC1dJ7tRN/tJNIWfYjAgUb75SvM2mLHk13
qmBo48S0Ai8C82G4nT7/16VF2OOUn7F/3XICghoQOyS1nxJilj8u2uphLOoy9VayL1ErORIZVw==
=p30e
-----END PGP MESSAGE-----
[jhigh@gsoc2020nb gsoc]$

Verification remains the same, as well.

[jhigh@gsoc2020nb gsoc]$ netpgp --homedir=/tmp/a0cdb04e3e8c5e34 --verify testfile.txt.asc 
netpgp: assuming signed data in "testfile.txt"
Good signature for testfile.txt.asc made Thu Jun 25 16:05:16 2020
using ECDSA key a0cdb04e3e8c5e34
signature  secp521r1/ECDSA a0cdb04e3e8c5e34 2020-06-25 
fingerprint d9e0 2ae5 1d2f a9ae eb96 ebd4 a0cd b04e 3e8c 5e34 
uid              jhigh@localhost
[jhigh@gsoc2020nb gsoc]$

Symmetric encryption is now possible using the '--symmetric' flag with netpgp or the 'symmetric' property in libnetpgp

[jhigh@gsoc2020nb gsoc]$ netpgp --encrypt --symmetric --armor testfile.txt 
Enter passphrase: 
Repeat passphrase: 
[jhigh@gsoc2020nb gsoc]$ 

[jhigh@gsoc2020nb gsoc]$ cat testfile.txt.asc 
-----BEGIN PGP MESSAGE-----

wwwEAwEIc39k1V6xVi3SPwEl2ww75Ufjhw7UA0gO/niahHWK50DFHSD1lB10nUyCTgRLe6iS9QZl
av5Nji9BuQFcrqo03I1jG/L9s/4hww==
=x41O
-----END PGP MESSAGE-----
[jhigh@gsoc2020nb gsoc]$

Decryption of symmetric packets requires no changes

[jhigh@gsoc2020nb gsoc]$ netpgp --decrypt testfile.txt.asc 
netpgp passphrase: 
[jhigh@gsoc2020nb gsoc]$

We added two new flags to support s2k mode 3: '--s2k-mode' and '--s2k-count'. See RFC4880 for details.

[jhigh@gsoc2020nb gsoc]$ netpgp --encrypt --symmetric --s2k-mode=3 --s2k-count=96 testfile.txt 
Enter passphrase: 
Repeat passphrase: 
[jhigh@gsoc2020nb gsoc]$ 


[jhigh@gsoc2020nb gsoc]$ gpg -d testfile.txt.gpg 
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
this
is
a
test
[jhigh@gsoc2020nb gsoc]$
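
For reference, RFC 4880 (section 3.7.1.3) stores the iteration count of an Iterated and Salted S2K specifier as a single coded octet. The sketch below decodes such an octet. Whether '--s2k-count' takes the coded octet or some other representation is an assumption here, and s2k_decode_count() is an illustrative name, not a libnetpgp function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the one-octet iteration count of an Iterated and Salted S2K
 * specifier, as defined in RFC 4880, section 3.7.1.3.
 * s2k_decode_count() is an illustrative name, not a libnetpgp function.
 */
static uint32_t
s2k_decode_count(unsigned int coded)
{
	assert(coded <= 255);	/* the coded count is a single octet */
	return ((uint32_t)16 + (coded & 15)) << ((coded >> 4) + 6);
}
```

Under that assumption, a coded value of 96 corresponds to 65536 octets being hashed, and the representable range runs from 1024 (coded 0) to 65011712 (coded 255).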

GSoC Reports: Curses Library Automated Testing, Part 1

2020-07-13T21:40:13+00:00

This report was prepared by Naman Jain as a part of Google Summer of Code 2020

Introduction

My GSoC project under NetBSD involves developing the test framework for the curses library. The Automated Test Framework (ATF) was introduced in 2007, but it cannot be used directly for curses testing for several reasons. The most important is that curses has functions that do timed reads/writes, which is hard to exercise by just piping characters to test applications. Also, stdin is then not a tty device and behaves differently, which may affect the results. A lot of work has already been done on this, and we have a separate test framework in place for testing curses.

The aim of the project is to build a robust test suite for the library and cover the SUSv2 specification completely. This includes writing tests for the remaining functions and enhancing the existing ones. Meanwhile, support for the complex character functions has to be completed, along with fixing some bugs, adding features and improving the test framework.

Why did I choose this project?

I am a final year undergraduate at Indian Institute of Technology, Kanpur. I am majoring in Computer Science and Engineering, and I am specifically interested in algorithms and computer systems. I worked on building and testing a library for distributed tracing at an internship and understand the usefulness of having a test suite in place. Libcurses, being a special case when it comes to testing, interested me. Knowing some of the concepts of compiler design deepened my interest a bit more.

Test Framework

The test framework consists of 2 programs, director and slave. The framework provides its own simple language for writing tests. The slave is a curses application capable of running any curses function, while the director acts as a coordinator: it interprets the test file and drives the slave program. The director can also capture the slave's output, which can be compared with the desired output.

The director forks a process operating in a pty and executes the slave program in that fork. The master side of the pty is used to get the data stream that the curses function call produces, which can be further used to check the correctness of behaviour. The director and slave communicate via two pipes: the command pipe and the slave pipe. The command pipe carries the function name and arguments, while the slave pipe carries return codes and values from function calls.

Let's walk through a sample test to understand how this works. Consider a sample program:

include start
call win newwin 2 5 2 5
check win NON_NULL
call OK waddstr $win "Hello World!"
call OK wrefresh $win
compare waddstr_refresh.chk

This is a basic program which initialises the screen, creates a new window, checks whether the window creation was successful, adds the string "Hello World!" to the window, refreshes the window, and compares the result with the desired output stored in the check file. The details of the language can be found in the libcurses testframe documentation.

The test file is interpreted by the language parser and the corresponding actions are taken. Let's look at how line #2 is processed. This command creates a window using newwin(). The line is ultimately parsed by the grammar rule call: CALL result fn_name args eol, which executes the function do_function_call(). This function sends the function name and arguments to the slave over the command pipe. The slave, which is waiting for a command from the director, reads from the pipe and executes the command: it runs the corresponding curses function from the command table, and the pointer to the new window is returned via the slave pipe after passing through the function wrappers. The director receives it, and the returned value is assigned to a variable (win in line #2) or compared (OK in line #4). This is the typical life cycle of a function call made in a test.

Along with these, the test framework provides the capability to include another test (line #1), check a variable's content (line #3), and compare the data stream produced in the pty by function calls with the desired stream (line #6). The tester can also provide input to functions via the input directive, introduce a delay via the delay directive, assign values to variables via the assign directive, and create a wide or complex character via the wchar and cchar directives respectively. The framework supports 3 kinds of strings: null-terminated strings, byte strings, and strings of type chtype, distinguished by the quotes enclosing them.
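
The life cycle of a call can be sketched in a few lines of C. This is a minimal illustration, not the framework's code: it leaves out the pty entirely, passes only a function name (no arguments), and the command table holds hypothetical stand-ins rather than real curses functions. All names below (director_call(), slave_loop(), fake_refresh() and so on) are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CMD_LEN		32	/* fixed-size command record */
#define FAKE_OK		0	/* mirrors curses OK */
#define FAKE_ERR	(-1)	/* mirrors curses ERR */
#define CALL_FAILED	(-2)	/* the pipes or fork failed */

/* Hypothetical stand-ins for curses functions. */
static int fake_refresh(void) { return FAKE_OK; }
static int fake_beep(void) { return FAKE_OK; }

static const struct { const char *name; int (*fn)(void); } command_table[] = {
	{ "refresh", fake_refresh },
	{ "beep", fake_beep },
};

/* Slave: read a function name, run it, send the return code back. */
static void
slave_loop(int cmd_fd, int ret_fd)
{
	char name[CMD_LEN];
	size_t i;

	while (read(cmd_fd, name, sizeof(name)) == (ssize_t)sizeof(name)) {
		int ret = FAKE_ERR;	/* unknown function */

		name[sizeof(name) - 1] = '\0';
		for (i = 0; i < sizeof(command_table) / sizeof(command_table[0]); i++)
			if (strcmp(name, command_table[i].name) == 0)
				ret = command_table[i].fn();
		if (write(ret_fd, &ret, sizeof(ret)) != (ssize_t)sizeof(ret))
			break;
	}
}

/* Director: fork a slave, send one call down the command pipe and
 * collect the return code from the slave pipe. */
static int
director_call(const char *fn_name)
{
	int cmd_pipe[2], slave_pipe[2], ret = CALL_FAILED;
	char name[CMD_LEN] = { 0 };
	pid_t pid;

	assert(fn_name != NULL && strlen(fn_name) < sizeof(name));
	strcpy(name, fn_name);

	if (pipe(cmd_pipe) == -1)
		return CALL_FAILED;
	if (pipe(slave_pipe) == -1) {
		close(cmd_pipe[0]);
		close(cmd_pipe[1]);
		return CALL_FAILED;
	}
	pid = fork();
	if (pid == 0) {			/* slave side */
		close(cmd_pipe[1]);
		close(slave_pipe[0]);
		slave_loop(cmd_pipe[0], slave_pipe[1]);
		_exit(0);
	}
	close(cmd_pipe[0]);
	close(slave_pipe[1]);
	if (pid != -1 &&
	    write(cmd_pipe[1], name, sizeof(name)) == (ssize_t)sizeof(name)) {
		if (read(slave_pipe[0], &ret, sizeof(ret)) != (ssize_t)sizeof(ret))
			ret = CALL_FAILED;
	}
	close(cmd_pipe[1]);		/* EOF on the command pipe stops the slave */
	close(slave_pipe[0]);
	if (pid != -1)
		(void)waitpid(pid, NULL, 0);
	return ret;
}
```

In this sketch, director_call("refresh") plays the role of a "call OK refresh" line in a test file: the director compares the value that comes back over the slave pipe with the expected return code.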

Progress till the first evaluation

Improvements in the framework:

Automated the checkfile generation that had to be done manually earlier.
Completed the support for complex character tests in director and slave.
Added features like variable-variable comparison.
Fixed non-critical bugs in the framework.
Refactored the code.

Testing and bug reports

Wrote tests for wide character routines.
Reported bugs (and possible fixes where I knew them):
- lib/55433 Bug in special character handling of ins_wstr() of libcurses
- lib/55434 Bug in hline() in libcurses [fixed]
- lib/55443 setcchar() incorrectly sets the number of elements in cchar [fixed]

Project Proposal and References:

Proposal: https://github.com/NamanJain8/curses/blob/master/reports/proposal.pdf
Project Repo: https://github.com/NamanJain8/curses
Test Language Details: https://github.com/NetBSD/src/blob/trunk/tests/lib/libcurses/testframe.txt
Wonderful report by Brett Lymn: https://github.com/NamanJain8/curses/blob/master/reports/curses-testframe.pdf

GSoC Reports: Fuzzing the NetBSD Network Stack in a Rumpkernel Environment, Part 1

2020-07-13T14:58:58+00:00

This report was prepared by Nisarg Joshi as a part of Google Summer of Code 2020

Introduction:

Fuzzing:

Fuzzing, or fuzz testing, is an automated software testing technique in which a program is repeatedly fed unusual, unexpected or semi-random generated input, in an attempt to crash it and detect potential bugs or unhandled corner cases.

There are several tools available today that enable this which are known as fuzzers. An effective fuzzer generates semi-valid inputs that are "valid enough" in that they are not directly rejected by the parser, but do create unexpected behaviors deeper in the program and are "invalid enough" to expose corner cases that have not been properly dealt with.

Fuzzers can be of various types, such as dumb vs smart and generation-based vs mutation-based. A dumb fuzzer generates random input without looking at the input format or model, although it can still follow sophisticated algorithms: AFL, for example, is considered a dumb fuzzer because it just flips bits and replaces bytes, yet it uses a genetic algorithm to create new test cases. A smart fuzzer, in contrast, follows an input model to generate semi-random data that can penetrate deeper into the code and trigger more edge cases. Mutation and generation fuzzers handle test case generation differently: mutation fuzzers mutate a supplied seed input, while generation fuzzers generate new test cases from a supplied model.

Some examples of popular fuzzers are: AFL (American Fuzzy Lop), Syzkaller, Honggfuzz.
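
As a minimal sketch of what "flipping bits" in a seed means, the function below applies a single bit-flip mutation. It is not taken from AFL or any other fuzzer; flip_bit() is an illustrative name, and a real fuzzer would choose the bit position at random (and combine many such mutations) rather than take it as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mutate a seed in place by flipping one bit, the simplest kind of
 * mutation a mutation-based fuzzer performs. The bit index is a
 * parameter here so that the result is reproducible.
 */
static void
flip_bit(uint8_t *seed, size_t len, size_t bit_index)
{
	assert(seed != NULL && bit_index < len * 8);
	seed[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
}
```

Flipping the same bit twice restores the original seed, which is one reason such mutations are cheap to apply and undo.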

RumpKernel

Kernels can have several different architectures, such as monolithic, microkernel, exokernel etc. An interesting architecture is the "anykernel" architecture. According to Wikipedia, “The "anykernel" concept refers to an architecture-agnostic approach to drivers where drivers can either be compiled into the monolithic kernel or be run as a userspace process, microkernel-style, without code changes.” Rumpkernel is an implementation of this anykernel architecture. In the case of the NetBSD rumpkernel, all the NetBSD subsystems, such as the file system, network stack and drivers, are compiled into standalone libraries that can be linked into any application process to use the functionality of the kernel. This allows us to run and test various components of the NetBSD kernel as libraries linked to programs running in user space.

This idea of the rumpkernel is really helpful for fuzzing kernel components. We can fuzz separate subsystems and let them crash without having to manage the crash of a running operating system. In addition, system calls into the kernel incur context-switching overhead, which a rumpkernel running in userspace can eliminate, saving time. (Since the Spectre and Meltdown vulnerabilities, system calls have become even more costly because of the security mitigations.)

HonggFuzz + Rumpkernel Network Stack:

In this project we will use the rumpkernel's network stack outlined above and a fuzzer called honggfuzz. Honggfuzz is a security-oriented, feedback-driven, evolutionary, easy-to-use fuzzer. It is maintained by Google (https://github.com/google/honggfuzz).

The project is hosted on GitHub at https://github.com/NJnisarg/fuzznetrump/. The Readme can help in setting it up. We first require NetBSD installed either on a physical machine or in a virtual machine such as qemu. Then we build the NetBSD distribution from the latest sources (https://github.com/NetBSD/src). We enable fuzzer coverage by using the MKSANITIZER and MKLIBCSANITIZER flags with the ASan (Address) and UBSan (Undefined Behavior) sanitizers. These sanitizers help the fuzzer catch bugs related to undefined behavior, memory addressing errors and memory leaks. After that we grab one of the fuzzer programs (for example src/hfuzz_ip_output_fuzz.c) and chroot into the newly built distribution from the host NetBSD OS. Then we compile it using hfuzz-clang, linking the required rumpkernel libraries (in our case rump, rumpnet, rumpnet_netinet and so on). This is where we use the rumpkernel as libraries linked to our program. The program accesses the network stack of the linked rumpkernel, and the fuzzer fuzzes those components of the kernel. The compiled binary is run with honggfuzz. Detailed steps are outlined in the Readme of the linked repository.

Our Approach for network stack fuzzing:

We have planned to fuzz various protocols at different layers of the TCP/IP stack. We have started with the most widely used yet simple protocols, such as IP(v4) and UDP. As the project progresses, we will add support for more L3 (and above) protocols such as ICMP, IP(v6) and TCP, and, in a later phase, L2 protocols such as Ethernet.

The network stack has 2 paths:

Input/ingress path
Output/egress path

A packet is sent down the network stack via a socket by an application along the output path, whereas a packet received on a network interface travels up the network stack via the input path. Each network protocol has major input and output APIs for packet processing. For example, the IP protocol has an ip_input() function to process an incoming packet and an ip_output() function to process an outgoing packet. We are planning to fuzz each protocol’s output and input APIs by sending packets via the output path and receiving packets via the input path respectively.

In order to fuzz the output and input path, the network stack setup and configuration we have is as follows:

We have a TUN device from which we can read and to which we can write packets.
We have a socket that is bound to the TUN device, which can send and receive packets.
In order to fuzz the input path, we “write” a packet to the TUN interface, which emulates a packet received by the network stack.
In order to fuzz the output path, we send a packet via the socket to the TUN interface.

To carry out all of the above setup, we have separated out the common functions for creating and configuring the TUN device and socket into a file called “net_config.c”.

Also, in order to reduce the rejection of packets carrying random data for trivial reasons such as a bad checksum or header length, we have created functions that create/forge semi-random packets using the input data from honggfuzz. We manipulate certain fields in the packet to ensure that it does not get rejected trivially by the network stack and hence can reach and trigger deeper parts of the code. These functions for packet creation are located in the “pkt_create.c” file. For each protocol we fuzz, we add functions like these to forge the headers and the packet. Currently we have support for UDP and IP(v4).

With these building blocks we have written programs like hfuzz_ip_output_fuzz.c, hfuzz_ip_input_fuzz.c etc, which set up the TUN device and socket using net_config.c and, after taking the random data from honggfuzz, use it to forge a packet and send it down or up the stack. We compile these programs using hfuzz-clang as mentioned above and run them under honggfuzz to fuzz particular APIs of the network stack.
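
To make the idea concrete, here is a minimal sketch of forging an IPv4 header from fuzzer bytes. It is not the code from pkt_create.c: the function names and the choice of which fields to fix up (version, header length, total length and checksum) are assumptions made for this example. The checksum is the standard Internet checksum described in RFC 1071.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_HDR_LEN 20	/* header without options */

/* Internet checksum (RFC 1071) over a byte buffer, in host order. */
static uint16_t
inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Build an IPv4 header from fuzzer-provided bytes, then overwrite the
 * fields that would otherwise get the packet rejected right away.
 * Returns 0 on success, -1 if the input is too short or the payload
 * does not fit in the 16-bit total length field.
 */
static int
forge_ipv4_header(const uint8_t *fuzz, size_t fuzz_len,
    uint8_t hdr[IPV4_HDR_LEN], uint16_t payload_len)
{
	uint16_t total, csum;

	assert(fuzz != NULL && hdr != NULL);
	if (fuzz_len < IPV4_HDR_LEN || payload_len > UINT16_MAX - IPV4_HDR_LEN)
		return -1;

	memcpy(hdr, fuzz, IPV4_HDR_LEN);	/* start from random data */
	hdr[0] = 0x45;				/* version 4, IHL 5 words */
	total = (uint16_t)(IPV4_HDR_LEN + payload_len);
	hdr[2] = (uint8_t)(total >> 8);		/* total length, big endian */
	hdr[3] = (uint8_t)(total & 0xff);
	hdr[10] = hdr[11] = 0;			/* zero before summing */
	csum = inet_checksum(hdr, IPV4_HDR_LEN);
	hdr[10] = (uint8_t)(csum >> 8);
	hdr[11] = (uint8_t)(csum & 0xff);
	return 0;
}
```

All other fields (TTL, protocol, addresses, flags) keep whatever the fuzzer supplied, so the header stays semi-random while still passing the basic sanity checks. Running inet_checksum() over the finished header yields 0, which is how a receiver verifies it.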

Current Progress:

The following things were worked on in the first phase:

Getting honggfuzz functional for NetBSD (thanks to Kamil for those patches)
Coming up with the strategy for network configuration and packet creation. Writing utilities for the same.
Adding fuzzing code for protocols like IP(v4) and UDP.
Carrying out fuzzing for those protocols.

Next Steps:

The following things are planned as next steps for the upcoming phase:

Making changes and improvements by taking suggestions from the mentors.
Adding support for ICMP, IP(v6), TCP and later on for Ethernet.
Analyze and come up with effective ways to improve the fuzzing by focusing on the packet creation part.
Standardize the code to be extensible for adding future protocols.

GSoC Reports: Make system(3) and popen(3) use posix_spawn(3) internally, Part 1

2020-07-13T14:11:35+00:00

This report was prepared by Nikita Ronja Gillmann as a part of Google Summer of Code 2020

This is my first report for the Google Summer of Code project I am working on for NetBSD.

After 1 week of reading POSIX and writing code, 2 weeks of coding and another 1.5 weeks of bugfixes, I have successfully switched system(3) and popen(3) to use posix_spawn(3) internally.

The biggest challenge for me was to understand POSIX, to read the standard. I am used to reading more formal books, but I can't remember working with the POSIX standard directly before.

The next part of my Google Summer of Code project will focus on similar rewrites of NetBSD's sh(1).

system(3)

The prototype

int system(const char *command);

remains the same. Below I'm just commenting on the differences, not the whole function, and therefore only include the code blocks where the versions differ the most. The full work can be found at gsoc2020 as well as src and will be submitted for inclusion later in the project.

Previously we'd use vfork, sigaction, and execve in this stdlib function.

The biggest difference from the 2015 version of our system() is the use of the posix_spawnattr_*() functions where we previously used sigaction, and of posix_spawn where execve used to execute the command in a vfork'd child:

   posix_spawnattr_init(&attr);
   posix_spawnattr_setsigmask(&attr, &omask);
   posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGDEF|POSIX_SPAWN_SETSIGMASK);
   (void)__readlockenv();
   status = posix_spawn(&pid, _PATH_BSHELL, NULL, &attr, __UNCONST(argp), environ);
   (void)__unlockenv();
   posix_spawnattr_destroy(&attr);

The full version can be found here.

The prototype of posix_spawn is:

int posix_spawn(pid_t *restrict pid, const char *restrict path, const posix_spawn_file_actions_t *file_actions, const posix_spawnattr_t *restrict attrp, char *const argv[restrict], char *const envp[restrict]);

We first initialize a spawn attributes object with the default value for all of its attributes set. A spawn attributes object is used to modify the behavior of posix_spawn().

The previous fork-exec switch included calls to sigaction to set the behavior associated with SIGINT and SIGQUIT as defined by POSIX:

The system() function shall ignore the SIGINT and SIGQUIT signals, and shall block the SIGCHLD signal, while waiting for the command to terminate. If this might cause the application to miss a signal that would have killed it, then the application should examine the return value from system() and take whatever action is appropriate to the application if the command terminated due to receipt of a signal. source: https://pubs.opengroup.org/onlinepubs/9699919799/functions/system.html

This has been achieved with a combination of posix_spawnattr_setsigmask() and posix_spawnattr_setflags() in the initialized attributes object referenced by attr.

As before, we call __readlockenv() and then call posix_spawn(), which on success returns the process ID of the child process in the variable pointed to by 'pid' and returns zero as the function return value.
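
For readers who want to try the same calls outside libc, here is a minimal standalone sketch. It is not the libc implementation: it uses a hard-coded "/bin/sh" instead of _PATH_BSHELL, omits the environment locking, and leaves out the parent-side handling (ignoring SIGINT/SIGQUIT and blocking SIGCHLD while waiting) that system() requires. run_shell_command() is an illustrative name.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Run "sh -c command" with posix_spawn() and return its exit status,
 * or -1 if it could not be spawned or did not exit normally.
 */
static int
run_shell_command(const char *command)
{
	posix_spawnattr_t attr;
	sigset_t mask, defsigs;
	pid_t pid;
	int error, status;
	char *const argv[] = { "sh", "-c", (char *)command, NULL };

	assert(command != NULL);
	if (posix_spawnattr_init(&attr) != 0)
		return -1;

	/* The child starts with an empty signal mask ... */
	sigemptyset(&mask);
	posix_spawnattr_setsigmask(&attr, &mask);
	/* ... and with default dispositions for SIGINT and SIGQUIT. */
	sigemptyset(&defsigs);
	sigaddset(&defsigs, SIGINT);
	sigaddset(&defsigs, SIGQUIT);
	posix_spawnattr_setsigdefault(&attr, &defsigs);
	posix_spawnattr_setflags(&attr,
	    POSIX_SPAWN_SETSIGDEF | POSIX_SPAWN_SETSIGMASK);

	error = posix_spawn(&pid, "/bin/sh", NULL, &attr, argv, environ);
	posix_spawnattr_destroy(&attr);
	if (error != 0)
		return -1;

	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that posix_spawn() reports failure through its return value rather than through errno, which is why the sketch checks the returned error code directly.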

The old code:

   (void)__readlockenv();
   switch(pid = vfork()) {
   case -1:                        /* error */
           (void)__unlockenv();
           sigaction(SIGINT, &intsa, NULL);
           sigaction(SIGQUIT, &quitsa, NULL);
           (void)sigprocmask(SIG_SETMASK, &omask, NULL);
           return -1;
   case 0:                         /* child */
           sigaction(SIGINT, &intsa, NULL);
           sigaction(SIGQUIT, &quitsa, NULL);
           (void)sigprocmask(SIG_SETMASK, &omask, NULL);
           execve(_PATH_BSHELL, __UNCONST(argp), environ);
           _exit(127);
   }
   (void)__unlockenv();

popen(3), popenve(3)

As with system, the prototype of both functions remains the same:

FILE * popenve(const char *cmd, char *const *argv, char *const *envp, const char *type);
FILE * popen(const char *cmd, const char *type);

pdes_child, an internal function in popen.c, now takes one more argument (const char *cmd) for the command to pass to posix_spawn which is called in pdes_child.

pdes_child previously looked like this:

static void
pdes_child(int *pdes, const char *type)
{
        struct pid *old;

        /* POSIX.2 B.3.2.2 "popen() shall ensure that any streams
           from previous popen() calls that remain open in the 
           parent process are closed in the new child process. */
        for (old = pidlist; old; old = old->next)
#ifdef _REENTRANT
                (void)close(old->fd); /* don't allow a flush */
#else
                (void)close(fileno(old->fp)); /* don't allow a flush */
#endif

        if (type[0] == 'r') {
                (void)close(pdes[0]);
                if (pdes[1] != STDOUT_FILENO) {
                        (void)dup2(pdes[1], STDOUT_FILENO);
                        (void)close(pdes[1]);
                }
                if (type[1] == '+')
                        (void)dup2(STDOUT_FILENO, STDIN_FILENO);
        } else {
                (void)close(pdes[1]);
                if (pdes[0] != STDIN_FILENO) {
                        (void)dup2(pdes[0], STDIN_FILENO);
                        (void)close(pdes[0]);
                }
        }
}

This is the new version (the whole file is here):

static int
pdes_child(int *pdes, const char *type, const char *cmd)
{
        struct pid *old;
        posix_spawn_file_actions_t file_action_obj;
        pid_t pid;
        const char *argp[] = {"sh", "-c", NULL, NULL};
        argp[2] = cmd;
        int error;

        error = posix_spawn_file_actions_init(&file_action_obj);
        if (error) {
                goto fail;
        }
        /* POSIX.2 B.3.2.2 "popen() shall ensure that any streams
           from previous popen() calls that remain open in the
           parent process are closed in the new child process. */
        for (old = pidlist; old; old = old->next) {
                /* braces keep the error check inside the loop body */
#ifdef _REENTRANT
                error = posix_spawn_file_actions_addclose(&file_action_obj, old->fd); /* don't allow a flush */
#else
                error = posix_spawn_file_actions_addclose(&file_action_obj, fileno(old->fp)); /* don't allow a flush */
#endif
                if (error) {
                        goto fail;
                }
        }
        if (type[0] == 'r') {
                error = posix_spawn_file_actions_addclose(&file_action_obj, pdes[0]);
                if (error) {
                        goto fail;
                }
                if (pdes[1] != STDOUT_FILENO) {
                        error = posix_spawn_file_actions_adddup2(&file_action_obj, pdes[1], STDOUT_FILENO);
                        if (error) {
                                goto fail;
                        }
                        error = posix_spawn_file_actions_addclose(&file_action_obj, pdes[1]);
                        if (error) {
                                goto fail;
                        }
                }
                if (type[1] == '+') {
                        error = posix_spawn_file_actions_adddup2(&file_action_obj, STDOUT_FILENO, STDIN_FILENO);
                        if (error) {
                                goto fail;
                        }
                }
        } else {
                error = posix_spawn_file_actions_addclose(&file_action_obj, pdes[1]);
                if (error) {
                        goto fail;
                }
                if (pdes[0] != STDIN_FILENO) {
                        error = posix_spawn_file_actions_adddup2(&file_action_obj, pdes[0], STDIN_FILENO);
                        if (error) {
                                goto fail;
                        }
                        error = posix_spawn_file_actions_addclose(&file_action_obj, pdes[0]);
                        if (error) {
                                goto fail;
                        }
                }
        }
        (void)__readlockenv();
        error = posix_spawn(&pid, _PATH_BSHELL, &file_action_obj, 0, __UNCONST(argp), environ);
        if (error) {
                (void)__unlockenv();
                goto fail;
        }
        (void)__unlockenv();
        error = posix_spawn_file_actions_destroy(&file_action_obj);
        /*
         * TODO: if _destroy() fails we have to go on, otherwise we
         * leak the pid.
         */
        if (error) {
                errno = error;
                return -1;
        }
        return pid;

fail:
        errno = error;
        posix_spawn_file_actions_destroy(&file_action_obj);
        return -1;
}

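The close() and dup2() calls are now replaced by the corresponding posix_spawn_file_actions_addclose() and posix_spawn_file_actions_adddup2() functions. These are library functions, not syscalls: they record a series of actions to be performed in the child by the posix_spawn operation.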
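
A minimal standalone sketch of the read ('r') case may help to see the file actions at work. It is not the popen.c code: it has no pidlist, no locking and no FILE stream, it simply captures the child's standard output into a buffer. spawn_read() is an illustrative name and "/bin/sh" stands in for _PATH_BSHELL.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Run "sh -c cmd" with its stdout connected to a pipe and read the
 * output into buf (always NUL-terminated). Returns the number of
 * bytes read, or -1 on error.
 */
static ssize_t
spawn_read(const char *cmd, char *buf, size_t buflen)
{
	posix_spawn_file_actions_t fa;
	char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
	int pdes[2], error;
	pid_t pid;
	ssize_t n, total = 0;

	assert(cmd != NULL && buf != NULL);
	if (buflen == 0 || pipe(pdes) == -1)
		return -1;
	if (posix_spawn_file_actions_init(&fa) != 0) {
		close(pdes[0]);
		close(pdes[1]);
		return -1;
	}
	/* In the child: close the read end, make the write end stdout. */
	error = posix_spawn_file_actions_addclose(&fa, pdes[0]);
	if (error == 0 && pdes[1] != STDOUT_FILENO) {
		error = posix_spawn_file_actions_adddup2(&fa, pdes[1],
		    STDOUT_FILENO);
		if (error == 0)
			error = posix_spawn_file_actions_addclose(&fa, pdes[1]);
	}
	if (error == 0)
		error = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
	posix_spawn_file_actions_destroy(&fa);
	close(pdes[1]);		/* parent keeps only the read end */
	if (error != 0) {
		close(pdes[0]);
		return -1;
	}
	while ((size_t)total < buflen - 1 &&
	    (n = read(pdes[0], buf + total, buflen - 1 - (size_t)total)) > 0)
		total += n;
	buf[total] = '\0';
	close(pdes[0]);
	(void)waitpid(pid, NULL, 0);
	return total;
}
```

The actions are applied in the child in the order in which they were added, which mirrors the sequence of close() and dup2() calls in the old vfork'd child.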

In popen and popenve our code has been reduced to just the 'pid == -1' branch, everything else happens in pdes_child() now.

After readlockenv we call pdes_child and pass it the command to execute in the posix_spawn'd child process; if pdes_child returns -1 we run the old error handling code. Likewise for popenve.

popen, old:

FILE *
popen(const char *cmd, const char *type)
{
        struct pid *cur;
        int pdes[2], serrno;
        pid_t pid;

        _DIAGASSERT(cmd != NULL);
        _DIAGASSERT(type != NULL);

        if ((cur = pdes_get(pdes, &type)) == NULL)
                return NULL;

        MUTEX_LOCK();
        (void)__readlockenv();
        switch (pid = vfork()) {
        case -1:                        /* Error. */
                serrno = errno;
                (void)__unlockenv();
                MUTEX_UNLOCK();
                pdes_error(pdes, cur);
                errno = serrno;
                return NULL;
                /* NOTREACHED */
        case 0:                         /* Child. */
                pdes_child(pdes, type);
                execl(_PATH_BSHELL, "sh", "-c", cmd, NULL);
                _exit(127);
                /* NOTREACHED */
        }
        (void)__unlockenv();

        pdes_parent(pdes, cur, pid, type);

        MUTEX_UNLOCK();

        return cur->fp;
}

popen, new:

FILE *
popen(const char *cmd, const char *type)
{
        struct pid *cur;
        int pdes[2], serrno;
        pid_t pid;

        _DIAGASSERT(cmd != NULL);
        _DIAGASSERT(type != NULL);

        if ((cur = pdes_get(pdes, &type)) == NULL)
                return NULL;

        MUTEX_LOCK();
        (void)__readlockenv();
        pid = pdes_child(pdes, type, cmd);
        if (pid == -1) {
                /* Error. */
                serrno = errno;
                (void)__unlockenv();
                MUTEX_UNLOCK();
                pdes_error(pdes, cur);
                errno = serrno;
                return NULL;
                /* NOTREACHED */
        }
        (void)__unlockenv();

        pdes_parent(pdes, cur, pid, type);

        MUTEX_UNLOCK();

        return cur->fp;
}

popenve, old:

FILE *
popenve(const char *cmd, char *const *argv, char *const *envp, const char *type)
{
        struct pid *cur;
        int pdes[2], serrno;
        pid_t pid;

        _DIAGASSERT(cmd != NULL);
        _DIAGASSERT(type != NULL);

        if ((cur = pdes_get(pdes, &type)) == NULL)
                return NULL;

        MUTEX_LOCK();
        switch (pid = vfork()) {
        case -1:                        /* Error. */
                serrno = errno;
                MUTEX_UNLOCK();
                pdes_error(pdes, cur);
                errno = serrno;
                return NULL;
                /* NOTREACHED */
        case 0:                         /* Child. */
                pdes_child(pdes, type);
                execve(cmd, argv, envp);
                _exit(127);
                /* NOTREACHED */
        }

        pdes_parent(pdes, cur, pid, type);

        MUTEX_UNLOCK();

        return cur->fp;
}

popenve, new:

FILE *
popenve(const char *cmd, char *const *argv, char *const *envp, const char *type)
{
        struct pid *cur;
        int pdes[2], serrno;
        pid_t pid;

        _DIAGASSERT(cmd != NULL);
        _DIAGASSERT(type != NULL);

        if ((cur = pdes_get(pdes, &type)) == NULL)
                return NULL;

        MUTEX_LOCK();
        pid = pdes_child(pdes, type, cmd);
        if (pid == -1) {
                /* Error. */
                serrno = errno;
                MUTEX_UNLOCK();
                pdes_error(pdes, cur);
                errno = serrno;
                return NULL;
                /* NOTREACHED */
        }

        pdes_parent(pdes, cur, pid, type);

        MUTEX_UNLOCK();

        return cur->fp;
}

GSoC Reports: Fuzzing Rumpkernel Syscalls, Part 1

2020-07-13T13:08:58+00:00

This report was prepared by Aditya Vardhan Padala as a part of Google Summer of Code 2020

It has been a great opportunity to contribute to NetBSD as a part of Google Summer Of Code '20. The aim of the project I am working on is to set up a proper environment to fuzz the rumpkernel syscalls. This is the first report on the progress made so far.

Rumpkernels provide all the necessary components to run applications on bare metal without the necessity of an operating system. Simply put, it is a way to run kernel code in user space.

The main goal of rumpkernels in NetBSD is to run, debug, examine and develop kernel drivers as easily as possible in user space, without having to run the entire kernel, while running the exact same kernel code. This makes most of the components (drivers) easily portable to different environments.

Rump kernels are constructed out of components: the drivers are built as libraries, and these libraries are linked to an interface (some application) that makes use of them. So we need not build the entire monolithic kernel, just the required parts of it.

Why Honggfuzz?

I considered Honggfuzz the best place to start for fuzzing the syscall layer, as suggested by my mentors. The LibFuzzer-style library fuzzing method helped me explore how syscalls are implemented in the rumpkernel. With LibFuzzer we have the flexibility of modifying a few in-kernel functions as required to best suit the fuzzing target.

Fuzzing target

Taking a close look at src/sys/rump/librump/rumpkern/rump_syscalls.c, we observe that this is where the rump syscalls are defined. These functions are responsible for creating the argument structures (like wrappers) and passing them to the syscall entry point, which is written as

rsys_syscall(num, data, dlen, retval)

which in turn is defined in terms of

rump_syscall(num, data, dlen, retval)

This function is the one that invokes the execution of the syscalls. So this should be the target for fuzzing syscalls.

Fuzzing Using Honggfuzz

Initially we used the classic LibFuzzer style.


int
LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) 
{
    if(Size != 2 + 8+sizeof(uint8_t))
        return 0;

    ExecuteSyscallusingtheData();
    return 0;
}

However, this approach ran into issues when we had to overload the copyin(), copyout(), copyinstr() and copyoutstr() functions, as the pointers that are used in these functions come from the Data buffer that the fuzzer provides for each fuzz iteration.

int
copyin(const void *uaddr, void *kaddr, size_t len)
{
..
..
..
..
    if (RUMP_LOCALPROC_P(curproc)) {
        memcpy(kaddr, uaddr, len); /* <- slow */
    } else if (len) {
        error = rump_sysproxy_copyin(RUMP_SPVM2CTL(curproc->p_vmspace),
            uaddr, kaddr, len);
    }
    return error;
}

Honggfuzz provides the HF_ITER interface, which is useful for actively fetching inputs from the fuzzer, for example:

extern void HF_ITER(uint8_t **buf, size_t *len);

int
main()
{
  for (;;) {
    uint8_t *buf;
    size_t len;
    HF_ITER(&buf, &len);
    DoSomethingWithInput(buf, len);
  }
  return 0;
}

So we switched to the faster method of using HF_ITER() in honggfuzz to fetch the data from the fuzzer, as it is relatively flexible to use.

EXTERN int
rumpns_copyin(const void *uaddr, void *kaddr, size_t len)
{
        int error = 0;
        if (len == 0)
                return 0;
        //HF_MEMGET() is a wrapper around HF_ITER()
        HF_MEMGET(kaddr, len);
        return error;
}

Similar overloading is done for copyout(), copyinstr(), copyoutstr().
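
As a rough illustration of what a wrapper such as HF_MEMGET() might look like, the sketch below copies bytes out of the current fuzzer input and asks for a fresh input whenever the current one is used up. The names fuzz_source, fuzz_memget and demo_fetch are made up for this example and are not the project's actual code; in a real harness the fetch callback would be HF_ITER().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the HF_ITER() declaration shown above. */
typedef void (*input_fn)(uint8_t **buf, size_t *len);

/* Hypothetical cursor over the input currently being consumed. */
struct fuzz_source {
	input_fn fetch;	/* HF_ITER in a real harness */
	uint8_t *buf;	/* current input, owned by the fuzzer */
	size_t len;	/* size of the current input */
	size_t off;	/* bytes already handed out */
};

/* Stand-in for HF_ITER() so the sketch can run outside honggfuzz. */
static void
demo_fetch(uint8_t **buf, size_t *len)
{
	static uint8_t chunk[] = { 1, 2, 3 };

	*buf = chunk;
	*len = sizeof(chunk);
}

/*
 * Fill dst with len fuzzer-provided bytes, fetching a new input each
 * time the current one runs dry. Returns 0 on success, or -1 if the
 * fuzzer keeps handing back empty inputs (to avoid spinning forever).
 */
static int
fuzz_memget(struct fuzz_source *src, void *dst, size_t len)
{
	uint8_t *out = dst;
	int empty_fetches = 0;

	assert(src != NULL && src->fetch != NULL);
	while (len > 0) {
		if (src->off == src->len) {
			src->fetch(&src->buf, &src->len);
			src->off = 0;
			if (src->len == 0) {
				if (++empty_fetches > 16)
					return -1;
				continue;
			}
			empty_fetches = 0;
		}
		size_t n = src->len - src->off;
		if (n > len)
			n = len;
		memcpy(out, src->buf + src->off, n);
		src->off += n;
		out += n;
		len -= n;
	}
	return 0;
}
```

A copyin() replacement built on this would call fuzz_memget(&src, kaddr, len) and ignore uaddr entirely, which is why a single large request can span several fuzzer inputs.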

The current efforts to fuzz the rump syscalls using honggfuzz can be found here.

This gave quite a speed boost to the "dumb" fuzzer, from a few tens of iterations to a couple of hundred, as we replaced the memcpy() calls with a wrapper around HF_ITER().

Further work by Kamil Rytarowski

Kamil has detected that overloading the copyin()/copyout() functions has a shortcoming: we are mangling internal functionality of rump that uses this copying mechanism, especially in rump_init(), but also in other rump wrappers, e.g. when opening a file with rump_sys_open().

The decision has been made to change the fuzzing mechanism: instead of pumping random honggfuzz-assisted data into the rump kernel APIs and intercepting the copyin()/copyout() family of functions, we add prior knowledge of which arguments are valid. The initial target set by Kamil was to reproduce the following rump kernel crash (first detected with syzkaller in the real kernel):


#include <sys/types.h>
#include <sys/ioctl.h>

#include <rump/rump.h>
#include <rump/rump_syscalls.h>

int
main(int argc, char **argv)
{
        int filedes[2];

        rump_init();
        rump_sys_pipe2(filedes, 0);
        rump_sys_dup2(filedes[1], filedes[0]);
        rump_sys_ioctl(filedes[1], FIONWRITE);
}

https://www.netbsd.org/~kamil/panic/rump_panic.c

For comparison, the program that panics the real kernel is analogous to the rump calls:

#include <sys/types.h>
#include <sys/ioctl.h>

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <err.h>
#include <fcntl.h>


int
main(int argc, char **argv)
{
    int filedes[2];
    pipe2(filedes, 0);
    dup2(filedes[1], filedes[0]);
    ioctl(filedes[1], FIONWRITE);

    return 0;
}

https://www.netbsd.org/~kamil/panic/panic.c

Thus, Kamil wrote a new fuzzing program that uses the pipe2, dup2 and ioctl APIs exclusively and has prior knowledge about valid arguments to these functions (as the reproducer uses perfectly valid NetBSD syscalls and arguments):

https://github.com/adityavardhanpadala/rumpsyscallfuzz/blob/master/honggfuzz/ioctl/ioctl_fuzz2.c

Instead of exploring impossible (non-existent) kernel code paths, the code ensures that e.g. ioctl(2) operations are always existing ioctl(2) operations.

static unsigned long
get_ioctl_request(void)
{
    unsigned long u;

    // ioctlprint -l -f '\t\t%x /* %n */,\n'
    static const unsigned long vals[] = {
        0x8014447e /* DRVRESUMEDEV */,
        0xc0784802 /* HPCFBIO_GDSPCONF */,
        /* A lot of skipped lines here... */
        0xc0186b01 /* KFILTER_BYNAME */,
        0x80047476 /* TIOCSPGRP */,
    };

    u = get_ulong() % __arraycount(vals);

    return vals[u];
}

Kamil also rewrote the HF_MEMGET() function: instead of recharging the buffer from the fuzzer whenever the buffer is exhausted, the fuzzed program now terminates, resetting its state, and the next honggfuzz input is checked. This is intended to make the fuzzing process more predictable in terms of getting good reproducers. So far we have been unable to generate good reproducers and we are still working on it.
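
One possible shape for such a non-recharging reader is sketched below: values are taken from a single honggfuzz input, and running out of bytes is reported to the caller, which would then end the run. The names fuzz_cursor, cursor_get_ulong and pick_value are illustrative only and are not the ones used in ioctl_fuzz2.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read-only cursor over one fuzzer input. */
struct fuzz_cursor {
	const uint8_t *buf;
	size_t len;
	size_t off;	/* invariant: off <= len */
};

/*
 * Read one unsigned long from the input. Returns false when fewer than
 * sizeof(unsigned long) bytes remain; the harness would then terminate
 * instead of asking the fuzzer for more data.
 */
static bool
cursor_get_ulong(struct fuzz_cursor *c, unsigned long *out)
{
	assert(c != NULL && out != NULL);
	if (c->len - c->off < sizeof(*out))
		return false;
	memcpy(out, c->buf + c->off, sizeof(*out));
	c->off += sizeof(*out);
	return true;
}

/* Map an arbitrary fuzzer value onto one entry of a table of valid values. */
static bool
pick_value(struct fuzz_cursor *c, const unsigned long *vals, size_t nvals,
    unsigned long *out)
{
	unsigned long u;

	assert(vals != NULL && nvals > 0);
	if (!cursor_get_ulong(c, &u))
		return false;
	*out = vals[u % nvals];
	return true;
}
```

Because each run consumes exactly one input and nothing else, replaying a saved input should walk the same sequence of values, which is the property a reproducer needs.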

Unfortunately (or fortunately) the code in ioctl_fuzz2.c triggers another (rump) kernel bug, unrelated to the reproducer in rump_panic.c. Furthermore, the crash looks like a race condition that triggers only sporadically, and honggfuzz doesn't generate good reproducers for it.

The obvious solution to this is to run the fuzzing process with TSan involved and catch such bugs quickly. Unfortunately, MKSANITIZER support for TSan + rump kernel is still unfinished and beyond the scope of this GSoC.

Obstacles

The fuzzer is still dumb, meaning that we are still using just random data as syscall arguments, so the coverage did not show much improvement between 13 minutes and 13 hours of fuzzing.
Crash reproducers get sliced due to the way we fetch input from the fuzzer using HF_ITER(), as functions like copyin() and copyout() request quite large buffers from the fuzzer for some non-trivial functions like rump_init().

To-Do

Improving the crash reproduction mechanism.
Making the fuzzer smart by using grammar so that the arguments to syscalls are syntactically valid.
Finding an optimal way to fetch data from fuzzer so that the reproducers are not sliced.

If the above improvements are made, we will get more coverage and, if we are lucky enough, a lot more crashes.

Finally I'd like to thank my mentors Siddharth Muralee, Maciej Grochowski, Christos Zoulas for their guidance and Kamil Rytarowski for his constant support throughout the coding period.

The GNU GDB Debugger and NetBSD (Part 1)

2020-04-02T18:49:54+00:00

The NetBSD team of developers maintains two copies of GDB:

One in the base-system with a stack of local patches.
One in pkgsrc with mostly build fix patches.

The process of maintaining a modern (GPLv3) version of GDB in the base system carries a constant extra cost. The NetBSD developers need to rebase the stack of local patches for each newer release of the debugger and resurrect the support. The GDB project is under active development, and its code, originally written in C, is being actively refactored to C++.

Unfortunately we cannot abandon the local base-system patches and rely on a pristine version, as the pkgsrc version of GDB lacks feature parity: no threading support, no operational support for most targets, no support for fork/vfork/etc. events, no auxv reading support on 64-bit kernels, no proper support for signals, single-stepping, etc.

Additionally there are extra GDB patches stored in pkgsrc-wip (created by me last year), that implement the gdbserver support for NetBSD/amd64. gdbserver is a GDB version that makes it possible to remotely debug other programs even across different Operating Systems and CPUs. This code has still not been merged into the mainline base-system version. This month, I have discovered that support needs to be reworked, as the preexisting source code directory hierarchy was rearranged.

Unless otherwise specified all the following changes were upstreamed to the mainstream GDB repository. According to the GDB schedule, the GDB10 branch point is planned on 2020-05-15 with release on 2020-06-05. It's a challenge to see how much the GDB support can be improved by then for NetBSD!

PSIM

The GDB debugger contains PSIM (Model of the PowerPC Architecture), originally developed by Andrew Cagney between 1994 and 1996. This is a simulator that contains, among other things, NetBSD support in the UEA mode. This means that GDB can run static programs prebuilt for NetBSD without executing them on real PowerPC hardware. In order to make it work, the kernel interfaces such as syscalls, errno values and signals need to be wrapped and handled in the simulator.

I have updated the list of errno names and signal names with NetBSD 9.99.49.

It would still be nice to update the list of syscalls to reflect current kernels, but I have deferred this to the future.

bfd changes

The AArch64 (NetBSD/evbarm) target uses PT_GETREGS and PT_GETFPREGS operation names with the same Machine Dependent values as NetBSD/alpha and NetBSD/sparc. This knowledge is required as these values are used in core(5) files, as emitted by a crashing program. I've added a patch that recognizes these ELF notes in arm64 coredumps appropriately.

I've also added a new constant, NT_NETBSDCORE_AUXV. This allows properly identifying AUXV ELF notes in core files. Meanwhile I have implemented and added detection of LWPSTATUS notes. This note carries meta information (name, signal context, TLS base, etc.) about the threads of a process in a core file.

The number of ARM and MIPS boards supported by NetBSD is huge and there are multiple variations of them. I have fixed the detection macro in bfd to recognize more arm and mips NetBSD installations.

GDB/NetBSD fixes in CPU specific files

I have reached the state of GDB being more operational for more NetBSD ports out of the box. There were missing features and build issues that have been addressed. I have committed the following changes:

Support for NetBSD in various CPU-specific files has now improved significantly; however, there are still missing features, especially KGDB debugging and unwinding the stack over the signal trampoline. Smaller or larger changes might still be needed on a per-port basis and I will keep working on them. At a minimum, proper aarch64 support needs to be developed, as it is missing upstream. We might evaluate what to do with at least Itanium and RISC-V.

CPU Generic improvements in the GDB codebase

I've switched the nbsd_nat_target::pid_to_exec_file() function from reading /proc entries to a sysctl(3)-based solution.

As the gdbserver support is around the corner, I have improved small parts of the code base to be compatible with NetBSD. I've fixed the unconditional inclusion of alloca.h in gdbsupport. Another fix namespaced a local class reg, because it conflicted with the struct reg from the NetBSD headers.

The current logic of the get_ptrace_pid function matches the semantics of other kernels such as Linux and FreeBSD. With the guidance of upstream developers, I have disabled this function completely for NetBSD instead of patching it for the NetBSD-specific behavior of maintaining PID+LWP pairs for each internal ptid_t entry (which reflects the relation of PID, LWP and TID).

Plan for the next milestone

Finish reimplementing operational support of debugging of multi-threaded programs and upstream more patches, especially CPU-independent ones.

Accomplishment of porting ptrace(2) test scenarios

2020-03-10T17:21:32+00:00

This month I have finished porting ptrace(2) tests from other Operating Systems. I have determined which test scenarios were missing, compared to FreeBSD and Linux, and integrated them into the ATF framework. I have skipped some of the tests as the interesting behavior was already covered in existing tests (sometimes indirectly) or tools (like picotrace), or the NetBSD kernel exhibits different behavior.

As my work is reaching the end, I was trying to clean up the state with other projects.

ptrace(2) ATF tests

I have determined which test scenarios were missing and integrated them. Certain tests like wrapping FreeBSD specific pdfork(2) call were omitted as not applicable.

There are a few new tests marked as expected failures for corner cases that are scheduled to be fixed in the future.

I have also worked on SIGCHLD-based debugging and analysis of its behavior. I have found that SA_NOCLDWAIT behaves suspiciously. This flag, passed to sigaction(2), is an extension. If set, the system will not create a zombie when the child exits; instead the child process is reaped automatically. The same effect can be achieved by setting the signal handler for SIGCHLD to SIG_IGN. Currently it behaves differently under a debugger, as the child process is never collected and keeps waiting for its parent to collect it. According to my research this behavior is unexpected. A potential fix in the kernel might not be difficult, but due to time constraints I have decided to add an ATF test for this scenario, mark it as an expected failure and include a comment deferring this case to the future.

I have also refactored the remaining threaded tests, switching them from the low-level LWP API to the pthread(3) one.

Other changes

I was working on finishing projects that were left behind.

GDB and qemu upstreaming

I'm working on upstreaming NVMM support to mainline QEMU. This process is still ongoing.

I am slowly reducing the patchset against the GDB repository.

jemalloc changes

The jemalloc allocator is a general purpose malloc(3) implementation that emphasizes fragmentation avoidance and scalable concurrency support. It's the default allocator in the NetBSD Operating System since 2007.

There are a few workarounds that make jemalloc compatible with NetBSD internals and I was trying to remove them. Unfortunately, the allocator tries to initialize itself too early using a C++-like constructor and intercepts the first malloc(3). This is done before libpthread is initialized, and the pthread startup code uses malloc() when registering pthread_atfork(3) callbacks. In order to make it work, we allow premature usage of the libpthread functionality. I was trying to correct this, but I introduced slight regressions in corner cases. They are hard to debug as the allocator gets corrupted internally and misbehaves randomly (hangs, occasional crashes). I've discussed addressing this properly with the upstream developers, but as reproducing the setup requires familiarity with the NetBSD development process, we are still working on it.

Meanwhile, I have managed to correct known Undefined Behavior issues in jemalloc and address all known issues working together with upstream.

syzkaller

I received write access to the syzkaller GitHub repository. I also helped to get Kernel MSan (which detects uninitialized memory access) operational on the syzbot node.

Miscellaneous changes

I helped with the libc++ upgrade that was done by Michal Gorny (but is still not merged into mainline). As part of this work we gained support for errno codes for POSIX robust mutexes.

I have implemented missing DT_GNU_HASH support as specified by GNU and LLVM linkers. This code was based on the implementation from three other major BSDs.
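
For orientation, DT_GNU_HASH tables are keyed by a simple string hash over the symbol name. The function below is the widely documented form of that hash, written as a standalone sketch; it is not copied from the NetBSD dynamic linker, and the name gnu_hash is chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Symbol-name hash used by DT_GNU_HASH tables:
 * start at 5381, then h = h * 33 + c for every byte of the name.
 */
static uint32_t
gnu_hash(const char *name)
{
	uint32_t h = 5381;

	assert(name != NULL);
	for (const unsigned char *p = (const unsigned char *)name;
	    *p != '\0'; p++)
		h = h * 33 + *p;
	return h;
}
```

The arithmetic deliberately wraps around at 32 bits, which is why the accumulator is a uint32_t rather than an unsigned long.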

The micro-UBSan implementation gained support for alignment_assumptions. A number of UBSan reports were addressed.

Plan for the next and the last milestone

Upstream gdbserver support and address as many remaining bugs as time permits.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

http://netbsd.org/donations/#how-to-donate

Fundraising 2020

2020-02-13T13:37:33+00:00

Is it really more than 10 years since we last had an official fundraising drive?

Looking at old TNF financial reports I noticed that we have been doing quite well financially over the last years, with a steady stream of small and medium donations, and most of the time only moderate expenditures. The last fundraising drive back in 2009 was a giant success, and we have lived off it until now.

In the last two or three years the core team was able to find developers to take on various funded development tasks. Not all of them were fully successful and integrated into the main source tree (the WiFi IEEE 802.11 rework, for example, still needs to be finished), but others pushed the project forward in big steps, such as support for "Arm ServerReady" compliant machines (SBBR+SBSA), which debuted with the new aarch64 architecture in NetBSD 9.0.

There is more room for improvement, and volunteer time is not always available, so funding some critical parts of development makes NetBSD better faster.

Besides the big development contracts we often buy hardware for developers working on special machines, and we also invest in our server infrastructure.

But now it is time: we would like to officially ask for donations this year. We are trying to raise $50,000 in 2020, to support ongoing development and new upcoming contracts - helping to make NetBSD 10 happen this year and be the best NetBSD ever!

The NetBSD Foundation is a non-profit organization as per section 501(c)(3) of the Internal Revenue Code. If you are a US company or citizen, your donations may be tax deductible. Your donations may also be eligible for matching offers from your employer.

Approaching the end of work on ptrace(2)

2020-02-11T17:08:08+00:00

This is one of my last reports on enhancements to ptrace(2) and the surrounding code. This month I completed a set of older pending tasks.

dlinfo(3) - information about a dynamically loaded object

I've documented dlinfo(3) and added the missing cross-reference from the dlfcn(3) API. dlinfo(3) is a Solaris-style interface that first appeared in NetBSD 5.1. Today this API is also implemented in Linux and FreeBSD, making it the right portable tool for certain operations. Of its several operations, we currently support only RTLD_DI_LINKMAP, which translates the opaque dlopen(3) handle to a struct link_map.

This is the exact functionality needed in the LLVM sanitizers. So far we have been manually extracting the struct's field at a fixed offset from an opaque pointer. Unfortunately the existing algorithm was sensitive to internal modifications of the ELF loader and used to break a few times a year.

#define _GET_LINK_MAP_BY_DLOPEN_HANDLE(handle, shift) \
  ((link_map *)((handle) == nullptr ? nullptr : ((char *)(handle) + (shift))))

#if defined(__x86_64__)
#define GET_LINK_MAP_BY_DLOPEN_HANDLE(handle) \
  _GET_LINK_MAP_BY_DLOPEN_HANDLE(handle, 264)
#elif defined(__i386__)
#define GET_LINK_MAP_BY_DLOPEN_HANDLE(handle) \
  _GET_LINK_MAP_BY_DLOPEN_HANDLE(handle, 136)
#endif

I was pondering how to implement a dedicated interface until I found this interface in rump kernels! Antti Kantee, a NetBSD developer and the author of rump kernels, implemented this feature but left it undocumented. I quickly scanned this interface, picked up the man page from FreeBSD, adapted it for NetBSD and pushed the corresponding change to LLVM. The FreeBSD support in the sanitizers also suffered from the struct offset changes and, in collaboration with me from the NetBSD side, we switched the appropriate code to a dynamic value. This also improves compatibility with all NetBSD ports that could potentially support the LLVM sanitizers, as previously the offsets were calculated only for x86 platforms (i386, amd64).
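
A dlinfo(3)-based replacement for the offset macros could look roughly like the sketch below. The helper name link_map_from_handle is made up for this example and this is not the code that went into LLVM; the _GNU_SOURCE define is only there because glibc hides dlinfo() without it.

```c
#define _GNU_SOURCE	/* needed for dlinfo() on glibc; harmless elsewhere */
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>

/*
 * Translate a dlopen(3) handle into its struct link_map via dlinfo(3)
 * instead of adding a hardcoded offset to the opaque pointer.
 * Returns NULL if the handle is NULL or the lookup fails.
 */
static struct link_map *
link_map_from_handle(void *handle)
{
	struct link_map *map = NULL;

	if (handle == NULL)
		return NULL;
	if (dlinfo(handle, RTLD_DI_LINKMAP, &map) == -1)
		return NULL;
	return map;
}
```

Because the lookup goes through a documented call, nothing in it depends on the CPU architecture or on the private layout of the ELF loader's object structure. On older glibc versions the program additionally has to be linked with -ldl.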

`ktruss(1)` enhancements

I've switched ktruss(1) to ptrace descriptive operation names. Previously ptrace(2) requests were encoded with a magic value like 0x10 or 0x14. This style of encoding fields is not human-friendly and required checking appropriate system headers. Now the output looks like this (reduced to ptrace syscalls only):

  3902      1 t_ptrace_waitpid ptrace(PT_ATTACH, 0xfd7, 0, 0) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_GET_SIGINFO, 0xfd7, 0x7f7fff51a4f0, 0x88) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_CONTINUE, 0xfd7, 0x1, 0) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_GET_SIGINFO, 0xfd7, 0x7f7fff51a4f0, 0x88) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_CONTINUE, 0xfd7, 0x1, 0x9) = 0
  5449      1 t_ptrace_waitpid ptrace(PT_TRACE_ME, 0, 0, 0) = 0
  6086      1 t_ptrace_waitpid ptrace(PT_GET_SIGINFO, 0x1549, 0x7f7fffa7c230, 0x88) = 0
  6086      1 t_ptrace_waitpid ptrace(PT_IO, 0x1549, 0x7f7fffa7c210, 0x20) = 0
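
The idea behind the symbolic output can be sketched with a small lookup table. This is an illustration only, not the ktruss(1) implementation: the table lists just a handful of requests that exist under the same names on several systems, and ptrace_request_name is a name invented for this example.

```c
#include <sys/types.h>
#include <sys/ptrace.h>

#include <stddef.h>

/* Pair each request value with its spelling; #x is not macro-expanded. */
#define REQ(x) { (int)(x), #x }

static const struct {
	int value;
	const char *name;
} ptrace_requests[] = {
	REQ(PT_TRACE_ME),
	REQ(PT_CONTINUE),
	REQ(PT_KILL),
	REQ(PT_ATTACH),
	REQ(PT_DETACH),
};

/* Return the symbolic name of a ptrace(2) request, or NULL if unknown. */
static const char *
ptrace_request_name(int req)
{
	size_t n = sizeof(ptrace_requests) / sizeof(ptrace_requests[0]);

	for (size_t i = 0; i < n; i++) {
		if (ptrace_requests[i].value == req)
			return ptrace_requests[i].name;
	}
	return NULL;
}
```

Taking the values from the system header rather than hardcoding numbers keeps the table correct on every platform, since the numeric encoding of the requests differs between operating systems.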

libc threading stubs improvements

I have adjusted the error return value of pthread_sigmask(3) for the case when libpthread is either not loaded or not initialized. Instead of returning -1, the stub now returns the errno value on error. This catches up with the fix made in libpthread by Andrew Doran in 2008, in lib/libpthread/pthread_misc.c r.1.9. It remains an open question whether this function should be used without linking in the POSIX thread library.

This incompatibility was detected by Bruno Haible (GNU) and documented in gnulib in commit pthread_sigmask: Avoid test failure on NetBSD 8.0. r. 4d16a83b0c1fcb6c.

Compatibility fixes in `compat_netbsd32(8)`

There was a longstanding incompatibility between a few syscalls in the native NetBSD ABI and the 32-bit compat one. I've addressed this with the following change:

commit 2e2cec309dca0021e364d52597388665c500d66e
Author: kamil 
Date:   Sat Jan 18 07:33:24 2020 +0000

    Catch up after getpid/getgid/getuid changes in native ABI in 2008
    
    getpid(), getuid() and getgid() used to call respectively sys_getpid(),
    sys_getuid() and sys_getgid(). In the BSD4.3 compat mode there was a
    fallback to call sys_getpid_with_ppid() and related functions.
    
    In 2008 the compat ifdef was removed in sys/kern/syscalls.master r. 1.216.
    
    For purity reasons we probably shall restore the NetBSD original behavior
    and implement BSD4.3 one as a compat module, however it is not worth the
    complexity.
    
    Align the netbsd32 compat ABI to native ABI and call functions that return
    two integers as in BSD4.3.

New ptrace(2) ATF tests (OpenBSD)

I have finished importing the OpenBSD test (regress/sys/ptrace/ptrace.c) for unaligned program counter register. For a long time it was cryptic what the original intention was to test, whether it was some concept of what values should be rejected in the API for purity reasons, whether we were testing SIGILL, whether there was a kernel problem or something else.

Finally, I reached the original committer (Miod Vallat) and he pointed out to me that there was a sparc/OpenBSD bug that could break the kernel.

https://marc.info/?l=openbsd-bugs&m=107558043319084&w=2

Instead of hardcoding a magic program counter value on a per-CPU basis, I added tests using 3 types of values for all CPUs. At the end of the day these tests crashed the NetBSD kernel for at least a single port.

New `ptrace(2)` ATF tests (FreeBSD)

Scanning the ptrace(2) scenarios in FreeBSD, I decided to expand the ATF framework for unrelated tracer variation (as opposed to real parent = debugger) for tests: fork1-16, vfork1-16, posix_spawn1-16. New tests were also introduced for: posix_spawn_detach_spawner, fork_detach_forker, vfork_detach_vforkerdone, posix_spawn_kill_spawner, fork_kill_forker, vfork_kill_vforker, vfork_kill_vforkerdone.

`libpthread(3)` enhancements

I have added missing sanity checks in the libpthread(3) API functions. This means that now on entry to calls like pthread_mutex_lock(3) we will first check whether the passed object is a valid and alive mutex. I have retired conditional (preprocessor variable) checks in API families (rwlock, spin-lock) as they were always enabled. This behavior matches the Linux implementation and is assumed in LLVM sanitizers. I also found out that these semantics are expected by third-party code, especially WolfSSL.

Our current sanity checking version of pthread_equal(3) found at least one abuse of the interface in the NSPR package and its downstream users: Firefox, Thunderbird and Seamonkey. For the time being a workaround has been applied and the bug is scheduled for proper investigation and fix.
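
The kind of entry check described above can be sketched with a toy lock type. Everything here (struct toy_mutex, TOY_MUTEX_MAGIC and the toy_mutex_* functions) is hypothetical and does not mirror the real pthread_mutex_t layout or the libpthread code; it only shows the pattern of validating an object before operating on it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MUTEX_MAGIC 0x4d555458u	/* arbitrary marker for this sketch */

/* Toy lock type; NOT the real pthread_mutex_t. */
struct toy_mutex {
	uint32_t magic;	/* TOY_MUTEX_MAGIC while the object is alive */
	int locked;
};

static int
toy_mutex_init(struct toy_mutex *m)
{
	if (m == NULL)
		return EINVAL;
	m->magic = TOY_MUTEX_MAGIC;
	m->locked = 0;
	return 0;
}

static int
toy_mutex_destroy(struct toy_mutex *m)
{
	if (m == NULL || m->magic != TOY_MUTEX_MAGIC)
		return EINVAL;
	m->magic = 0;	/* later calls on this object now fail the check */
	return 0;
}

static int
toy_mutex_trylock(struct toy_mutex *m)
{
	/* Sanity check on entry: reject never-initialized or destroyed objects. */
	if (m == NULL || m->magic != TOY_MUTEX_MAGIC)
		return EINVAL;
	if (m->locked)
		return EBUSY;
	m->locked = 1;
	return 0;
}
```

A check like this turns use of an uninitialized or destroyed object into an immediate EINVAL instead of silent corruption, which is how misuse in third-party code tends to surface.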

While working on this, I found that the jemalloc (the system's malloc) initialization is misused: we initialize a jemalloc mutex with uninitialized data. A workaround has been applied by Christos Zoulas, but we are still looking for a proper initialization model for our libc, libpthread and malloc code. There is a problem with mutual dependencies between the components and we are looking for a clean solution.

`env(1)` and `putenv(3)`

The NetBSD project maintains a LLVM buildbot in the LLVM buildfarm. We build and execute tests for virtually every LLVM project which we find important (llvm, clang, lldb, lld, compiler-rt, polly, openmp, libc++, libc++abi etc.) A frequent issue is that LLVM developers use constructs that are portable only to a selection of OSs that are worth supporting in the tests. One such incompatibility introduced from time to time is env(1) with the -u option. This option is an extension to POSIX, but supported by FreeBSD, Linux and Darwin. It unsets a variable in the environment.

It was a constant cost to see this regressing over and over again on the NetBSD buildbot and I decided to just implement it natively. While there, I have implemented another popular option: -0. This switch ends each output line with NUL, not newline.

While investigating the env(1) differences between Operating Systems, I have learnt that putenv(3) changed its behavior. This function call originally allocated its content internally, however it was changed in later versions, as requested by POSIX, to not allocate memory. I have audited the base system to see whether we obey these semantics everywhere and detected a use-after-free bug in the PAM (Pluggable Authentication Modules framework)! I have fixed the problem, which has potential security implications.

commit 5c19ff198d69f48222ab4907235addbfa60d4e2a
Author: kamil 
Date:   Sat Feb 8 13:44:35 2020 +0000

    Avoid use-after-free bug in PAM environment
    
    Traditional BSD putenv(3) was creating an internal copy of the passed
    argument. Unfortunately this was causing memory leaks and was changed by
    POSIX to not allocate.
    
    Adapt the putenv(3) usage to modern POSIX (and NetBSD) semantics.

diff --git a/usr.bin/login/login_pam.c b/usr.bin/login/login_pam.c
index 66303006b514..7d877ebeb1c0 100644
--- a/usr.bin/login/login_pam.c
+++ b/usr.bin/login/login_pam.c
@@ -1,6 +1,6 @@
-/*     $NetBSD: login_pam.c,v 1.25 2015/10/29 11:31:52 shm Exp $       */
+/*     $NetBSD: login_pam.c,v 1.26 2020/02/08 13:44:35 kamil Exp $       */
 
 /*-
  * Copyright (c) 1980, 1987, 1988, 1991, 1993, 1994
  *	The Regents of the University of California.  All rights reserved.
  *
@@ -37,11 +37,11 @@ __COPYRIGHT("@(#) Copyright (c) 1980, 1987, 1988, 1991, 1993, 1994\
 
 #ifndef lint
 #if 0
 static char sccsid[] = "@(#)login.c	8.4 (Berkeley) 4/2/94";
 #endif
-__RCSID("$NetBSD: login_pam.c,v 1.25 2015/10/29 11:31:52 shm Exp $");
+__RCSID("$NetBSD: login_pam.c,v 1.26 2020/02/08 13:44:35 kamil Exp $");
 #endif /* not lint */
 
 /*
  * login [ name ]
  * login -h hostname	(for telnetd, etc.)
@@ -600,12 +600,12 @@ skip_auth:
 	 */
 	if ((pamenv = pam_getenvlist(pamh)) != NULL) {
 		char **envitem;
 
 		for (envitem = pamenv; *envitem; envitem++) {
-			putenv(*envitem);
-			free(*envitem);
+			if (putenv(*envitem) == -1)
+				free(*envitem);
 		}
 
 		free(pamenv);
 	}
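
The semantics behind this fix can be observed with a small experiment. This is a sketch assuming the modern POSIX behavior described in the commit message; the function name putenv_uses_caller_buffer and the variable name PUTENV_DEMO are made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if putenv(3) keeps using the caller's buffer (modern POSIX
 * behavior), 0 if it made a private copy (traditional BSD behavior),
 * or -1 on error.
 */
static int
putenv_uses_caller_buffer(void)
{
	/* Static storage: the string must outlive the call. */
	static char entry[] = "PUTENV_DEMO=old";
	const char *value;

	if (putenv(entry) != 0)
		return -1;

	/* Modify the buffer after the call, then look the variable up. */
	memcpy(entry + strlen("PUTENV_DEMO="), "new", 3);
	value = getenv("PUTENV_DEMO");
	if (value == NULL)
		return -1;
	return strcmp(value, "new") == 0;
}
```

When the environment points straight at the caller's buffer, freeing that buffer after a successful putenv(3) leaves a dangling pointer behind, which is why the patched loop only frees the entry when putenv(3) fails.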

Other changes

I have fixed LLVM sanitizers after recent basesystem changes and they are again functional. There was a need to adapt the code for urio(4) device driver removal (USB driver for the Diamond Multimedia Rio500 MP3 player). With the clang version 9 we were also required to adapt paths for installation of the sanitizers.

I have fixed a race in the ptrace(2) ATF test resume1. There was a bug when two events were triggered concurrently from two competing threads. This bug appeared after recent changes by Andrew Doran who refactored the internals and so the race was now more frequent than before.

I am working with students and volunteers who are learning how to use sanitizers in the NetBSD context. Over the past month we fixed MKLIBCSANITIZER build with recent GCC 8.x and addressed a number of Undefined Behavior reports in the system.

I have helped to upstream chunks of the NetBSD code to GDB and GCC. I'm working on upstreaming NVMM support to qemu. The patchset is still in review waiting for more feedback before the final merge.

I have imported realpath(1) utility from FreeBSD as it is used sometimes by existing scripts (even if there are more portable alternatives).

Plan for the next milestone

Port remaining ptrace(2) test scenarios from Linux and FreeBSD to ATF and ensure that they are properly operational.

This work was sponsored by The NetBSD Foundation.

http://netbsd.org/donations/#how-to-donate

Improving the ptrace(2) API and preparing for LLVM-10.0

2020-01-13T20:03:02+00:00

This month I have improved the NetBSD ptrace(2) API, removing one legacy interface with a few flaws and replacing it with two new calls with new features, and removing technical debt.

As LLVM 10.0 is branching soon (Jan 15th 2020), I worked on proper support of the LLVM features for NetBSD 9.0 (today RC1) and NetBSD HEAD (future 10.0).

ptrace(2) API changes

There are around 20 Machine Independent ptrace(2) calls. The origin of some of these calls traces back to BSD4.3. The PT_LWPINFO call was introduced in 2003 and was loosely inspired by a similar interface in HP-UX ttrace(2). As that was early in the history of POSIX threads and SMP support, not every bit of the interface remained ideal for current computing needs.

The PT_LWPINFO call was originally intended to retrieve the thread (LWP) information inside a traced process.

This call was designed to work as an iterator over threads to retrieve the LWP id + event information. The event information is received in a raw format (PL_EVENT_NONE, PL_EVENT_SIGNAL, PL_EVENT_SUSPENDED).

Problems:

1. PT_LWPINFO shares the operation name with PT_LWPINFO from FreeBSD that works differently and is used for different purposes:

On FreeBSD PT_LWPINFO returns pieces of information for the suspended thread, not the next thread in the iteration.
FreeBSD uses a custom interface for iterating over threads (actually retrieving the threads is done with PT_GETNUMLWPS + PT_GETLWPLIST).
There is almost no overlapping correct usage of PT_LWPINFO on NetBSD and PT_LWPINFO on FreeBSD, and this causes confusion and misuse of the interfaces (recently I fixed such misuse in the DTrace code).

2. pl_event can only return whether a signal was emitted to all threads or a single one. There is no information whether this is a per-LWP signal or per-PROC signal, no siginfo_t information is attached etc.

3. Syncing our behavior with FreeBSD would mean complete breakage of our PT_LWPINFO users and it is actually unnecessary, as we receive full siginfo_t through the Linux-like PT_GET_SIGINFO, instead of reimplementing siginfo_t inside ptrace_lwpinfo in FreeBSD style. (FreeBSD wanted to follow NetBSD and adopt some of our APIs in ptrace(2) and signals.)

4. Our PT_LWPINFO is unable to list LWP ids in a traced process.

5. The PT_LWPINFO semantics cannot be used in core files as-is (as our PT_LWPINFO returns the next LWP, not the indicated one) and pl_event is redundant with netbsd_elfcore_procinfo.cpi_siglwp, and still less powerful (as it cannot distinguish between a per-LWP and a per-PROC signal in a single-threaded application).

6. PT_LWPINFO is already documented in the BUGS section of ptrace(2), as it contains additional flaws.

Solution:

1. Remove PT_LWPINFO from the public ptrace(2) API, keeping it only as a hidden namespaced symbol for legacy compatibility.

2. Introduce PT_LWPSTATUS, which queries the kernel about an exact thread and retrieves useful information about that LWP.

3. Introduce PT_LWPNEXT with the iteration semantics from PT_LWPINFO, namely return the next LWP.

4. Include per-LWP information in core(5) files as "PT_LWPSTATUS@nnn".

5. Fix flattening the signal context in netbsd_elfcore_procinfo in core(5) files, and move per-LWP signal information to the per-LWP structure "PT_LWPSTATUS@nnn".

6. Do not bother with FreeBSD-like PT_GETNUMLWPS + PT_GETLWPLIST calls, as this is a micro-optimization. We intend to retrieve the list of threads once on attach/exec and later trace them through the LWP events (PTRACE_LWP_CREATE, PTRACE_LWP_EXIT). It's more important to keep compatibility with current usage of PT_LWPINFO.

7. Keep the existing ATF tests for PT_LWPINFO to avoid rot.
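
The bookkeeping described in point 6 (fetch the thread list once, then follow creation and exit events) can be modeled in plain C. This is only a userland sketch: the names lwp_set, lwp_event, LWP_CREATED and LWP_EXITED are hypothetical stand-ins for the PTRACE_LWP_CREATE and PTRACE_LWP_EXIT reports, and no ptrace(2) call is made.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64	/* arbitrary limit chosen for this sketch */

/* Hypothetical event kinds standing in for PTRACE_LWP_CREATE/PTRACE_LWP_EXIT. */
enum lwp_event { LWP_CREATED, LWP_EXITED };

struct lwp_set {
	int	ids[MAX_TRACKED];
	size_t	count;
};

/* Seed the set once, e.g. from an initial walk over the tracee's LWPs. */
static void
lwp_set_init(struct lwp_set *s, const int *initial, size_t n)
{
	s->count = 0;
	for (size_t i = 0; i < n && i < MAX_TRACKED; i++)
		s->ids[s->count++] = initial[i];
}

static bool
lwp_set_contains(const struct lwp_set *s, int lwpid)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->ids[i] == lwpid)
			return true;
	return false;
}

/* Apply one event; returns false if it is inconsistent with the set. */
static bool
lwp_set_apply(struct lwp_set *s, enum lwp_event ev, int lwpid)
{
	if (ev == LWP_CREATED) {
		if (lwp_set_contains(s, lwpid) || s->count == MAX_TRACKED)
			return false;
		s->ids[s->count++] = lwpid;
		return true;
	}
	for (size_t i = 0; i < s->count; i++) {
		if (s->ids[i] == lwpid) {
			/* Swap-remove: order of the set does not matter. */
			s->ids[i] = s->ids[--s->count];
			return true;
		}
	}
	return false;
}
```

A tracer built this way never needs a "list all LWPs" request after the initial walk, which is the reason given above for not adding PT_GETNUMLWPS + PT_GETLWPLIST.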

PT_LWPSTATUS and PT_LWPNEXT operate over the newly introduced "struct ptrace_lwpstatus". This structure is inspired by:

- SmartOS lwpstatus_t,
- struct ptrace_lwpinfo from NetBSD,
- struct ptrace_lwpinfo from FreeBSD,

and their usage in real existing open-source software.

#define PL_LNAMELEN 20 /* extra 4 for alignment */

struct ptrace_lwpstatus {
 lwpid_t  pl_lwpid;  /* LWP described */
 sigset_t pl_sigpend;  /* LWP signals pending */
 sigset_t pl_sigmask;  /* LWP signal mask */
 char  pl_name[PL_LNAMELEN]; /* LWP name, may be empty */
 void  *pl_private;  /* LWP private data */
 /* Add fields at the end */
};

pl_lwpid is picked from PT_LWPINFO.
pl_event is removed entirely as useless, misleading and harmful.
pl_sigpend and pl_sigmask are mainly intended to untangle the cpi_sig* fields from "struct netbsd_elfcore_procinfo" (fix "XXX" in the kernel code).
pl_name is an easy-to-use API to retrieve the LWP name, replacing sysctl() retrieval. (Previous algorithm: retrieve the number of LWPs; retrieve all LWPs; iterate over them; find the matching ID; copy the LWP name.) pl_name will also be used to include the currently missing LWP name information in core(5) files.
pl_private implements a currently missing interface to read the TLS base value.
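
To make the difference between the two requests concrete, here is a small userland model of "describe exactly this LWP" versus "describe the next LWP". It does not call ptrace(2); struct model_lwpstatus is a cut-down, hypothetical copy of the structure above, and the convention that an ID of 0 starts the iteration and marks its end is an assumption carried over from how PT_LWPINFO iteration is documented.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_LNAMELEN 20

/* Simplified stand-in for struct ptrace_lwpstatus (signal sets, TLS omitted). */
struct model_lwpstatus {
	int	pl_lwpid;
	char	pl_name[MODEL_LNAMELEN];
};

/*
 * PT_LWPSTATUS-like lookup: describe exactly the LWP named in out->pl_lwpid.
 * Returns 0 on success, -1 if there is no such LWP.
 */
static int
model_pt_lwpstatus(const struct model_lwpstatus *tbl, size_t n,
    struct model_lwpstatus *out)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].pl_lwpid == out->pl_lwpid) {
			*out = tbl[i];
			return 0;
		}
	}
	return -1;
}

/*
 * PT_LWPNEXT-like iteration: out->pl_lwpid == 0 means "start"; otherwise the
 * entry following out->pl_lwpid is returned. Past the end, pl_lwpid is 0.
 */
static int
model_pt_lwpnext(const struct model_lwpstatus *tbl, size_t n,
    struct model_lwpstatus *out)
{
	size_t next = 0;

	if (out->pl_lwpid != 0) {
		size_t i;

		for (i = 0; i < n; i++)
			if (tbl[i].pl_lwpid == out->pl_lwpid)
				break;
		if (i == n)
			return -1;	/* unknown starting point */
		next = i + 1;
	}
	if (next < n)
		*out = tbl[next];
	else
		memset(out, 0, sizeof(*out));
	return 0;
}
```

The lookup form is what a core file reader needs, since a note such as "PT_LWPSTATUS@nnn" describes one indicated LWP rather than its successor.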

I have decided to avoid a writable version of PT_LWPSTATUS that rewrites signals, name, or private pointer. These options are practically unused in existing open-source software. There are two exceptions that I am familiar with, but both are specific to kludges overusing ptrace(2). If these operations are needed, they can be implemented without a writable version of PT_LWPSTATUS, patching tracee's code.

I have switched GDB (in base), LLDB, picotrace and sanitizers to the new API. As NetBSD 9.0 is nearing release, this API change will land in NetBSD 10.0 and existing ptrace(2) software will use PT_LWPINFO for now.

New interfaces are ensured to be stable and continuously verified by the ATF infrastructure.

pthreadtracer

Early in the history of libpthread, the NetBSD developers designed and programmed a libpthread_dbg library. It was initially intended to handle user-space scheduling of threads in the M:N threading model inspired by Solaris.

After the switch of the internals to new SMP design (1:1 model) by Andrew Doran, this library lost its purpose and was no longer used (except being linked for some time in a local base system GDB version). I removed the libpthread_dbg when I modernized the ptrace(2) API, as it no longer had any use (and it was broken in several ways for years without being noticed).

As I have introduced the PT_LWPSTATUS call, I have decided to verify this interface in a fancy way. I have mapped ptrace_lwpstatus::pl_private onto the tls_tcb structure as it is defined in the sys/tls.h header:

struct tls_tcb {   
#ifdef __HAVE_TLS_VARIANT_I
        void    **tcb_dtv;
        void    *tcb_pthread;
#else
        void    *tcb_self;
        void    **tcb_dtv;
        void    *tcb_pthread;
#endif
};

The pl_private pointer is in fact a pointer into the debuggee's address space, pointing to a tls_tcb structure. This is not universally true in every environment, but it holds in regular programs using the ELF loader and the libpthread library. With the tcb_pthread field we can then reference a regular C-style pthread_t object. Wrapping this into a real tracer, I have implemented a program that can either start a debuggee or attach to a process and, on demand (from a SIGINFO handler, usually triggered in the BSD environment with ctrl-t), dump the full state of the pthread_t objects within a process. A part of the example usage is below:

$ ./pthreadtracer -p `pgrep nslookup` 
[ 21088.9252645] load: 2.83  cmd: pthreadtracer 6404 [wait parked] 0.00u 0.00s 0% 1600k
DTV=0x7f7ff7ee70c8 TCB_PTHREAD=0x7f7ff7e94000
LID=4 NAME='sock-0' TLS_TSD=0x7f7ff7eed890
pt_self = 0x7f7ff7e94000
pt_tls = 0x7f7ff7eed890
pt_magic = 0x11110001 (= PT_MAGIC=0x11110001)
pt_state = 1
pt_lock = 0x0
pt_flags = 0
pt_cancel = 0
pt_errno = 35
pt_stack = {.ss_sp = 0x7f7fef9e0000, ss_size = 4194304, ss_flags = 0}
pt_stack_allocated = YES
pt_guardsize = 65536

Full log is stored here. The source code of this program, on top of picotrace is here.
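
The step from pl_private to the pthread_t address can be sketched without a live tracee. The code below is an illustration only, not the pthreadtracer source: tls_tcb_v2 copies the non-variant-I branch of the structure above, and read_tracee_fn is a hypothetical callback that a real tracer would back with a ptrace(2) memory read. It also assumes tracer and tracee share the same ABI, so the structure layout matches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of the #else branch (no __HAVE_TLS_VARIANT_I) shown above. */
struct tls_tcb_v2 {
	void	*tcb_self;
	void	**tcb_dtv;
	void	*tcb_pthread;
};

/*
 * Hypothetical callback: copy len bytes from tracee address addr into buf.
 * Returns 0 on success.
 */
typedef int (*read_tracee_fn)(void *cookie, uintptr_t addr, void *buf,
    size_t len);

/*
 * Fetch the TCB that pl_private points at and return the tracee address of
 * the pthread object, or 0 on failure.
 */
static uintptr_t
pthread_addr_from_private(read_tracee_fn rd, void *cookie,
    uintptr_t pl_private)
{
	struct tls_tcb_v2 tcb;

	if (pl_private == 0)
		return 0;
	if (rd(cookie, pl_private, &tcb, sizeof(tcb)) != 0)
		return 0;
	return (uintptr_t)tcb.tcb_pthread;
}

/* Demonstration reader: treats the "tracee" address as a local pointer. */
static int
read_local(void *cookie, uintptr_t addr, void *buf, size_t len)
{
	(void)cookie;
	memcpy(buf, (const void *)addr, len);
	return 0;
}
```

Reading the fields of the pthread object itself still requires knowledge of the libpthread internals, which is the limitation discussed next.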

The problem with this utility is that it requires the libpthread sources to be available and reachable by the build rules. pthreadtracer reaches each field of pthread_t knowing its exact internal structure. This is enough for validation of PT_LWPSTATUS, but is it enough for shipping it to users and finding its real-world use case? Debuggers (GDB, LLDB) using debug information can reach the same data with DWARF, but supporting DWARF in pthreadtracer is currently harder than it ought to be for the interface tests. There is also the option of reviving libpthread_dbg(3) at some point and revamping it for modern libpthread(3). This would help avoid DWARF introspection and it could find some use in self-introspection programs, but are there any?

LLD

I keep searching for a solution to properly support lld (LLVM linker).

NetBSD's major issue with LLVM lld is the lack of standalone linker support, which prevents it from being a real GNU ld replacement. I was forced to publish a standalone wrapper for lld, called lld-standalone, and host it on GitHub for the time being, at least until we sort out the talks with the LLVM developers.

LLVM sanitizers

As the NetBSD code is evolving, there is a need to support multiple kernel versions starting from 9.0 with the LLVM sanitizers. I have introduced the following changes:

[compiler-rt] [netbsd] Switch to syscall for ThreadSelfTlsTcb()
[compiler-rt] [netbsd] Add support for versioned statvfs interceptors
[compiler-rt] Sync NetBSD ioctl definitions with 9.99.26
[compiler-rt] [fuzzer] Include stdarg.h for va_list
[compiler-rt] [fuzzer] Enable LSan in libFuzzer tests on NetBSD
[compiler-rt] Enable SANITIZER_CAN_USE_PREINIT_ARRAY on NetBSD
[compiler-rt] Adapt stop-the-world for ptrace changes in NetBSD-9.99.30
[compiler-rt] Adapt for ptrace(2) changes in NetBSD-9.99.30

The purpose of these changes is as follows:

Stop using an internal interface to retrieve the tls_tcb struct (TLS base) and switch to the public API with the syscall _lwp_getprivate(2). While there, I have harmonized the namespacing of __lwp_getprivate_fast() and __lwp_gettcb_fast() in the NetBSD distribution. Now, every port will need to use the same define (-D_RTLD_SOURCE, -D_LIBC_SOURCE or -D__LIBPTHREAD_SOURCE__). Previously these interfaces were conflicting with the public namespaces (affecting kernel builds) and wrongly suggesting that they might be available to public third-party code. Initially I used them in the LLVM sanitizers, but then switched to the full syscall _lwp_getprivate().
Nowadays almost every mainstream OS implements support for preinit/initarray/finitarray in all ports, regardless of ABI requirements. NetBSD originally supported these features only when they were mandated by an ABI specification. Christos Zoulas enabled these features for all CPUs in 2018, and this eventually allowed the feature to be enabled unconditionally for consumption in the sanitizer code. This allows use of the same interface as on Linux or Solaris, rather than relying on C++-style constructors that have their own issues (the need to abuse priorities of constructors and the lack of a guarantee that our code will be called before other constructors, which can be fatal).
Support for kernels between 9.0 and 9.99.30 (and later, unless there are breaking changes).
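
The preinit array mechanism mentioned above can be shown in a few lines. This sketch assumes an ELF target and GCC or Clang attribute syntax, and it must be linked into an executable (preinit arrays are not permitted in shared objects); the names early_init, early_init_entry and late_ctor are illustrative and not taken from compiler-rt.

```c
/* Set by the early hook; a sanitizer would initialize its runtime here. */
static int early_init_ran;

static void
early_init(void)
{
	early_init_ran = 1;
}

/*
 * Place a pointer to early_init() in the .preinit_array section so that it is
 * called during startup, before any constructor of the executable.
 */
__attribute__((section(".preinit_array"), used))
static void (*early_init_entry)(void) = early_init;

/* For comparison: an ordinary constructor, which runs after the preinit array. */
static int ctor_saw_early_init;

__attribute__((constructor))
static void
late_ctor(void)
{
	ctor_saw_early_init = early_init_ran;
}
```

With constructors alone, the only ordering tool is the constructor priority, which is what the text above calls abusing priorities; the preinit array avoids that problem for code linked into the main program.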

There is still one portability issue in the sanitizers, as we hard-code the offset of the link_map field within the internal dlopen handle pointer. The dlopen handle is internal to the ELF loader and is an object of type Obj_Entry. This type is not available to third-party code and it is not stable. It also has a different layout depending on the CPU architecture. The same problem exists for at least FreeBSD, and to some extent for Linux. I have prepared a patch that utilizes the dlinfo(3) call with the option RTLD_DI_LINKMAP. Unfortunately there is a regression with MSan on NetBSD HEAD (it works on 9.0rc1) that makes it harder for me to finalize the patch. I suspect that after the switch to GCC 8, there is now incompatible behavior that causes a recursive call sequence: _Unwind_Backtrace() calling _Unwind_Find_FDE(), calling search_object, and triggering the __interceptor_malloc interceptor again, which calls _Unwind_Backtrace(), resulting in a deadlock. The offending code is located in src/external/gpl3/gcc/dist/libgcc/unwind-dw2-fde.c and needs proper investigation. A quick workaround to stop recursive stack unwinding unfortunately did not work, as there is another (related?) problem:

==4629==MemorySanitizer CHECK failed:
/public/llvm-project/llvm/projects/compiler-rt/lib/msan/msan_origin.h:104 "((stack_id)) != (0)" (0x0, 0x0)

This shows that this low-level code is very sensitive to slight changes, and needs maintenance power. We keep improving the coverage of tested scenarios on the LLVM buildbot, and we enabled sanitizer tests on 9.0 NetBSD/amd64; however we could make use of more manpower in order to reach full Linux parity in the toolchain.

Other changes

As my project in LLVM and ptrace(2) is slowly concluding, I'm trying to finalize the related tasks that were left behind.

I've finished researching why we couldn't use syscall restart on kevent(2) call in LLDB and improved the system documentation on it. I have also fixed small nits in the NetBSD wiki page on kevent(2).

I have updated the list of ELF defines for CPUs and OS ABIs in sys/exec_elf.h.

Plan for the next milestone

Port remaining ptrace(2) test scenarios from Linux, FreeBSD and OpenBSD to ATF and ensure that they are properly operational.

This work was sponsored by The NetBSD Foundation.

http://netbsd.org/donations/#how-to-donate

Board of Directors and Officers elected

2019-11-20T21:12:59+00:00

Per the membership voting, we have seated the new Board of Directors of the NetBSD Foundation:

Taylor R. Campbell <riastradh@>
William J. Coldwell <billc@>
Michael van Elst <mlelstv@>
Thomas Klausner <wiz@>
Cherry G. Mathew <cherry@>
Pierre Pronchery <khorben@>
Leonardo Taccari <leot@>

We would like to thank Makoto Fujiwara <mef@> and Jeremy C. Reed <reed@> for their service on the Board of Directors during their term(s).

The new Board of Directors have voted in the executive officers for The NetBSD Foundation:

President:	William J. Coldwell
Vice President:	Pierre Pronchery
Secretary:	Christos Zoulas
Assistant Secretary:	Thomas Klausner
Treasurer:	Christos Zoulas
Assistant Treasurer:	Taylor R. Campbell

Thanks to everyone who voted, and we look forward to a great 2020.

Stabilization of the ptrace(2) threads continued

2019-11-04T10:57:53+00:00

I have introduced changes to make debuggers more reliable in threaded scenarios. Additionally I have revamped the micro-UBSan runtime for newer Clang (version 10git). I have received the OK from core@ to switch our iconv(3) to the POSIX-conformant iconv(3), and I have adapted the affected software in pkgsrc, where known and possible, to the newer API. This month I also continued looking for a solution to the impasse in LLD that blocks adding NetBSD support.

Threading support

I have simplified the struct proc and removed a p_oppid field that stored the numeric process id of the original parent (forker). This field is not needed as it duplicates p_opptr (current real parent pointer) that is already safe to use. So far this has not proven to be unsafe.

I have refactored the signal code making it more verbose to reflect the actual needs of the kernel signal code.

I have fixed a nasty bug in the function that is called when a thread returns from the kernel to userland. There was a tiny time window in which, in certain scenarios, a thread was never stopped on process suspension but was instead resumed, causing waitpid(2) polling to never return success, as the process can never be stopped while it has a running thread.

There was a race bug that could cause a nested thread termination call, triggering a panic.

With the above changes I was able to reliably run all ATF tests for LWP events (threading events). I have also bumped the threading tests to actually execute 100 concurrent threads, as the higher number can more easily trigger anomalies. In my observations all tests are now rock solid.

There are now no longer any ptrace(2) tests in ATF marked as flaky or disabled. The two main offenders, vfork(2) events and threading events, are now solid.

Michal Gorny detected another source of instability of threads with an LLDB regression test. It was related to emitting a massive number of concurrent threads. I have helped Michal to address this problem and squash the bug.

All of the above changes are now pulled up to NetBSD-9 for the future 9.0 release.

At the time of writing, there are 4 failing LLDB threading tests and a few more related to debug registers. Both failure types are under investigation. They could be bugs in the NetBSD support to some extent, but there may also be a need to fix something at the kernel level.

The project is still not 100% accomplished, but we are now very close to finishing everything in the domain of threads. I could torture the NetBSD kernel for a few hours with a massive number of threads and events without a single crash or failure. On the other hand, there are still likely some suspicious corner cases that need proper investigation. There are also some suspicious crash reports from syzkaller, the kernel fuzzer. Those still need to be promptly checked.

LLVM projects

I have attempted to change our original plan with LLD and instead of mutating the LLD behavior on target basis, write a dedicated LLD wrapper that tunes LLD for NetBSD. My patch is still in review. As an improvement over the previous ones, it wasn't immediately rejected... https://reviews.llvm.org/D69755.

I have upstreamed chunks of code with the following commits:

[compiler-rt] [msan] Correct the __libc_thr_keycreate prototype
[compiler-rt] [msan] Support POSIX iconv(3) on NetBSD 9.99.17+
[compiler-rt] Harmonize __sanitizer_addrinfo with the NetBSD headers
[compiler-rt] Sync NetBSD syscall hooks with 9.99.17

NetBSD distribution changes

I have switched the iconv(3) function prototype to POSIX-conformant form. The history of this function is documented in iconv(3) as follows:

STANDARDS
     iconv_open(), iconv_close(), and iconv() conform to IEEE Std 1003.1-2001
     ("POSIX.1").

     Historically, the definition of iconv has not been consistent across
     operating systems.  This is due to an unfortunate historical mistake,
     documented in this e-mail:
     https://www5.opengroup.org/sophocles2/show_mail.tpl?&source=L&listname=austin-group-l&id=7404.
     The standards page for the header file <iconv.h> defined the second
     argument of iconv() as char **, but the standards page for the iconv()
     implementation defined it as const char **.  The standards committee
     later chose to change the function definition to follow the header file
     definition (without const), even though the version with const is
     arguably more correct.  NetBSD used initially the const form.  It was
     decided to reject the committee's regression and become (technically)
     incompatible.

     This decision was changed in NetBSD 10 and the iconv() prototype was
     synchronized with the standard.

Meanwhile I fixed what was known to be affected in pkgsrc. Unfortunately Qt4/KDE4 had several build issues, and this motivated me to fix their users for the new function prototype through upgrades to the Qt5/KDE5 stack. Many dead packages without an upgrade path were dropped from pkgsrc.
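
For software being adapted, the practical consequence is that the input pointer handed to iconv() is a plain char **. The sketch below is one way to call the POSIX form without casting away const; the helper name ascii_to_utf8 and the 128-byte staging buffer are choices made for this example only. On systems that still declare the second argument as const char **, the same call would draw an incompatible-pointer diagnostic.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a NUL-terminated ASCII string to UTF-8 using the POSIX prototype,
 * whose second argument is char ** (no const).
 * Returns 0 on success, -1 on failure.
 */
static int
ascii_to_utf8(const char *src, char *dst, size_t dstsize)
{
	iconv_t cd;
	char inbuf[128];
	char *in = inbuf;
	char *out = dst;
	size_t inleft = strlen(src);
	size_t outleft;
	size_t rv;

	if (dstsize == 0 || inleft >= sizeof(inbuf))
		return -1;
	outleft = dstsize - 1;	/* reserve room for the terminator */

	/* Copy into a mutable buffer so that no const has to be cast away. */
	memcpy(inbuf, src, inleft);

	cd = iconv_open("UTF-8", "ASCII");
	if (cd == (iconv_t)-1)
		return -1;

	rv = iconv(cd, &in, &inleft, &out, &outleft);
	iconv_close(cd);
	if (rv == (size_t)-1)
		return -1;

	*out = '\0';
	return 0;
}
```

Code that has to build against both prototypes typically hides the difference behind a configure-time check; that part is outside the scope of this sketch.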

As there is a new Clang upgrade coming, I have implemented handlers for new UBSan reports: function_type_mismatch_v1() and implicit_conversion(). The first one is a new ABI for function_type_mismatch() and the second one is completely new.

GSoC Mentor Summit

I took part in the GSoC Mentor Summit in Munich and presented a talk titled "NetBSD version 9. What's new in store?".

Plan for the next milestone

Support Michal Gorny in reaching the milestone of passing all threading and debug register tests in LLDB.

This work was sponsored by The NetBSD Foundation.

http://netbsd.org/donations/#how-to-donate

Stabilization of the ptrace(2) threads

2019-10-10T10:24:10+00:00

I have introduced changes that make debuggers more reliable in threaded scenarios. Additionally, I have enhanced Leak Sanitizer support and introduced various improvements in the basesystem.

Threading support

Threads and synchronization in the kernel, in general, is an evergreen task of the kernel developers. The process of enhancing support for tracing multiple threads has been documented by Michal Gorny in his LLDB entry Threading support in LLDB continued.

Overall I have introduced these changes:

Separate the suspend-from-userland flag (_lwp_suspend(2)) from the suspend-by-debugger flag (PT_SUSPEND). This removes one of the underlying problems of threading stability, as a debuggee was able to accidentally unstop a suspended thread. This property is needed whenever we want to trace a selection (typically a single entity) of threads.
Store SIGTRAP event information inside siginfo_t, rather than in struct proc. Only a single signal can be reported to the debugger at a time, and its context is no longer prone to being overwritten by concurrent threads.
Introduce restarts in the functions that notify debuggers about events. There was a time window between registering an event by a thread, stopping the process and unlocking the mutexes of the process, in which another process could take the mutexes before being stopped and overwrite the event with its own data. Now each event routine for the debugger checks whether the process is already stopping (or demising, or no longer being tracked), preserves the signal to be emitted in a local variable on the stack of the LWP, and continues stopping itself as requested by the other LWP. Once the thread is awakened, it retries emitting the signal and delivering the event to the debugger.
Introduce PT_STOP, which combines kill(SIGSTOP) and ptrace(PT_CONTINUE,SIGSTOP) semantics in a single call. It works like:
- kill(SIGSTOP) for an unstopped tracee
- ptrace(PT_CONTINUE,SIGSTOP) for a stopped tracee
The child will be stopped and can always be waited for (with wait(2)-like calls).
For a stopped tracee kill(SIGSTOP) has no effect. PT_CONTINUE+SIGSTOP cannot be used on an unstopped process (EBUSY).
This operation is modeled after PT_KILL, which plays the same role for SIGKILL. While there, allow PT_KILL on an unstopped traced child.
This operation is useful for an abnormal exit of a debugger from a signal handler, usually followed by waitpid(2) and ptrace(PT_DETACH).
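
The PT_STOP rules above reduce to a small decision table. The following model only encodes the cases stated in the list; it makes no ptrace(2) or kill(2) calls, and every name in it (tracee_state, stop_result, model_*) is hypothetical.

```c
enum tracee_state { TRACEE_RUNNING, TRACEE_STOPPED };

enum stop_result {
	STOP_WILL_REPORT,	/* tracee stops and can be waited for */
	STOP_NO_EFFECT,		/* request silently does nothing */
	STOP_ERROR_EBUSY	/* request is rejected with EBUSY */
};

/* kill(SIGSTOP): only effective while the tracee is running. */
static enum stop_result
model_kill_sigstop(enum tracee_state st)
{
	return st == TRACEE_RUNNING ? STOP_WILL_REPORT : STOP_NO_EFFECT;
}

/* ptrace(PT_CONTINUE, SIGSTOP): only valid while the tracee is stopped. */
static enum stop_result
model_continue_sigstop(enum tracee_state st)
{
	return st == TRACEE_STOPPED ? STOP_WILL_REPORT : STOP_ERROR_EBUSY;
}

/* PT_STOP: picks whichever primitive works for the current state. */
static enum stop_result
model_pt_stop(enum tracee_state st)
{
	if (st == TRACEE_RUNNING)
		return model_kill_sigstop(st);
	return model_continue_sigstop(st);
}
```

The point of the single call is that a debugger bailing out from a signal handler does not need to know, or race to find out, which of the two states the tracee is in.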

For the sake of tracking the missed in action signals emitted by tracee, I have introduced the feature in NetBSD truss (as part of the picotrace repository) to register syscall entry (SCE) and syscall exit (SCX) calls and track missing SCE/SCX events that were never delivered. Unfortunately, the number of missing events was huge, even for simple 2-threaded applications.

    truss[2585] running for 22.205305922 seconds
    truss[2585] attached to child=759 ('firefox') for 22.204289369 seconds
    syscall                     seconds      calls     errors missed-sce missed-scx
    read                    0.048522952        609          0         54         76
    write                   0.044693735        487          0         35         66
    open                    0.002516815         18          0          5          5
    close                   0.001015263         17          0          9          6
    unlink                  0.001375463         13          0          3          0
    getpid                  0.093458089       1993          0         16         56
    geteuid                 0.000049301          1          0          0          1
    recvmsg                 0.343353019       4828       3685         90        112
    access                  0.001450653         12          3          5          4
    dup                     0.000570904         10          0          0          1
    munmap                  0.010375949         88          0          6          3
    mprotect                0.196781932       2251          0         11         62
    madvise                 0.049820002        430          0         11         18
    writev                  0.237488362       1507          0         76         67
    rename                  0.000379918          2          0          1          0
    mkdir                   0.000283846          2          2          1          2
    mmap                    0.033342935        481          0         15         40
    lseek                   0.003341775         62          0         25         24
    ftruncate               0.000507707          9          0          1          0
    __sysctl                0.000144506          2          0          0          0
    poll                   18.694195617       4531          0        106        191
    __sigprocmask14         0.001585329         20          0          0          2
    getcontext              0.000083238          1          0          0          0
    _lwp_create             0.000104646          1          0          0          0
    _lwp_self               0.001456718         22          0         24         79
    _lwp_unpark             0.035319633        607          0         14         39
    _lwp_unpark_all         0.020660377        250          0         38         50
    _lwp_setname            0.000118418          2          0          0          0
    __select50             15.125525493        637          0         82        125
    __gettimeofday50        3.279021049       2930          0         40        135
    __clock_gettime50      10.673311747      33132          0       1418       3003
    __stat50                0.006375356         52          3         12          5
    __fstat50               0.001490944         17          0          3          2
    __lstat50               0.000110906          1          0          1          0
    __getrusage50           0.008863815        109          0          7          1
    ___lwp_park60          62.720893458        964        251        454        453
                          -------------    -------    -------    -------    -------
                          111.638589870      56098       3944       2563       4628

With my kernel changes landed, the number of missed SCE/SCX events is down to zero (with the exception of syscalls that never return, such as exit(2)).
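
One simple way to count such gaps is to remember, per LWP, whether the last report was an entry or an exit. The sketch below illustrates that idea and is not the truss implementation from picotrace; it assumes LWP IDs are small enough to index a fixed array, and all identifiers are made up for the example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LWPS 64	/* arbitrary bound for this sketch */

enum sc_event { SC_ENTRY, SC_EXIT };

struct sc_tracker {
	int		inside[MAX_LWPS];	/* 1 while between SCE and SCX */
	unsigned	missed_sce;
	unsigned	missed_scx;
};

static void
sc_tracker_init(struct sc_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/* Feed one reported event for the given LWP and update the counters. */
static void
sc_tracker_observe(struct sc_tracker *t, size_t lwp, enum sc_event ev)
{
	if (lwp >= MAX_LWPS)
		return;
	if (ev == SC_ENTRY) {
		if (t->inside[lwp])
			t->missed_scx++;	/* previous exit was never seen */
		t->inside[lwp] = 1;
	} else {
		if (!t->inside[lwp])
			t->missed_sce++;	/* exit without a matching entry */
		t->inside[lwp] = 0;
	}
}
```

A syscall that never returns leaves its LWP marked as inside a call, which is why such calls have to be treated as an exception when interpreting the counters.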

Once these changes settle in HEAD, I plan to backport them to NetBSD-9. I have already received feedback that GDB works much better now.

The kernel also has now more runtime asserts that validate correctness of the code paths.

Sanitizers

I've introduced a special preprocessor macro to detect LSan (__SANITIZE_LEAK__) and UBSan (__SANITIZE_UNDEFINED__) in GCC. The patches were submitted upstream to the GCC mailing list, in two patches (LSan + UBSan). Unfortunately, GCC does not see value in feature parity with LLVM and for the time being it will be a local NetBSD specific GCC extension. These macros are now integrated into the NetBSD public system headers, for use by the basesystem software.

The LSan macro is now used inside the LLVM codebase and the ps(1) program is the first user of it. The UBSan macro is now used to disable relaxed alignment on x86. While such code is still functional, it is not clean from undefined behavior as specified by C. This is especially needed in the kernel fuzzing process, as we can reduce noise from less interesting reports.
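
A source file can key off these macros roughly as follows. The two __SANITIZE_* names are the ones described above; the __has_feature() spellings for Clang and the BUILT_WITH_* helper macros are assumptions of this sketch, and on a compiler that knows none of them both helpers simply evaluate to 0.

```c
/* Clang reports sanitizers via __has_feature(); provide a fallback. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if defined(__SANITIZE_LEAK__) || __has_feature(leak_sanitizer)
#define BUILT_WITH_LSAN 1
#else
#define BUILT_WITH_LSAN 0
#endif

#if defined(__SANITIZE_UNDEFINED__) || \
    __has_feature(undefined_behavior_sanitizer)
#define BUILT_WITH_UBSAN 1
#else
#define BUILT_WITH_UBSAN 0
#endif

/* Report which of the two sanitizers this translation unit was built with. */
static const char *
sanitizer_summary(void)
{
	if (BUILT_WITH_LSAN && BUILT_WITH_UBSAN)
		return "lsan+ubsan";
	if (BUILT_WITH_LSAN)
		return "lsan";
	if (BUILT_WITH_UBSAN)
		return "ubsan";
	return "none";
}
```

Typical uses are freeing memory at exit only when a leak detector is watching, or selecting a strictly aligned code path when UBSan is active.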

During the previous month a number of reports from kernel fuzzing were fixed. There is still more to go.

Almost all local patches needed for LSan were merged upstream. The last remaining local patch is scheduled for later as it is very invasive for all platforms and sanitizers. In the worst case we just have more false negatives in detection of leaks in specific scenarios.

Miscellaneous changes

I have fixed a regression in upstream GDB with SIGTTOU handling. This was an upstream bug fixed by Alan Hayward and cherry-picked by me. As a side effect, a certain environment setup would cause the tracer to sleep.

I have reverted the change to in6_addr that caused a regression. It appeased UBSan, but broke at least qemu networking. The regression was tracked down by Andreas Gustafsson and reported in NetBSD's bug tracking system.

I have landed a patch that returns ELF loader dl_phdr_info information for dl_iterate_phdr(3). This synchronized the behavior with Linux, FreeBSD and OpenBSD and is used by sanitizers.

I have passed through core@ the patch to change the kevent::udata type from intptr_t to void*. The former is slightly more pedantic, but the latter is what is in all other kevent users and this mismatch of types affected specifically C++ users that needed special NetBSD-only workarounds.

I have marked git and hg meta files as ones to be ignored by cvs import. This was causing problems among people repackaging the NetBSD source code with other VCS software than CVS.

I keep working on getting the GDB test-suite to run on NetBSD, and I spent some time getting fluent in the TCL programming language (as GDB uses dejagnu and TCL scripting). I have already fixed two bugs that affected NetBSD users in the TCL runtime: getaddrbyname_r and gethostbyaddr_r were falsely reported as available and picked on NetBSD, causing damage in operation. Fluency in TCL will allow me to be more efficient in addressing and debugging failing tests in GDB, and I can likely reuse this knowledge in other fields useful for the project.

I made __CTASSERT a static assert again. Previously, this homegrown compile-time check silently stopped working with C99 compilers supporting VLAs (variable length arrays). It was caught by kUBSan, which detected a VLA of dynamic size -1, something that still compiles but has unspecified runtime semantics. The new form is inspired by the Perl ctassert code and uses a bit-field constant that makes the assert effective again. A few misuses of __CTASSERT, mostly in the Linux DRMKMS code, were fixed.
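
The bit-field trick works because a bit-field width must be an integer constant expression, so a condition that is false or not constant becomes a hard compile error instead of quietly turning into a VLA. The macro below is a sketch of the idea under hypothetical names (SKETCH_CTASSERT, wire_header), not the definition from the NetBSD headers.

```c
#include <stdint.h>

#define SKETCH_CONCAT_(a, b)	a##b
#define SKETCH_CONCAT(a, b)	SKETCH_CONCAT_(a, b)

/*
 * Declares a throwaway struct type whose bit-field has width 1 when cond is
 * true and width -1 (invalid, so the build fails) when cond is false.
 */
#define SKETCH_CTASSERT(cond)						\
	typedef struct {						\
		unsigned int sketch_ctassert : (cond) ? 1 : -1;		\
	} SKETCH_CONCAT(sketch_ctassert_line_, __LINE__)

struct wire_header {
	uint32_t magic;
	uint32_t length;
};

SKETCH_CTASSERT(sizeof(uint32_t) == 4);
SKETCH_CTASSERT(sizeof(struct wire_header) == 8);
/* SKETCH_CTASSERT(sizeof(struct wire_header) == 7); would not compile. */
```

An array-based check such as a typedef of char[(cond) ? 1 : -1] looks equivalent, but inside a function with a non-constant condition it declares a variable length array, which is the silent failure mode described above.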

I have submitted a proposal to the C Working Group to add new methods for setting and getting the thread name.

Plan for the next milestone

Keep improving the reliability of the debugging interfaces and get the ATF and LLDB threading tests to pass reliably. Cover more scenarios with ptrace(2) in the ATF regression test-suite.

This work was sponsored by The NetBSD Foundation.

http://netbsd.org/donations/#how-to-donate

EuroBSDCon 2019

2019-09-25T08:40:08+00:00

Submitted by Maciej Grochowski.

This year EuroBSDCon took place in Lillehammer, Norway. I had the pleasure to attend as a speaker with my talk about fuzzing the NetBSD filesystems.

Venue

Lillehammer is a ski resort, nestled amid very beautiful scenery between mountains and lakes, just two hours from Oslo. The conference took place in the Scandic Lillehammer Hotel, a little bit away from the downtown of Lillehammer, close to the Olympic Ski Jumps.

View from the Olympic Ski Jump

Talks

Every year, EuroBSDCon has a lot of interesting talks. Unfortunately, it is hard to attend all the interesting seminars, as many of them take place at the same time, so I won't be able to highlight all of them; accordingly, I gratefully acknowledge several organizations for handling the live streaming from every session.

Keynote: Embedded Ethics

The conference started with an excellent keynote from Patricia Aas (ex. Opera/Cisco/Vivaldi, cur Turtlesec) about ethics in the IT industry. As a person who is familiar with privacy issues and the many different threats of companies abusing user data, I have to say that this talk started an avalanche of different thoughts and reflections in my mind. To my surprise, I was not the only one to have such thoughts. This topic arose quite often during the rest of the conference in many conversations between different people. For those of you who haven't seen it yet, I highly recommend that you do. The key takeaway is that we, the people who are building today's digital world, need to think about the implications of our work and decisions for the users of our services. This topic is getting more complicated even as we think about it. However, Patricia came with the strategy "Annoying as a Service", which can simply be used in every situation to at least not make things worse...

Conference Talks

During the first day, there were a couple of interesting talks about NetBSD: "Improving modularity of NetBSD compat code", and mine, on "Fuzzing NetBSD Filesystems" [+ "Taking NetBSD kernel bug roast to the next level: Kernel Fuzzers (quick A.D. 2019 overview)" by Kamil Rytarowski]. As it turned out, there was another interesting talk about the foundations of kernel fuzzing by Andrew Turner, in which he presented the connection between sanitizers, tracing modes and fuzzers. After the break, I attended the excellent talk "7th Edition Unix at 40" by Warner Losh -- if you love the history of Unix, this is a must-see. The first day finished with the social mixer.

The second day started with one of my favourites of the entire conference: "Kernel TLS and TLS hardware offload" by Drew Gallatin and Hans Petter Selasky. In another room there was also a very interesting seminar on Rust for System Programmers. The next session, by the Netflix folks, was about NUMA optimizations in the FreeBSD network stack, another interesting talk about the usage of BSD as a high-speed CDN serving about 200Gbps of video content(!). After that, I attended the session on The Future of OpenZFS by Allan Jude, where he showed the progress made in the collaboration of different OSes on the ZFS filesystem. The last sessions I attended were "23 years of software side-channel attacks" by Colin, and the last one before the closing notes: "Unbound & FreeBSD: A true love story", by Pablo Carboni.

Highlights

Security: We can see clearly that the BSD community continues efforts for making BSDs more secure on various levels. This year we talked mostly about fuzzing, and in this area, it is impossible not to recognise NetBSD for great progress.
CDN use-case: Netflix contributions to FreeBSD make it a great system for CDN, year after year innovating and increasing the performance. I hope we will see more companies using BSDs as core for their CDN infrastructure.
ZFS: The filesystem has come a long way, despite being a project divided between different communities. Now thanks to the efforts of the developers, OpenZFS as a united community will be able to progress even faster and take advantage of projects that are using it. I believe the OpenZFS initiative is one of the most important steps taken by the community in many years.

Social Event

This year's social event took place in the Open Air Museum in Maihaugen, where we were able to see, preserved in excellent condition, parts of the Norwegian houses from the 19th century through the late 20th century. The fun part was that every house was open and you were able to go inside, some of them with people dressed up in the fashion of the same years, talking about the age. I very much enjoyed it, as it was a great opportunity to learn more about Norwegian culture and history.

The XX century city

XIX century school

Next Year!

The most important key point during closing notes is always: "where will the next EuroBSDCon take place?!" This year the guessing game was:

Beer will be cheaper than in Norway
[picture of Schnitzel]
Photo of...

Vienna!

Hope to see you all next year in Vienna!

LLVM sanitizers and GDB regression test suite

2019-09-03T15:00:09+00:00

Now that NetBSD-9 has been branched, I have been asked to finish the LLVM sanitizer integration. This work is now done: with the MKLLVM=yes build option (off by default), the distribution will be populated with LLVM files for ASan, TSan, MSan, UBSan, libFuzzer, SafeStack and XRay.

I have also transplanted the base system GDB patches to my GDB repository and managed to run the GDB regression test suite.

NetBSD distribution changes

I have enhanced and imported my local MKSANITIZER code, which makes sanitization of the whole distribution possible. A few real bugs were fixed and a number of patches were newly written to reflect the current state of the NetBSD sources. I have also merged another chunk of the fruits of the GSoC-2018 project on fuzzing the userland (by plusun@).

The following changes were committed to the sources:

ab7de18d0283 Cherry-pick upstream compiler-rt patches for LLVM sanitizers
966c62a34e30 Add LLVM sanitizers in the MKLLVM=yes build
8367b667adb9 telnetd: Stop defining the same variables concurrently in bss and data
fe72740f64bf fsck: Stop defining the same variable concurrently in bss and data
40e89e890d66 Fix build of t_ubsan/t_ubsanxx under MKSANITIZER
b71326fd7b67 Avoid symbol clashes in tests/usr.bin/id under MKSANITIZER
c581f2e39fa5 Avoid symbol clashes in fs/nfs/nfsservice under MKSANITIZER
030a4686a3c6 Avoid symbol clashes in bin/df under MKSANITIZER
fd9679f6e8b1 Avoid symbol clashes in usr.sbin/ypserv/ypserv under MKSANITIZER
5df2d7939ce3 Stop defining _rpcsvcdirty in bss and data
5fafbe8b8f64 Add missing extern declaration of ib_mach_emips in installboot
d134584be69a Add SANITIZER_RENAME_CLASSES in bsd.prog.mk
2d00d9b08eae Adapt tests/kernel/t_subr_prf for MKSANITIZER
ce54363fe452 Ship with sanitizer/lsan_interface.h for GCC 7
7bd5ee95e9a0 Ship with sanitizer/lsan_interface.h for LLVM 7
d8671fba7a78 Set NODEBUG for LLVM sanitizers
242cd44890a2 Add PAXCTL_FLAG rules for MKSANITIZER
5e80ab99d9ce Avoid symbol clashes in test/rump/modautoload/t_modautoload with sanitizers
e7ce7ecd9c2a sysctl: Add indirection of symbols to remove clash with sanitizers
231aea846aba traceroute: Add indirection of symbol to remove clash with sanitizers
8d85053f487c sockstat: Add indirection of symbols to remove clash with sanitizers
81b333ab151a netstat: Add indirection of symbols to remove clash with sanitizers
a472baefefe8 Correct the memset(3)'s third argument in i386 biosdisk.c
7e4e92115bc3 Add ATF c and c++ tests for TSan, MSan, libFuzzer
921ddc9bc97c Set NOSANITIZER in i386 ramdisk image
64361771c78d Enhance MKSANITIZER support
3b5608f80a2b Define target_not_supported_body() in TSan, MSan and libFuzzer tests
c27f4619d513 Avoids signedness bit shift in db_get_value()
680c5b3cc24f Fix LLVM sanitizer build by GCC (HAVE_LLVM=no)
4ecfbbba2f2a Rework the LLVM compiler_rt build rules
748813da5547 Correct the build rules of LLVM sanitizers
20e223156dee Enhance the support of LLVM sanitizers
0bb38eb2f20d Register syms.extra in LLVM sanitizer .syms files

Almost all of the mentioned commits were backported to NetBSD-9 and will land in 9.0.

As a demo, I have written an article on combining RUMPKERNEL and MKSANITIZER with the honggfuzz fuzzer: Rumpkernel assisted fuzzing of the NetBSD file system kernel code in userland.

GDB

I've merged NetBSD distribution downstream GDB patches into my local GDB tree and executed the regression tests (check-gdb):

[...]
Test run by kamil on Mon Sep  2 12:36:03 2019
Native configuration is x86_64-unknown-netbsd9.99

                === gdb tests ===

Schedule of variations:
    unix

[...]
                === gdb Summary ===

# of expected passes            54591
# of unexpected failures        3267
# of expected failures          35
# of unknown successes          3
# of known failures             59
# of unresolved testcases       29
# of untested testcases         141
# of unsupported tests          399

Full log is here.

This means that there are a lot more tests and known failures than on 2017-09-05:

$ uname -a
NetBSD chieftec 8.99.2 NetBSD 8.99.2 (GENERIC) #0: Sat Sep  2 22:55:29 CEST 2017  root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64

Test run by kamil on Tue Sep  5 17:06:28 2017
Native configuration is x86_64--netbsd

                === gdb tests ===

Schedule of variations:
    unix

[...]
                === gdb Summary ===

# of expected passes            16453
# of unexpected failures        483
# of expected failures          9
# of known failures             28
# of unresolved testcases       17
# of untested testcases         41
# of unsupported tests          25

There are some actual regressions, plus a set of tests that probably fail due to environment differences, such as gfortran not being installed.

Full log is here

GSoC Mentoring

The Google Summer of Code programme has reached its end. My mentees successfully wrote their final reports.

I'm also mentoring the AFL+KCOV work by Maciej Grochowski. Maciej will visit EuroBSDCon-2019 and speak about his work.

Add methods for setting and getting the thread name

I've reached out to people from the standards bodies and I'm working on defining a standard approach for setting and getting the thread name. I have received a proper ID for my proposal and I'm now supposed to submit the text in either PDF or HTML format.

This change will make it possible to manage the thread name through a uniform interface on all conforming platforms.

Plan for the next milestone

Keep enhancing GDB support. Keep detecting ptrace(2) bugs and addressing them.

This work was sponsored by The NetBSD Foundation.

http://netbsd.org/donations/#how-to-donate

Enhancing Syzkaller Support for NetBSD, Part 3

2019-08-27T19:37:58+00:00

Prepared by Siddharth Muralee(@R3x) as a part of Google Summer of Code’19

As a part of Google Summer of Code’19, I am working on improving the support for the Syzkaller kernel fuzzer. Syzkaller is an unsupervised coverage-guided kernel fuzzer that supports a variety of operating systems, including NetBSD.

You can take a look through the first report to see the initial changes that we made and you can look at the second report to read about the initial support we added for fuzzing the network stack.

This report details the work done during the final coding period where the target was to improve the support for fuzzing the filesystem stack.

Filesystem fuzzing is a relatively less explored area. Syzkaller itself only has filesystem fuzzing support for Linux.

Analysis of the existing Linux setup

Filesystems are a more complex fuzzing target than standalone system calls. To fuzz a filesystem we use a standard operation such as mount, which takes a system call argument vector plus a binary image of the filesystem itself. While normal syscall inputs are generally a few bytes long, real-world filesystem images are on the order of gigabytes or larger; for fuzzing, however, a minimal image on the order of kilobytes to megabytes can be used. Syzkaller uses a technique called mutational fuzzing, in which it mutates random parts of the input (according to specified guidelines), so a large input causes delays due to higher I/O time.

Syzkaller deals with large images by breaking them down into the non-zero chunks of the filesystem image. It extracts the non-zero chunks and their offsets, stores them as separate segments, and just before execution writes each chunk back to its offset, regenerating the new or modified image.
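
The chunking idea can be illustrated with a small C sketch. This is not syzkaller's code (syzkaller is written in Go and has its own representation); struct segment, extract_segments() and rebuild_image() are hypothetical names, and the sketch assumes that a "chunk" is simply a maximal run of non-zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One run of non-zero bytes inside the image (hypothetical layout). */
struct segment {
    size_t offset;  /* where the run starts in the image */
    size_t length;  /* number of bytes in the run */
};

/*
 * Record every maximal run of non-zero bytes in `image` and pack the
 * runs' bytes back to back into `data` (which must hold `size` bytes).
 * Returns the number of runs, or (size_t)-1 if `segs` is too small.
 */
size_t extract_segments(const unsigned char *image, size_t size,
                        struct segment *segs, size_t max_segs,
                        unsigned char *data)
{
    size_t count = 0, used = 0, i = 0;

    assert(image != NULL && segs != NULL && data != NULL);
    while (i < size) {
        if (image[i] == 0) {            /* zero bytes are not stored */
            i++;
            continue;
        }
        size_t start = i;
        while (i < size && image[i] != 0)
            i++;
        if (count == max_segs)
            return (size_t)-1;
        segs[count].offset = start;
        segs[count].length = i - start;
        memcpy(data + used, image + start, i - start);
        used += i - start;
        count++;
    }
    return count;
}

/*
 * Regenerate an image: zero-fill `out`, then copy each (possibly
 * mutated) run from `data` back to its recorded offset.
 * Returns 0 on success, -1 if a run would fall outside the image.
 */
int rebuild_image(unsigned char *out, size_t size,
                  const struct segment *segs, size_t nsegs,
                  const unsigned char *data)
{
    memset(out, 0, size);
    for (size_t k = 0; k < nsegs; k++) {
        if (segs[k].offset > size || segs[k].length > size - segs[k].offset)
            return -1;
        memcpy(out + segs[k].offset, data, segs[k].length);
        data += segs[k].length;
    }
    return 0;
}
```

A fuzzer following this scheme would only need to store and mutate the packed bytes in data, which for a mostly empty image is far smaller than the image itself.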

Porting it to NetBSD

As an initial step towards filesystem fuzzing, we decided to port the existing Linux approach of creating random segments to NetBSD. There are a few differences between the mounting process on the two operating systems, the most significant being the difference in the arguments to mount(2).

Linux:

int mount(const char *source, const char *target, const char *filesystemtype, unsigned long mountflags, const void *data);

The data argument is interpreted by the different filesystems. Typically it is a string of comma-separated options understood by the filesystem in question. mount(8) lists the possible options for each filesystem.

Possible options for the XFS filesystem in Linux:

    wsync, noalign, swalloc, nouuid, mtpt, grpid, nogrpid, bsdgroups,
    sysvgroups, norecovery, inode64, inode32, ikeep, noikeep,
    largeio, nolargeio, attr2, noattr2, filestreams, quota,
    noquota, lazytime, nolazytime, usrquota, grpquota, prjquota,
    uquota, gquota, pquota, uqnoenforce, gqnoenforce, pqnoenforce,
    qnoenforce, discard, nodiscard, dax, barrier, nobarrier, logbufs,
    biosize, sunit, swidth, logbsize, allocsize, logdev, rtdev

NetBSD:

int mount(const char *type, const char *dir, int flags, void *data, size_t data_len);

The argument data describes the file system object to be mounted, and is data_len bytes long. data is a pointer to a structure that contains the type specific arguments to mount.

For FFS (one of the most common filesystems on NetBSD), the arguments look like:

struct ufs_args {
        char      *fspec;   /* block special file to mount */
};
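
To make the difference concrete, here is a minimal sketch of how a caller might fill struct ufs_args and pass it to the NetBSD mount(2). The helpers fill_ffs_args() and mount_ffs_readonly() are hypothetical and not part of syzkaller or NetBSD. On systems other than NetBSD the sketch uses a stand-in struct that mirrors only the field shown above and skips the actual mount call, so it is an illustration rather than a portable implementation.

```c
#include <errno.h>
#include <stddef.h>

#ifdef __NetBSD__
#include <sys/param.h>
#include <sys/mount.h>
#include <ufs/ufs/ufsmount.h>   /* struct ufs_args */
#else
/* Stand-in so the sketch compiles elsewhere; mirrors the field above. */
struct ufs_args {
    char *fspec;                /* block special file to mount */
};
#endif

/* Fill the FFS-specific argument block (hypothetical helper). */
void fill_ffs_args(struct ufs_args *args, char *device)
{
    args->fspec = device;
}

/*
 * Mount `device` read-only on `dir` (hypothetical helper).
 * Returns the result of mount(2) on NetBSD; elsewhere returns -1
 * with errno set to ENOSYS, because the call below uses the
 * NetBSD form of mount(2) with an explicit data length.
 */
int mount_ffs_readonly(char *device, const char *dir)
{
    struct ufs_args args;

    fill_ffs_args(&args, device);
#ifdef __NetBSD__
    /* data points at the struct; data_len is its size in bytes. */
    return mount(MOUNT_FFS, dir, MNT_RDONLY, &args, sizeof(args));
#else
    (void)dir;
    errno = ENOSYS;
    return -1;
#endif
}
```

Unlike the Linux option string, the fuzzer here has to produce a correctly laid out binary structure, which is why the mount description needs to be written separately for NetBSD.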

Currently, we have a pseudo syscall syz_mount_image which does the job of writing the mutated chunks of the filesystem into a file based on their offsets and later configuring the loop device using vndconfig(8) and mounting the filesystem image using mount(8).

Analysis of the current approach

One way to create mountable filesystems is to convert an existing filesystem image into a syzkaller pseudo grammar representation and then add it to the corpus so that syzkaller uses it for mutation and we have a proper image.

Some of the noted issues with the syzkaller approach (as noted in "Fuzzing File Systems via Two-Dimensional Input Space Exploration"):

Lack of metadata knowledge - This may lead to corruption of filesystem specific aspects such as checksums.

Lack of Context awareness - Syzkaller isn't aware of the status of the filesystem image after a few operations are performed on it.

Steps Forward

We also spent some time researching possible options to solve the existing issues and developing an approach that would give us better results.

Image mutator approach

One possible way forward is to use a seed image (a working filesystem image) and write a mutator that is aware of all the metadata in the image. The mutator should also be able to recreate metadata components such as the checksum, so that the image remains mountable.

An existing implementation of such a mutator is JANUS which is a filesystem mutator written for Linux with inspiration from fsck.
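
The difference between a blind mutation and a metadata-aware one can be shown on a toy image. The sketch below is not how JANUS or any real filesystem works: the layout (a 4-byte little-endian checksum followed by payload), the polynomial checksum, and all function names are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HDR_SIZE 4  /* first 4 bytes hold the checksum (toy layout) */

/* Polynomial checksum over the payload; a stand-in, not a real FS checksum. */
uint32_t toy_checksum(const unsigned char *img, size_t size)
{
    uint32_t sum = 0;

    for (size_t i = TOY_HDR_SIZE; i < size; i++)
        sum = sum * 31u + img[i];
    return sum;
}

/* Recompute the checksum and store it little-endian in the header. */
void toy_store_checksum(unsigned char *img, size_t size)
{
    assert(size >= TOY_HDR_SIZE);
    uint32_t sum = toy_checksum(img, size);

    for (int b = 0; b < TOY_HDR_SIZE; b++)
        img[b] = (unsigned char)(sum >> (8 * b));
}

/* Return 1 if the stored checksum matches the payload, 0 otherwise. */
int toy_verify(const unsigned char *img, size_t size)
{
    uint32_t stored = 0;

    assert(size >= TOY_HDR_SIZE);
    for (int b = 0; b < TOY_HDR_SIZE; b++)
        stored |= (uint32_t)img[b] << (8 * b);
    return stored == toy_checksum(img, size);
}

/* Blind mutation: flip bits in one payload byte, leave the checksum stale. */
void mutate_blind(unsigned char *img, size_t size, size_t pos,
                  unsigned char mask)
{
    assert(size >= TOY_HDR_SIZE && pos >= TOY_HDR_SIZE && pos < size);
    img[pos] ^= mask;
}

/* Metadata-aware mutation: flip the bits, then repair the checksum. */
void mutate_aware(unsigned char *img, size_t size, size_t pos,
                  unsigned char mask)
{
    mutate_blind(img, size, pos, mask);
    toy_store_checksum(img, size);
}
```

With a non-zero mask, an image changed by mutate_blind() no longer passes toy_verify(), so a kernel that checks the checksum would reject it before any deeper code is exercised; the same change made through mutate_aware() still verifies.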

Grammar based approach

Syzkaller uses a pseudo-formal grammar for representing arguments to syscalls. This grammar can also be adapted to generate proper filesystem images.

Writing a grammar to represent a filesystem image is quite a daunting task and we are not yet sure whether it is possible, but it is the approach that we plan to take for now.

Proper documentation detailing the structure of a filesystem image is rather scarce which has led me to actually go through filesystem code to figure out the type, uses and limits of a certain filesystem image. This data then has to be converted to syzkaller representation to be used for fuzzing.

One advantage of writing a grammar that would be able to generate mountable images is that we would be able to get more coverage than fuzzing with a seed image, since we are also creating new images instead of just mutating the same image.
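
As a rough illustration of the idea of generating images from a description of fields and their limits, here is a C sketch. It does not use syzkaller's description language, and it does not describe FFS: struct field_desc, generate_header(), the field names and the ranges used with it are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One 32-bit field of a hypothetical on-disk header and its legal range. */
struct field_desc {
    const char *name;
    size_t offset;      /* byte offset of the field in the header */
    uint32_t min;       /* smallest valid value */
    uint32_t max;       /* largest valid value */
};

/* Small xorshift PRNG so a given seed always yields the same header. */
static uint32_t next_rand(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Store and load a 32-bit value in little-endian byte order. */
void put_le32(unsigned char *p, uint32_t v)
{
    for (int b = 0; b < 4; b++)
        p[b] = (unsigned char)(v >> (8 * b));
}

uint32_t get_le32(const unsigned char *p)
{
    uint32_t v = 0;

    for (int b = 0; b < 4; b++)
        v |= (uint32_t)p[b] << (8 * b);
    return v;
}

/*
 * Zero the header, then give every described field a pseudo-random
 * value inside its [min, max] range. Returns 0 on success, or -1 if
 * a field has an empty range or does not fit in the header.
 */
int generate_header(unsigned char *hdr, size_t size,
                    const struct field_desc *fields, size_t nfields,
                    uint32_t seed)
{
    uint32_t state = seed ? seed : 1;   /* xorshift state must be non-zero */

    memset(hdr, 0, size);
    for (size_t k = 0; k < nfields; k++) {
        const struct field_desc *f = &fields[k];

        if (f->min > f->max || f->offset > size || size - f->offset < 4)
            return -1;
        uint32_t span = f->max - f->min;
        uint32_t v = (span == UINT32_MAX)
            ? next_rand(&state)
            : f->min + next_rand(&state) % (span + 1);
        put_le32(hdr + f->offset, v);
    }
    return 0;
}
```

Every seed produces a different header that still respects the described limits, which is the property that would let a generator reach more code than repeated mutation of a single seed image. A real description would also have to express relations between fields, which this sketch leaves out.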

I am currently working on learning the internals of FFS and trying to write a grammar definition which can properly generate filesystem images.

Miscellaneous Work

Meanwhile, I have also been working in parallel on improving the existing state of Syzkaller.

Add kernel compiled with KUBSAN for fuzzing

So far we have only used a kernel compiled with KCOV and KASAN for fuzzing with syzkaller. We decided to also add support for syzkaller building a kernel with KUBSAN and KCOV. This gives us another dimension in the fuzzing process.

This required some changes in the build config. We had to remove the hardcoded kernel config and add support for building a kernel with a config passed to the fuzzer. This move would also help us to easily add support for upcoming sanitizers such as KMSAN.

Improve syscall descriptions

Improving system call descriptions is constant, ongoing work. I recently added support for fuzzing syscalls such as mount, fork and posix_spawn.

We are also planning to add support for fuzzing device drivers soon.

Relevant Links

Syzkaller Dashboard for NetBSD

Syzkaller repository on Github

NetBSD docs on setting up syzkaller

GSoC'19 proof of work repository

Summary

We have managed to meet most of the goals that we had planned for the GSoC project. Overall, I have had a wonderful summer with the NetBSD Foundation and I look forward to working with them to complete the project.

Last but not least, I want to thank my mentors, @kamil and @cryo, for their useful suggestions and guidance. I also thank Maciej for his insight and guidance, which were fundamental during the course of the project. I would also like to thank Dmitry Vyukov of Google for helping with any issues I faced with Syzkaller. Finally, thanks to Google for giving me the chance to work with the NetBSD community.