GSoC 2019 Report: Adding NetBSD KNF to clang-format, Final


August 24, 2019 posted by Michał Górny

This report was prepared by Manikishan Ghantasala as a part of Google Summer of Code 2019

This is the third and final report of the project "Add KNF (NetBSD style) clang-format configuration", which I have been working on as a part of Google Summer of Code (GSoC) ’19 with NetBSD.

You can refer to the first and second reports here:

  1. Adding NetBSD KNF to clang-format, Part 1
  2. Adding NetBSD KNF to clang-format, Part 2

About the project

ClangFormat is a set of tools to format C/C++/Java/JavaScript/Objective-C/Protobuf code. It is built on top of LibFormat to support your workflow in a variety of ways, including a standalone tool called clang-format and editor integrations. It supports a few built-in coding styles: LLVM, Google, Chromium, Mozilla, and WebKit. When the desired code formatting style differs from the available options, the style can be customized using a configuration file. The aim of this project is to add NetBSD KNF support to clang-format, along with new style options in LibFormat that support NetBSD’s style of coding. This would allow us to format NetBSD code by passing `-style=NetBSD` as an argument.

How to use clang-format

While using clang-format one can choose a style from the predefined styles or create a custom style by configuring specific style options.

Use the following command if you are using one of the predefined styles:

clang-format <filename> -style=<Name of the style>

Configuring style with clang-format

clang-format supports two ways to provide custom style options: directly specify the style configuration in the -style= command line option, or use -style=file and put the style configuration in a .clang-format or _clang-format file in the project’s top directory.

Check Clang-Format Style Options to learn how the different style options work and how to use them.

When specifying configuration in the -style= option, the same configuration is applied for all input files. The format of the configuration is:

-style='{key1: value1, key2: value2, ...}'
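
As a rough illustration of the inline form, the following Python sketch builds a `-style=` argument from a dictionary and passes it to clang-format. The helper names `inline_style` and `format_file` are hypothetical, the option values are only examples taken from the NetBSD configuration shown later in this report, and the sketch assumes a `clang-format` binary is available on the PATH.

```python
import subprocess


def inline_style(options):
    """Render a dict as a clang-format inline style argument (hypothetical helper)."""
    def render(value):
        # YAML booleans are lowercase, unlike Python's True/False.
        return str(value).lower() if isinstance(value, bool) else str(value)

    body = ", ".join(f"{key}: {render(value)}" for key, value in options.items())
    return "-style={" + body + "}"


def format_file(path, options):
    """Run clang-format on `path` and return the formatted text.

    Assumes clang-format is installed. Passing an argument list means no
    shell quoting is needed around the braces.
    """
    result = subprocess.run(
        ["clang-format", inline_style(options), path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Only builds and prints the argument; format_file() is not called here.
print(inline_style({"BasedOnStyle": "LLVM", "IndentWidth": 8, "UseTab": "Always"}))
```

On a shell command line the same argument needs the single quotes shown above so that the braces and spaces reach clang-format as one word.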

The .clang-format file uses YAML format. An easy way to get a valid .clang-format file containing all configuration options of a certain predefined style is:

clang-format -style=llvm -dump-config > .clang-format

After making required changes to the .clang-format file it can be used as a custom Style by:

clang-format <filename> -style=file

Changes made to clang-format

The following changes were made to clang-format as a part of adding NetBSD KNF:

New Style options added:

  1. BitFieldDeclarationsOnePerLine
  2. AlignConsecutiveListElements

Modifications made to existing styles:

  1. Modified SortIncludes and IncludeCategories to support NetBSD-like includes.
  2. Modified SpacesBeforeTrailingComments to support block comments.

    The new NetBSD Style Configuration:

    This is the final configuration for clang-format, including the changes made to support NetBSD KNF.

        
            AlignTrailingComments: true
            AlwaysBreakAfterReturnType: All
            AlignConsecutiveMacros: true
            AlignConsecutiveLists: true
            BitFieldDeclarationsOnePerLine: true
            BreakBeforeBraces: Mozilla
            ColumnLimit: 80
            ContinuationIndentWidth: 4
            Cpp11BracedListStyle: false
            FixNamespaceComments: true
            IndentCaseLabels: false
            IndentWidth: 8
            IncludeBlocks: Regroup
            IncludeCategories:
             - Regex: '^<sys/param\.h>'
               Priority: 1
               SortPriority: 0
             - Regex: '^<sys/types\.h>'
               Priority: 1
               SortPriority: 1
             - Regex: '^<sys.*/'
               Priority: 1
               SortPriority: 2
             - Regex: '^<uvm/'
               Priority: 2
               SortPriority: 3
             - Regex: '^<machine/'
               Priority: 3
               SortPriority: 4
             - Regex: '^<dev/'
               Priority: 4
               SortPriority: 5
             - Regex: '^<net.*/'
               Priority: 5
               SortPriority: 6
             - Regex: '^<protocols/'
               Priority: 5
               SortPriority: 7
             - Regex: '^<(fs|miscfs|msdosfs|nfs|ufs)/'
               Priority: 6
               SortPriority: 8
             - Regex: '^<(x86|amd64|i386|xen)/'
               Priority: 7
               SortPriority: 8
             - Regex: '^<path'
               Priority: 9
               SortPriority: 11
             - Regex: '^<[^/].*\.h'
               Priority: 8
               SortPriority: 10
             - Regex: '^\".*\.h\"'
               Priority: 10
               SortPriority: 12   
            SortIncludes: true
            SpacesBeforeCpp11BracedList: true
            SpacesBeforeTrailingComments: 4
            TabWidth: 8
            UseTab: Always
    
        
    

    Status of each Style Option

    Styles Ready to Merge:

    1. Modified SortIncludes and IncludeCategories:

         Patch: https://reviews.llvm.org/D64695

    Styles needing revision

    1. BitFieldDeclarationsOnePerLine:

        Patch: https://reviews.llvm.org/D63062

        Bugs: 1 

    2. SpacesBeforeTrailingComments supports Block Comments:

        Patch: https://reviews.llvm.org/D65648

        Remark: I still need to discuss the cases in which block comments are used but spaces should not be added before them.

    WIP Style

    1. AlignConsecutiveListElements:

        Commit: https://github.com/sh4nnu/clang/commit/4b4cd45a5f3d211008763f1c0235a22352faa81e

        Bugs: 1

    About Styles

    BitFieldDeclarationsOnePerLine:

        Patch: https://reviews.llvm.org/D63062 

        This style lines up BitField declarations on consecutive lines with correct indentation.

    Input:
        
                unsigned int bas :3, hh : 4, jjj : 8;
                
                
                unsigned int baz:1,
                                fuz:5,
                    zap:2;
        
    
    Output:
    
                unsigned  int bas : 3, 
                            hh  : 4, 
                            jjj : 8;
                
                
                unsigned int baz:1,
                            fuz:5,
                            zap:2;
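
The intended transformation can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not the clang-format implementation: the function name `one_per_line` is hypothetical, it handles a single declaration with a simple type and numeric widths, it normalizes the spacing around the colon, and it indents continuation lines so that the names line up under the first declarator.

```python
def one_per_line(decl):
    """Put each bit-field declarator of one declaration on its own line."""
    parts = decl.strip().rstrip(";").split(",")
    fields = []
    for part in parts:
        left, width = part.split(":")
        # left is "type... name" for the first part and just "name" afterwards.
        fields.append((left.split(), width.strip()))

    type_name = " ".join(fields[0][0][:-1])
    names = [tokens[-1] for tokens, _ in fields]
    pad = max(len(name) for name in names)
    # Assumption: continuation lines start in the column of the first name.
    indent = " " * (len(type_name) + 1)

    lines = []
    for i, (name, (_, width)) in enumerate(zip(names, fields)):
        prefix = type_name + " " if i == 0 else indent
        end = ";" if i == len(fields) - 1 else ","
        lines.append(f"{prefix}{name.ljust(pad)} : {width}{end}")
    return "\n".join(lines)


print(one_per_line("unsigned int bas :3, hh : 4, jjj : 8;"))
```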
    
    

    Bug: Indentation breaks when block comments appear between the declarations.

    Input:

            
                        unsigned  int bas : 3, /* foo */
                                      hh  : 4, /* bar */
                                      jjj : 8;
                        
            
            

    Output:

            
                        unsigned  int bas : 3, /* foo */
                            hh  : 4, /* bar */
                            jjj : 8;
    
            
            

    Modification for SortIncludes and IncludeCategories:

        Patch: https://reviews.llvm.org/D64695

        Status: Accepted, and ready to land.

    Clang-format has a style option named SortIncludes, which sorts the includes in alphabetical order. The IncludeCategories style option allows us to define a custom order for sorting the includes.

    It supports POSIX extended regular expressions to assign categories to includes.

    SortIncludes then sorts the #includes first by increasing category number and then lexically within each category. When IncludeBlocks is set to Regroup, multiple #include blocks are merged together and sorted as one, then split into groups based on category priority.

    The problem arises when you want to define the order within each category, which is not supported. This modification adds a new, optional field named SortPriority.

    The #includes matching the regexes are sorted according to the values of SortPriority, and regrouping after sorting is done according to the values of Priority. If SortPriority is not defined, it defaults to the value of Priority.

    Example
        
            IncludeCategories:
             - Regex: '^<c/'
               Priority: 1
               SortPriority: 0

             - Regex: '^<(a|b)/'
               Priority: 1
               SortPriority: 1

             - Regex: '^<(foo)/'
               Priority: 2

             - Regex: '.*'
               Priority: 3
               
    
        
    
    Input
        
                #include "exe.h"
                #include <a/dee.h>
                #include <foo/b.h>
                #include <a/bee.h>
                #include <exc.h>
                #include <b/dee.h>
                #include <c/abc.h>
                #include <foo/a.h>
        
    
    Output
        
                #include <c/abc.h>
                #include <a/bee.h>
                #include <a/dee.h>
                #include <b/dee.h>
                
                #include <foo/a.h>
                #include <foo/b.h>
                
                #include <exc.h>
                #include "exe.h"
    
        
    

    As the example above shows, the #includes are grouped according to Priority and sorted according to SortPriority. This patch does not affect existing configurations: if SortPriority is not defined, SortIncludes behaves exactly as before.
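
The two-step ordering can be modelled in a short Python sketch. It is an illustration of the idea rather than the code in the patch: the names `CATEGORIES`, `classify` and `regroup` are hypothetical, Python's `re` module stands in for POSIX extended regular expressions, the first regex is assumed to be meant as `'^<c/'`, and ties are broken by the header path without its delimiters (an assumption chosen so that the sketch reproduces the output shown above).

```python
import re
from itertools import groupby

# (regex, Priority, SortPriority); None means "SortPriority not given".
CATEGORIES = [
    (r"^<c/", 1, 0),
    (r"^<(a|b)/", 1, 1),
    (r"^<(foo)/", 2, None),
    (r".*", 3, None),
]


def classify(header):
    """Return (Priority, SortPriority) from the first matching category."""
    for regex, priority, sort_priority in CATEGORIES:
        if re.search(regex, header):
            # SortPriority falls back to Priority when it is not defined.
            return priority, priority if sort_priority is None else sort_priority
    raise ValueError(f"no category matches {header}")


def regroup(lines):
    """Sort headers by SortPriority, then split them into blocks by Priority."""
    headers = [line.split(None, 1)[1] for line in lines]
    # Assumption: ties are broken by the path without its <> or "" delimiters.
    ordered = sorted(headers, key=lambda h: (classify(h)[1], h.strip('<>"')))
    return [list(block) for _, block in groupby(ordered, key=lambda h: classify(h)[0])]


INCLUDES = [
    '#include "exe.h"', "#include <a/dee.h>", "#include <foo/b.h>",
    "#include <a/bee.h>", "#include <exc.h>", "#include <b/dee.h>",
    "#include <c/abc.h>", "#include <foo/a.h>",
]

for block in regroup(INCLUDES):
    print("\n".join("#include " + header for header in block), end="\n\n")
```

Because `<c/...>` and `<a|b/...>` share Priority 1 but differ in SortPriority, they land in the same block with `<c/abc.h>` first, which plain Priority-based sorting could not express.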

    Refer to Report 2 for detailed examples on this.

    Modification for SpacesBeforeTrailingComments

        Patch: https://reviews.llvm.org/D65648

    SpacesBeforeTrailingComments, which previously applied only to line comments, is modified to support block comments as well. It was limited to line comments because block comments have different usage patterns and different exceptional cases. I have tried to exclude the cases where existing tests do not expect spaces before block comments, and I have been discussing with the community which cases should be included and which should be excluded.

    Cases that were excluded due to failing tests:

    • If it is a Preprocessor directive,
    • If it is followed by a LeftParenthesis
    • And if it is after a Template closer
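
Read as a predicate, the exclusions above might look like the following Python sketch. The `BlockComment` record and its field names are hypothetical and do not correspond to clang-format's internal token representation; the sketch only restates the three excluded cases.

```python
from dataclasses import dataclass


@dataclass
class BlockComment:
    """Hypothetical context of a block comment; not clang-format's token model."""
    in_preprocessor_directive: bool = False
    followed_by_left_paren: bool = False
    after_template_closer: bool = False


def wants_spaces_before(comment: BlockComment) -> bool:
    """Return True unless the comment falls into one of the excluded cases."""
    excluded = (
        comment.in_preprocessor_directive
        or comment.followed_by_left_paren
        or comment.after_template_closer
    )
    return not excluded


# An ordinary trailing block comment gets the spaces; an excluded one does not.
print(wants_spaces_before(BlockComment()))
print(wants_spaces_before(BlockComment(followed_by_left_paren=True)))
```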

    AlignConsecutiveListElements

        Status: Work In Progress

    This is a new style that aligns the elements of consecutive lists in a nested list. The style is still a work in progress: a few cases still need to be covered and a few bugs need to be fixed.

    Input:
        
                keys[] = {
                    {"all", f_all, 0 },
                    { "cbreak", f_cbreak, F_OFFOK },
                    {"cols", f_columns, F_NEEDARG },
                    { "columns", f_columns, F_NEEDARG },
                 };
        
    
    Output:
        
                keys[] = { { "all",      f_all,        0 },
                           { "cbreak",   f_cbreak,     F_OFFOK },
                           { "cols",     f_columns,    F_NEEDARG }, 
                           { "columns",  f_columns,    F_NEEDARG },
                         };
    
        
    
    Work to be done:

    This style option aligns list declarations that are nested inside a list. I would also like to extend it to align consecutive single-line list declarations.

    The difficulty with this extension is that the individual lists can have different numbers of elements.

    Example:
        
                keys[] =  { "all",        f_all,        0 };
                keys2[] = { "cbreak",     f_cbreak,     F_OFFOK };
                keys3[] = { "cols",       f_columns,    F_NEEDARG,     7 };
                keys4[] = { "columns",    f_columns };
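
One possible way to handle lists of different lengths is to compute each column's width only from the rows that have a following element in that column. The Python sketch below illustrates that idea. It is not the approach taken in the work-in-progress commit, the function name `align_rows` is hypothetical, every row is assumed to be non-empty, and `f_cbreak` is used on the assumption that it is the intended identifier.

```python
def align_rows(rows):
    """Pad every element except the last of each row to a shared column width."""
    widths = {}
    for row in rows:
        for col, cell in enumerate(row[:-1]):
            # +1 accounts for the comma that follows the element.
            widths[col] = max(widths.get(col, 0), len(cell) + 1)

    lines = []
    for row in rows:
        cells = [(cell + ",").ljust(widths[col]) for col, cell in enumerate(row[:-1])]
        cells.append(row[-1])  # the last element is never padded
        lines.append("{ " + " ".join(cells) + " }")
    return lines


ROWS = [
    ['"all"', "f_all", "0"],
    ['"cbreak"', "f_cbreak", "F_OFFOK"],
    ['"cols"', "f_columns", "F_NEEDARG", "7"],
    ['"columns"', "f_columns"],
]

for line in align_rows(ROWS):
    print(line)
```

Ignoring the last element of each row when measuring keeps a short row from widening a column it does not really occupy, while longer rows still line up with each other.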
        
    

    Future Work

    Some of the style options introduced during this GSoC were designed to cover all the cases in NetBSD KNF, so they may need revision with respect to the other languages and coding styles that clang-format supports. I will continue working on this project after the GSoC period: finishing the style options that are yet to be merged, adding new style options if necessary, and getting the NetBSD style merged upstream, which is the final deliverable of the project. I would also like to take responsibility for maintaining the “NetBSD KNF” support in clang-format.

    Summary

    Even though the GSoC ’19 coding period is officially over, I look forward to continuing to contribute to this project. This summer had me digging deep into the Clang and NetBSD code for references while creating or modifying the style options. I am very interested in working with NetBSD again. I like being part of the community, and I would like to improve my skills and learn more about operating systems by contributing to this organisation.

    I would like to thank my mentors Michal and Christos for their constant support and patient guidance. A huge thanks to both the NetBSD and LLVM communities, who have been supportive and have helped me whenever I had trouble. Finally, a huge thanks to Google for providing me with this opportunity.


     



    Comments:

    Not sure what exactly is causing this but the entry for this post in the RSS feed, is causing `newsboat` to get stuck at 100% CPU usage. Using a feed validator may help. Cheers, rjc

    Posted by rjc on August 27, 2019 at 12:21 PM UTC
