This intends to be a quick reference style guide for writing code following the Linux kernel guidelines.

You need to follow these guidelines, otherwise, the Linux kernel community may reject your code or make you rewrite it. A uniform coding style is enforced to ensure simplified maintainership once the creator is not interested in it any more.

General stuff

  • Many of this recommendations steam from personal preference and experience, so don't take them as pure gospel. Use common sense.
  • Read `linux/Documentation/Submit*`; they contain most of the current community practice.

  • If you are modifying other's code, try to adapt to their style. If they style conflicts with these rules, you might be asked to change the old style to conform.
  • All files need a header with a copyright, contact information, license header, and basic documentation that shortly describes what the code does, entry point gives which are the entry points to the code in the file and a quick code flow description.
  • Document all functions, structs and constants using kernel-doc unless they are totally trivial. Format at kernel-doc. In a nutshell:

    /**
     * FUNCTIONORSTRUCTNAME - short action description
     *
     * @i2400m: description
     * @buf: description
     * ...
     *
     * Functional description
     */
    
  • Maximum line length is 80 chars. Period. Soft limit should be at 70, but never exceed 80.

  • Write code like this (pay attention to the spacing):
    static i1480_foo_bar_isolate(struct i1480 *i1480)
    {
            int isolated;
            struct resource *resource;
            ...
            resource = someresource_get(i1480, blah);
            if (unlikely(IS_ERR(resource))) {
                    result = PTR_ERR(resource);
                    goto error_resource_get;
            }
            isolated = i1480_isolation_get(i1480);
            if (isolated < 0) {
                    dev_err(i1480->dev, "cannot get isolation status: %d\n",
                            result);
                    goto error_isolation_get;
            }
            ....
            seed = i1480_seed_create();
            ....
            return 0;
    
    ...
    error_isolation_get:
            someresource_put(resource);
    error_resource_get:
    ...
            return result;
    }
    
    In a nutshell:
    • a function that returns error undoes all it did
    • check all your return values for errors
    • make sure your error messages make it easy to pinpoint where the error happened
    • put all the error handling off-line
    • follow religiously the spacing, tab size (8) and hard line length limits
  • This is wrong:
    void *ptr = &(somestruct->member);
    
    Go back and relearn your operator priority lessons; by inserting extra parenthesis you are only making the code harder to read. That code should look like:
    void *ptr = &somestruct->member;
    

Naming and namespace

  • Names in caps are reserved only for #defined constants, other constants or macros

  • Use i_can_read_this instead of icantreadthis or ICantReadThis.

  • Break up files by functionality; not too big, not too small. 300k is definitely too big. 80k is too. Some people might disagree with this, it is personal preference.
  • Do not pollute the global namespace. Set your functions and global data to be static unless it has to be shared with other files. Symbols needed by other modules should be EXPORT_SYMBOL_GPL() and of course, not static.

  • Name your functions, data types and globals consistently
    • prefix names (especially globals) with a common, short prefix (for example, for device drivers is easy just to get the device namd/number and come up with a short name; for example i1480.

    • for functions/macros/actions, use PREFIX_NAME_VERB or PREFIX_VERB_NAME and stick to that across your code. For example:

    i1480_fw_upload();
    i1480_fw_chunk_parse()
    

    NOTE: I personally prefer PREFIX_NAME_VERB, as PREFIX/NAME creates a naming hierarchy akin to subdirectories when you decompose your design in simpler and simpler objects.

  • 'char* buf' should be 'char *buf' for style consistency. The rationale behind this is that when you declare in a single line
    char* a, b, c;
    

    this actually means a is a char *. b and c are char. By going next to the var is more obvious, yadah, yadah...

Data types

  • Don't create your own constant set for something as obvious as success. If you have a i1480_SUCCESS symbol that means 0 and check against it all the time, your are automatically tagged as a windows developer, and you don't want that. For returning values from functions, this is the usual Linux kernel BKM:

    1. Return >= 0 for success; 0 is plain success; > 0 might be success with qualifiers (eg: meaning different kinds of success or returning a size or something).

    2. Return <0 for error and make the code fit the errno definitions in include/asm-generic/errno*.h. All the error checks will always be :

      if (result < 0)
               error_processing
      
      Again, the keys here are succinctness and easy to understand without getting extra context.
  • Avoid using typedef. structs, unions, etc, type has to be 'struct x'. Constructs such as:

    typedef struct something *something_handle;
    

    will be rejected. It is assumed that struct something; struct something *my_handle hides enough already while giving the user cues that he is not dealing with an integral type.

  • in structs, save some space when naming members; don't replicate the struct name for some; instead, use variable names for instances that reflect what is it; instead of:
    struct i1480 {
            int i1480_mode;
            ...
    };
    ...
    struct i1480 *priv = dev->priv;
    ...
    priv->i1480_mode = ACTIVE;
    ...
    
    use the easier to track around (without knowing way too many details)
    struct i1480 {
            int mode;
            ...
    };
    ...
    struct i1480 *i1480 = dev->priv;
    ...
    i1480->mode = i1480_ACTIVE;
    ...
    

Functions and Macros

  • Every function, unless it is absolutely obvious, has a function header describing what it does, the arguments and return values. Explain the intricacies of the implementation in the header, not in the code. The code should flow on its own.
  • Functions should be small and do a single thing well. Break them up in paragraphs. If it is longer than a couple of pages, consider breaking it up because it is too big already (sometimes this is impossible, and at this point is when you indicate why in the function header).
  • Avoid macros as much as possible, use static inline functions (they provide type checking).

  • If you need to use too forward declarations, something is wrong with the way you define your functions. Functions should follow a bottom up order, where the functions that another one calls are located above it in the file.

Checking into source control / producing / submitting patches

  1. Commits/changesets/patches need to be grouped up logically by functionality (easier for the reviewers). If you mix different things in a changeset, the reviewer needs to mentally split them and change context continuously. This makes the review more tedious and longer, which at the end is worse for you, as it takes longer to get it approved or rejected.

    See below for a simple method to break up changes when they get mixed up (as it always happens in normal development).

  2. Commits have to be small (easier for the reviewers)

  3. Commit messages always follow the same format:
    SUBSYTEM: short description of the changes
    
    Rationale for the changes (current status, why the change is needed, 
    how the new stuff is done) and a longer description of the changes if 
    needed.
    
    Note that documentation about how the thing works should be in the
    function documentation headers; it is ok to refer to it in the commit 
    message. Your commit message should not duplicate that.
    
    Signed-off-by: Author Name <email@address>
    
    • SUBSYSTEM: which part of the code this applies to (eg: net/tcp: TCP protocol). Give some name space to help understand to what part of the code the message applies.

    • short description: one liner, for the summary logs

    • RATIONALE: extremely important, no rationale, changeset rejected

    • Signed-off-by is used to track authorship and copyrights in the Linux kernel source. See Documentation/SubmittingPatches, line 12

  4. When possible (read as hardly an exception is granted) changesets have to build and work.

    Rationale: when something breaks and we use bisect to track which changeset introduced the problem, we'll need to test a range of changesets until we find the one that introduces the problem. If one of them fails to build or work, we can't tell if that one is the culprit.

Send patches inline. Attachments make it hard for would-be-reviewers to comment on your patch (by quoting it) and for mailing list archivers to keep them. Use tools such as git send-email to send your patches to the mailing list.

Breaking up changesets in logically related areas

When working normally, different unrelated areas get all mixed up; here are processes for breaking them down.

These methods are what I use and more or less work for me. If you have other methods or have suggestions, please post them!

Breaking up changesets with git

It is assumed that your current worktree has a set of changes spread all around the code.

  1. Clean the stash stack:
    $ git stash drop
    
  2. Save the current diff in a stash
    $ git stash save
    
    The work tree is reverted to the last commit and your modifications are kept in a stash.
  3. Save the diff to a file:
    $ git stash show -p > ../work.patch
    
  4. Edit the patch file (use an editor that understands patch files, so you can add/remove without breaking it). Remove chunks that are not related. When you have a set of diff chunks that are logically related, apply the patch:
    $ git apply --whitespace=fix ../work.patch
    
    Make sure it builds, test it, make sure it works. Run it through checkpatch
    $ git diff | ../somewhere/linux/scripts/checkpatch.pl --strict -
    
    fix the issues, run again...
  5. Commit
    $ git commit -avs 
    
  6. Apply the rest of the stash
    $ git stash apply
    
    • Fix any conflicts that might have arisen
  7. Create a work.patch again:
    $ git diff HEAD > ../work.patch
    
  8. Restore the worktree to the latest commit:
    $ git checkout -f
    
    go back to edit the patch (step 4). Rinse and repeat until no stash is left.

Breaking up changesets with mercurial

  1. Clone your working directory into a clean version without modifications
    $ hg clone worktree worktree.clean
    $ hg -R worktree.clean update -C        # Wipe all changes
    
  2. Generate a diff of the whole changeset (eg: changes.patch)

    $ hg -R worktree diff > changes.patch
    
  3. Edit the changes.patch file and remove all the chunks that don't pertain to a grouping of logically related changes. At the end, changes.patch should contain only modifications that are related to the same item.

  4. Import in the clean repository this patch:
    $ hg -R worktree.clean import changes.patch
    
  5. Now push that changeset from the clean repository into the work repository:
    $ hg -R worktree pull -u worktree.clean
    

Now that part of the modifications are merged; go to the original tree, update (hg -R worktree update) and start all over again in step #2. Repeat until the patch is left only with self-contained changes.

Utilities