Sofien Mzabi


Graduated from Ecole Centrale Paris
Fullstack software engineer | GitGuardian | PQL team Tech Lead

Introduction

Most developers hate doing things that could be automated.

As emphasized in this tweet, we often have to accept that some tasks cannot be automated. Fortunately, in the case of code reviews, a lot of things can. As my previous CTO once told me:

Life is too short to review spaces!

Between the fingers of the developers and the eyes of the reviewers there are two main steps where this automated review process can be done:

  • as pre-commit hooks
  • as CI jobs

In this article, we’ll focus on the pre-commit step. We’ll see how to install and set up pre-commit hooks, and we’ll list the top 8 hooks we use at GitGuardian.

How to set up commit hooks

According to the Git book, Git hooks are a way to fire off custom scripts when certain important actions occur.

In the case of pre-commit hooks, as their name suggests, the scripts run just before the commit is created, allowing us to block it if it doesn’t meet our requirements. The main advantage of running scripts at this step is that they can detect problems before they even enter the version control system, letting us fix them easily, or even have them fixed automatically.
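
To make the mechanism concrete, here is a minimal sketch of a hand-written hook, assuming it is saved as the executable file .git/hooks/pre-commit. The check itself (rejecting trailing whitespace) and the function names are illustrative choices for this sketch, not part of GitGuardian's setup. Git aborts the commit when the hook exits with a non-zero status.

```python
#!/usr/bin/env python3
"""Illustrative pre-commit hook: reject staged files with trailing whitespace."""
import subprocess
import sys


def lines_with_trailing_whitespace(text):
    """Return the 1-based numbers of the lines ending with a space or a tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def staged_files():
    """List the files that are added, copied or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def main():
    """Return 1 (commit blocked) if any staged file has trailing whitespace."""
    status = 0
    for path in staged_files():
        # "git show :<path>" prints the staged version of the file.
        data = subprocess.run(
            ["git", "show", f":{path}"], capture_output=True, check=True
        ).stdout
        if b"\0" in data:  # crude binary-file detection, skip those
            continue
        bad_lines = lines_with_trailing_whitespace(data.decode("utf-8", "replace"))
        for number in bad_lines:
            print(f"{path}:{number}: trailing whitespace")
            status = 1
    return status


# When saved as .git/hooks/pre-commit, end the file with: sys.exit(main())
```

Writing and sharing such scripts by hand quickly becomes tedious, which is the problem a hook manager solves.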

At GitGuardian we use pre-commit, a multi-language package manager for pre-commit hooks that is itself written in Python. It makes it really easy to install and share hooks across our organization. There are good alternatives written in other languages, such as husky in JavaScript.

To set it up:

  1. Add pre-commit in your requirements.txt or in your Pipfile (in dev section).
  2. Add a pre-commit configuration file .pre-commit-config.yaml with the list of hooks you want. Here is an example from the documentation:
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
    -   id: check-yaml
    -   id: end-of-file-fixer
    -   id: trailing-whitespace
-   repo: https://github.com/psf/black
    rev: 21.12b0
    hooks:
    -   id: black

You can find a list of common hooks here.

  3. Run pre-commit install in your Python env.

Here is a video showing how to install pre-commit with pip and, as an example, how to install a GitGuardian hook.

That’s it! From now on, every time you run git commit, all the hooks will run.

The pre-commit hooks we use at GitGuardian

Let’s begin with the formatting hooks. As the title of this article suggests, the last thing we want when reviewing code is to tire ourselves out by focusing on formatting. This is why we installed the following hooks:

flake8

  - repo: https://github.com/PyCQA/flake8
    rev: 4.0.1
    hooks:
      - id: flake8
        args: [--config, backend/setup.cfg]
        additional_dependencies: [ggflake8==1.2.1]

flake8 parses the modified Python files to make sure that the PEP 8 guidelines are followed, and blocks the commit if they are not. On top of it, we developed our own flake8 plugin, named ggflake8, to enforce a set of custom rules such as:

  • all functions of 20 lines or more must have a docstring
  • functions with 3 or more arguments must use named arguments
  • test docstrings must follow the Gherkin “GIVEN/WHEN/THEN” format.
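
ggflake8 is an internal plugin and its code is not shown in this article. As a rough idea of what such a rule involves, here is a sketch of how the first rule in the list above could be checked with Python's standard ast module (Python 3.8 or later, for end_lineno). The function name and the min_lines parameter are hypothetical, and a real plugin would additionally have to be packaged and registered with flake8.

```python
import ast


def functions_missing_docstring(source, min_lines=20):
    """Return (name, line) for each function of min_lines or more lines
    that has no docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Number of source lines spanned by the function, "def" included.
            length = node.end_lineno - node.lineno + 1
            if length >= min_lines and ast.get_docstring(node) is None:
                missing.append((node.name, node.lineno))
    return missing
```

Each returned pair gives the function name and the line of its def statement, which is what a linter needs to report the violation.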

black

  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
        args: [--config, backend/pyproject.toml]

We chose to add this strongly opinionated formatter on top of flake8 to remove all discussion about formatting. As their documentation says:

Black is the uncompromising Python code formatter. By using it, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.

Other good alternatives include pylint (a linter) and autopep8 (a formatter).

isort

  - repo: https://github.com/pycqa/isort
    rev: 5.10.1
    hooks:
      - id: isort
        args: [--settings-path, backend/pyproject.toml]

As their documentation says: “isort your imports, so you don't have to”. It’s a handy Python utility that takes care of formatting the imports by sorting them alphabetically and separating them into sections and by type. One less thing to worry about!

prettier

  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v2.5.1
    hooks:
      - id: prettier

prettier and eslint are used to format our JSON, YAML, and Markdown files.

check-*

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
      - id: check-json
      - id: check-yaml
      - id: check-added-large-files

The first two hooks check the syntax of JSON and YAML files, while check-added-large-files ensures that no one commits a huge file by mistake.
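
The idea behind these checks is simple enough to sketch with the standard library. The snippet below is only an illustration of the principle, not the code of the hooks themselves; the function names and the 500 kB threshold are assumptions made for this example.

```python
import json
import os


def json_syntax_error(text):
    """Return a short error description if text is not valid JSON, else None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


def exceeds_size_limit(size_in_bytes, max_kb=500):
    """Return True if a size in bytes is above max_kb kilobytes."""
    return size_in_bytes > max_kb * 1024


def added_file_too_large(path, max_kb=500):
    """Apply the size limit to a file on disk."""
    return exceeds_size_limit(os.path.getsize(path), max_kb)
```

A hook built on these helpers would run them on each staged file and exit with a non-zero status as soon as one of them reports a problem.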

commitizen

  - repo: https://github.com/Woile/commitizen
    rev: v2.20.3
    hooks:
      - id: commitizen
        stages: [commit-msg]

commitizen makes sure our commit messages meet our company requirements: a format derived from semantic-release in which we also require the number of the related GitLab issue. Here is an example of a valid GitGuardian commit message:

chore(pre-commit): #2345 add commitizen hook
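
To illustrate the kind of rule being enforced, here is a sketch of a validator for this format. The regular expression is inferred from the single example above, and the list of allowed types is an assumption; commitizen has its own configuration mechanism, which is not shown here.

```python
import re

# Assumed list of allowed types: the article only shows "chore".
ALLOWED_TYPES = (
    "build", "chore", "ci", "docs", "feat",
    "fix", "perf", "refactor", "style", "test",
)

# type(scope): #issue subject
COMMIT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"\((?P<scope>[\w-]+)\)"
    r": #(?P<issue>\d+) "
    r"(?P<subject>\S.*)$"
)


def is_valid_commit_message(message):
    """Check the first line (the commit title) against the pattern."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None
```

With this pattern, the example message above is accepted, while the same message without the issue number is rejected.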

codespell

  - repo: https://github.com/codespell-project/codespell
    rev: v2.1.0
    hooks:
      - id: codespell

codespell checks for typos. We chose this tool because it is based on a list of common typos, which reduces the number of false positives to a minimum.

It turned out to be a very useful tool: what a relief not to have to reject your colleague's MR because of a minor typo!
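
The reason a list of known typos produces few false positives can be seen in a small sketch. A word is only flagged when it appears in the list of misspellings, so unknown identifiers and project names are left alone. The four-entry dictionary and the function name below are made up for the illustration; codespell ships a much larger dictionary.

```python
import re

# Tiny illustrative sample of misspelling -> correction pairs.
COMMON_TYPOS = {
    "recieve": "receive",
    "seperate": "separate",
    "occured": "occurred",
    "formating": "formatting",
}


def find_typos(text):
    """Return (line_number, typo, suggestion) for every known misspelling."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            suggestion = COMMON_TYPOS.get(word.lower())
            if suggestion is not None:
                findings.append((number, word, suggestion))
    return findings
```

A checker based on a dictionary of valid words would flag names such as ggshield or pyproject; this approach does not, at the cost of missing typos that are not in the list.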

ggshield

  - repo: https://github.com/gitguardian/gg-shield
    rev: v1.12.0
    hooks:
      - id: ggshield

How silly would it be to not use our own software?

Pre-commit hooks are also a great place to run security tests. As with all tests, the sooner problems are detected the better. This is especially true for security issues, which can have disastrous impacts.

ggshield is one of the tools we develop at GitGuardian to help secure the codebase. Integrated as a hook, it scans the content of the git patch to make sure it does not contain any secrets, such as an API token.
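
The general principle of scanning a patch can be sketched as follows. This is not how ggshield is implemented: the two regular expressions and the function name are assumptions chosen for the illustration, and real secret detection relies on far more detectors and validation. The sketch only looks at added lines, since removed lines do not enter the new commit.

```python
import re

# Illustrative patterns only (assumptions for this sketch).
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic token assignment": re.compile(
        r"(?i)\b(?:api_?key|token|secret)\b\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]"
    ),
}


def scan_patch(patch):
    """Return (line_number_in_patch, pattern_name) for added lines that match."""
    findings = []
    for number, line in enumerate(patch.splitlines(), start=1):
        # Keep added lines only; "+++" is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line[1:]):
                findings.append((number, name))
    return findings
```

The patch to scan could come from git diff --cached, and the hook would block the commit whenever the list of findings is not empty.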

Usage

Now that we have our pre-commit hooks installed and set up, they will run every time we try to commit:

Hooks run following a commit (skipped here because no files)

If, for any reason, you want to skip one or all hooks, you can easily do so:

  • to skip all hooks, add the -n (--no-verify) flag: git commit -m "message" -n
  • to skip only one hook, use the SKIP environment variable: SKIP=flake8 git commit -m "message"

Conclusion

Pre-commit hooks are a must-have in any project: they are easy to set up and offer huge value. Having used them, I would say, in my very personal opinion, that not using them would feel almost as crazy as not using Git! (I’m exaggerating a bit, but you get the idea ;) )

Nevertheless, this tool is not infallible, as hooks can easily be skipped or never installed at all. That is why it is important to maintain server-side CI tests and jobs, especially the security-related ones: pre-commit hooks and CI jobs are complementary. It also shows that, for security tests, a complementary solution that scans the VCS server-side is still necessary.