Introduction
Most developers hate doing things that could be automated.
We often have to accept that automation is not always possible. Fortunately, in the case of code reviews, a lot of things can indeed be automated. As my previous CTO once told me:
Life is too short to review spaces!
Between the fingers of the developers and the eyes of the reviewers there are two main steps where this automated review process can be done:
- as pre-commit hooks
- as CI jobs
In this article, we’ll focus on the pre-commit step. We’ll see how to install and set up pre-commit hooks, and we’ll list the top 8 hooks we use at GitGuardian.
How to set up commit hooks
According to the Git book, Git hooks are a way to fire off custom scripts when certain important actions occur.
In the case of pre-commit hooks, as their name suggests, scripts are run just before the commit is created, allowing us to block it if it doesn’t meet our requirements. The main advantage of launching scripts at this step is that they can detect problems before they even enter the version control system, letting us fix them easily, or even have them fixed automatically.
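To make the blocking mechanism concrete, here is a minimal sketch of a hand-written hook in Python. Git aborts the commit when the pre-commit script exits with a non-zero status. The check itself (trailing whitespace) and the names trailing_whitespace and run_hook are illustrative assumptions, not something taken from our setup.

```python
import re
from typing import Iterable, List

# Matches one or more spaces or tabs at the very end of a line.
TRAILING_WS = re.compile(r"[ \t]+$")


def trailing_whitespace(text: str) -> List[int]:
    """Return the 1-based numbers of the lines that end with spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if TRAILING_WS.search(line)
    ]


def run_hook(paths: Iterable[str]) -> int:
    """Check each file and return the exit status the hook should report."""
    status = 0
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            bad_lines = trailing_whitespace(handle.read())
        for number in bad_lines:
            print(f"{path}:{number}: trailing whitespace")
            status = 1  # any non-zero status makes git abort the commit
    return status
```

To use it as a real hook, the script would be saved as an executable .git/hooks/pre-commit file and would end by passing the result of run_hook to sys.exit. The list of staged files could come, for instance, from git diff --cached --name-only.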
At GitGuardian we use pre-commit, which is a multi-language package manager for pre-commit hooks written in Python. It makes it really easy to install and share the hooks across our organization. You’ll find good alternatives written in other languages, like husky in JavaScript for example.
To set it up:
1. Add pre-commit to your requirements.txt or to your Pipfile (in the dev section).
2. Add a pre-commit configuration file, .pre-commit-config.yaml, with the list of hooks you want. Here is an example from the documentation:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/psf/black
    rev: 21.12b0
    hooks:
      - id: black
You can find here a list of common hooks.
3. Run pre-commit install in your Python environment.
Here is a video showing how to install pre-commit from pip and, as an example, how to install a GitGuardian hook.
That’s it! From now on, whenever you run git commit, all hooks will be launched.
The pre-commit hooks we use at GitGuardian
Let’s begin with the formatter hooks. As the title of this article suggests, the last thing we want when reviewing code is to fatigue ourselves by focusing on formatting. This is why we installed the following hooks:
flake8
- repo: https://github.com/PyCQA/flake8
  rev: 4.0.1
  hooks:
    - id: flake8
      args: [--config, backend/setup.cfg]
      additional_dependencies: [ggflake8==1.2.1]
flake8 parses the modified Python files to make sure that the PEP 8 guidelines are followed, and blocks the commit if that is not the case. On top of it, we developed our own flake8 plugin, which we named ggflake8, to enforce a set of custom rules such as:
- all functions of 20 lines or more must have a docstring
- functions with 3 or more arguments must use named arguments
- test docstrings must follow the Gherkin “GIVEN/WHEN/THEN” format.
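As an illustration of the first rule, here is a minimal sketch that uses Python's standard ast module to list functions of 20 lines or more that have no docstring. It is not the ggflake8 implementation: the function name functions_missing_docstring and the way length is measured (from the def line to the last line of the body) are assumptions made for this example.

```python
import ast
from typing import List

MIN_LINES = 20  # threshold taken from the rule described above


def functions_missing_docstring(source: str, min_lines: int = MIN_LINES) -> List[str]:
    """Return the names of long functions that have no docstring."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on AST nodes since Python 3.8.
            length = node.end_lineno - node.lineno + 1
            if length >= min_lines and ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing
```

A real flake8 plugin would wrap a check like this in the plugin interface and report an error code with a line and column number, which this sketch leaves out.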
black
- repo: https://github.com/psf/black
  rev: 22.3.0
  hooks:
    - id: black
      args: [--config, backend/pyproject.toml]
We chose to add this strongly opinionated formatter on top of flake8 to remove all discussion about formatting. As their documentation says:
Black is the uncompromising Python code formatter. By using it, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.
Other good alternatives include pylint and autopep8.
isort
- repo: https://github.com/pycqa/isort
  rev: 5.10.1
  hooks:
    - id: isort
      args: [--settings-path, backend/pyproject.toml]
As their documentation says: “isort your imports, so you don't have to”. It’s a handy Python utility that will take care of formatting the imports by sorting them alphabetically and separating them by sections and by type. One less thing to worry about!
prettier
- repo: https://github.com/pre-commit/mirrors-prettier
  rev: v2.5.1
  hooks:
    - id: prettier
prettier and eslint are used to format our JSON, YAML, and Markdown files.
check-*
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.1.0
  hooks:
    - id: check-json
    - id: check-yaml
    - id: check-added-large-files
The first two hooks check the syntax of JSON and YAML files, while check-added-large-files ensures that no one commits a huge file by mistake.
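The idea behind a large-file check can be sketched in a few lines of Python. This is only an illustration of the principle, not the code of the actual hook: the function name find_large_files and the max_kb threshold are assumptions chosen for the example.

```python
import os
from typing import Iterable, List


def find_large_files(paths: Iterable[str], max_kb: int = 500) -> List[str]:
    """Return the paths whose size on disk exceeds max_kb kilobytes."""
    limit = max_kb * 1024  # convert the threshold to bytes
    return [
        path
        for path in paths
        if os.path.isfile(path) and os.path.getsize(path) > limit
    ]
```

A hook built on this would exit with a non-zero status whenever the returned list is not empty, which blocks the commit.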
commitizen
- repo: https://github.com/Woile/commitizen
  rev: v2.20.3
  hooks:
    - id: commitizen
      stages: [commit-msg]
commitizen makes sure our commit messages meet our company requirements: a format derived from semantic-release, in which we also require the related GitLab issue number. Here is an example of a valid GitGuardian commit message:
chore(pre-commit): #2345 add commitizen hook
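To show what such a check boils down to, here is a small sketch that validates the first line of a commit message with a regular expression. The pattern is an assumption inferred from the single example above (type, scope in parentheses, issue number, then a subject); it is not our actual commitizen configuration, and the name is_valid_commit_message is hypothetical.

```python
import re

# Hypothetical pattern: "<type>(<scope>): #<issue> <subject>"
COMMIT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)"        # e.g. chore, feat, fix
    r"\((?P<scope>[^()\s]+)\)"  # scope between parentheses
    r": #(?P<issue>\d+) "       # related issue number
    r"(?P<subject>\S.*)$"       # short description
)


def is_valid_commit_message(message: str) -> bool:
    """Check the first line of a commit message against the pattern."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return COMMIT_PATTERN.match(first_line) is not None
```

With this pattern, the example message above is accepted, while the same message without the issue number is rejected.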
codespell
- repo: https://github.com/codespell-project/codespell
  rev: v2.1.0
  hooks:
    - id: codespell
codespell checks for typos. We chose this tool because it is based on a list of common typos, which reduces the number of false positives to a minimum.
It turned out to be a very useful tool: what a relief not to have to reject your colleague's MR because of a minor typo!
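The reason a list-based approach produces few false positives is easy to see in a sketch: only words that appear in a list of known misspellings are reported, so unknown identifiers or jargon are never flagged. The tiny dictionary and the function name find_typos below are assumptions for illustration, not codespell's data or code.

```python
import re
from typing import Dict, List, Tuple

# Tiny illustrative list of misspellings and their corrections.
COMMON_TYPOS: Dict[str, str] = {
    "recieve": "receive",
    "occured": "occurred",
    "seperate": "separate",
    "formating": "formatting",
}

WORD_RE = re.compile(r"[A-Za-z]+")


def find_typos(
    text: str, typos: Dict[str, str] = COMMON_TYPOS
) -> List[Tuple[int, str, str]]:
    """Return (line number, misspelled word, suggestion) for each known typo."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for word in WORD_RE.findall(line):
            suggestion = typos.get(word.lower())
            if suggestion is not None:
                findings.append((line_number, word, suggestion))
    return findings
```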
ggshield
- repo: https://github.com/gitguardian/gg-shield
  rev: v1.12.0
  hooks:
    - id: ggshield
How silly would it be to not use our own software?
Pre-commit hooks are also a great place to run security tests. As with all tests, the sooner problems are detected the better. This is especially true for security issues, which can have disastrous impacts.
ggshield is one of the tools we develop at GitGuardian to help secure the codebase. Integrated as a hook, it scans the content of the git patch to make sure it does not contain any secrets, such as an API token.
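To give a rough idea of what scanning a patch means, here is a toy sketch that looks only at the lines added by a patch and flags assignments that look like hardcoded credentials. It is in no way how ggshield detects secrets: the regular expression, the 16-character minimum, and the name scan_patch are all assumptions made for this example, and a naive pattern like this one would miss most real secrets.

```python
import re
from typing import List

# Hypothetical pattern: a name containing token/secret/password/api_key
# assigned to a quoted string of at least 16 characters.
SECRET_RE = re.compile(
    r"\b\w*(token|secret|password|api_key)\w*\s*[:=]\s*['\"][^'\"]{16,}['\"]",
    re.IGNORECASE,
)


def scan_patch(patch: str) -> List[int]:
    """Return the 1-based patch line numbers of added lines that look like secrets."""
    findings = []
    for number, line in enumerate(patch.splitlines(), start=1):
        # Only added lines matter; "+++" is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if SECRET_RE.search(line):
            findings.append(number)
    return findings
```

Removed lines are ignored on purpose in this sketch, since the goal of a pre-commit check is to stop new secrets from entering the repository.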
Usage
Now that our pre-commit hooks are installed and set up, they will run every time we try to commit.
But if for any reason you want to skip one or all hooks, you can easily do so:
- to skip all hooks, simply add the -n argument: git commit -m "message" -n
- to skip only one hook, use: SKIP=flake8 git commit -m "message"
Conclusion
Pre-commit hooks are a must-have in any project because they are so easy to set up and offer huge value. Having used them once, I would say, in my very personal opinion, that not using them would feel almost as crazy as not using Git! (I'm exaggerating a bit, but you get the idea ;) )
Nevertheless, this tool is not infallible as it can be skipped easily or not be installed at all. That is why it is important to maintain CI server-side tests and jobs, especially the security-related ones. Pre-commit hooks and CI jobs are complementary. It also shows that for security tests, a complementary solution that would scan the VCS server-side is still necessary.