Since its inception in 2017, GitGuardian has been advocating for improved code security, particularly for open-source code, which is highly vulnerable due to its exposure to the public.
GitHub, the largest open-source community, is also where the GitGuardian story started. The GitGuardian App steadily gained popularity and became the top security app on the GitHub Marketplace.
Preventing secret leaks on GitHub's massive scale is a significant challenge with serious security implications for individuals and organizations across the world. We send over 5,000 pro-bono emails daily to alert contributors after detecting a hard-coded secret in their patches.
In February 2024, GitHub stepped up its protection of open source by enabling push protection by default for all public repositories, a big step forward for code hygiene. 🎉 The rollout to all users began on March 1, 2024, and may take a week or two to reach every account.
In this blog, we'll provide a summary of what to expect from this feature and its limitations. Our aim is to assist you in determining whether this protective layer is sufficient for your specific situation.
What is GitHub Push Protection?
Push protection prevents secret leaks by scanning for highly identifiable secrets before they are pushed.
“When a secret is detected in code, developers are prompted directly in their IDE or command line interface with remediation guidance to ensure that the secret is never exposed.”
In practice, GitHub blocks any push to a remote branch that contains a leaky commit, which keeps the leaked credentials from spreading on the Git server. The remote refuses the push and responds with information about the leak.
Developers will have the choice to either eliminate the secret from their commits or proceed if it's deemed safe.
It is important to note that previously, the push protection feature required activation through the dashboard to be effective. However, following this announcement by GitHub, it will now be enabled by default.
GitHub gives the contributor the option to bypass push protection by following a URL. If a contributor bypasses a push protection block for a secret, GitHub:
- Creates an alert in the Security tab of the repository.
- Adds the bypass event to the audit log.
- Sends an email alert to the organization or personal account owners, security managers, and repository administrators who are watching the repository, with a link to the secret and the reason why it was allowed.
What You Need to Keep in Mind
Push Protection and Local History Rewrites
When the push protection triggers, the secret needs to be removed from all the commits it appears in. From the developer's perspective, this creates a lot of friction, as rewriting the local commit history is not a trivial task.
For a frictionless developer experience, the best approach is to introduce secret scanning at the pre-commit stage. This stops the secret from entering the VCS (Git in this case) in the first place and spares the developer a tedious history rewrite (which can even make the situation worse).
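To make the pre-commit idea concrete, here is a minimal sketch of a Git hook written in Python. It is an illustration under stated assumptions, not how ggshield or GitHub detect secrets: the single pattern (the well-known shape of an AWS access key ID) and the function names `find_secrets` and `main` are ours, chosen only for the example.

```python
import re
import subprocess
import sys

# Hypothetical detector list for illustration: one pattern only,
# the publicly documented shape of an AWS access key ID.
PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(diff_text):
    """Return (diff_line_number, detector_name) for each added line that matches."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/<file>" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


def main():
    # `git diff --cached` shows what is staged for the next commit.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, errors="replace", check=True,
    ).stdout
    findings = find_secrets(diff)
    for number, name in findings:
        print(f"possible {name} on diff line {number}", file=sys.stderr)
    # A non-zero exit status makes Git abort the commit.
    return 1 if findings else 0
```

To try it, save the script as `.git/hooks/pre-commit`, add a shebang line and a final `sys.exit(main())` call, and make the file executable. Because the check runs before the commit object exists, there is no history to rewrite when it fires.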
No Git History Scanning
Push protection is a real-time detection mechanism that can effectively reduce the likelihood of a secret entering a remote Git branch. However, relying on it alone can create a false sense of security, because it will not detect secrets that were hard-coded in the past. It's not uncommon for a historical scan to surface hundreds or even thousands of incidents, many of them still exploitable.
To avoid any loophole in your security posture, we encourage you to assess your repositories' health by performing a historical scan at least once. This is done automatically when integrating the GitGuardian platform with your sources.
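As a rough illustration of what a historical scan involves, the sketch below walks every commit on every ref and reports the commits that added a matching line. It is a simplified assumption of the approach, not the GitGuardian implementation: the single AWS-style pattern, the `__commit__` marker, and the names `read_history` and `scan_history` are hypothetical.

```python
import re
import subprocess

# Illustrative pattern only: the documented shape of an AWS access key ID.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Hypothetical marker placed in front of each commit hash in the log output.
COMMIT_MARKER = "__commit__ "


def read_history(repo_path="."):
    """Dump every commit reachable from any ref, with its patch, as text."""
    return subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p",
         f"--pretty=format:{COMMIT_MARKER}%H"],
        capture_output=True, text=True, errors="replace", check=True,
    ).stdout


def scan_history(log_text):
    """Return, in log order, the commits whose patches add a matching line."""
    hits = []
    commit = None
    for line in log_text.splitlines():
        if line.startswith(COMMIT_MARKER):
            commit = line[len(COMMIT_MARKER):]
        elif line.startswith("+") and not line.startswith("+++"):
            if AWS_KEY_ID.search(line) and commit not in hits:
                hits.append(commit)
    return hits
```

Calling `scan_history(read_history("path/to/repo"))` still reports a secret that was added in one commit and deleted in a later one. The current state of the files looks clean, but the credential remains readable in the history.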
Limitations of Secret Detection
According to GitHub,
“This feature proactively prevents leaks by scanning for secrets before 'git push' operations are accepted, and it works with 69 token types (API keys, private keys, secret keys, authentication tokens, access tokens, management certificates, credentials, and more) detectable with a low "false positive" detection rate.”
The list of detected token types contains popular service tokens such as AWS, Azure, and Stripe. But is it enough? If we look at the number of web APIs, it is estimated that in 2022 there were more than 24,000 in the world, and this number is growing very fast. New services appear every day, and some have the chance to gain in popularity at an explosive rate. Take, for example, the number of OpenAI API keys found in public commits in 2022:
Developer adoption is fast-paced, which means that it’s almost impossible to predict which service will be popular six months from now. In fact, detecting new, previously unlisted tokens is a strong predictor of the emerging popularity of a given service provider.
Since the feature was announced last year, GitGuardian has already detected over 10 million leaked secrets on public repositories. With 420 detectors, GitGuardian offers broader coverage than GitHub, which has 254 detectors, only 97 of which support push protection.
Not All Secrets Are Easily Identifiable
At first, limiting the detection capability to credentials with “a low false positive detection rate” sounds like a good idea. But there is a caveat. It means that many credentials are going to fall through the cracks. Why? Because not all credentials are easily identifiable.
In fact, in 2022, what we refer to as generic secrets accounted for no less than two-thirds (67%) of the secrets detected (source: the State of Secrets Sprawl 2023).
In short, limiting secret detection to a set of secrets that are almost 100% certain to be secrets leaves room for undetected leaks. GitHub push protection focuses on specific secrets and misses generic ones. It's important to keep this in mind: assuming complete immunity against a vulnerability, only to realize later that the protection was partial, is worse than knowing the gap from the start.
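The difference between specific and generic secrets can be sketched in a few lines. The example below is a toy heuristic under our own assumptions, not GitGuardian's or GitHub's detection logic: the keyword list, the 16-character minimum, and the 3.5 bits-per-character entropy threshold are arbitrary values picked for illustration.

```python
import math
import re
from collections import Counter

# Specific detector: AWS access key IDs carry a recognizable prefix.
SPECIFIC = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Generic candidate: a quoted string assigned to a suspicious-looking name.
GENERIC = re.compile(
    r"(?:password|passwd|secret|token|api_?key)\w*\s*[:=]\s*['\"]([^'\"]{16,})['\"]",
    re.IGNORECASE,
)


def shannon_entropy(value):
    """Bits per character of the string's empirical character distribution."""
    counts = Counter(value)
    total = len(value)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def classify(line, min_entropy=3.5):
    """Label a line 'specific', 'generic', or None (illustrative thresholds)."""
    if SPECIFIC.search(line):
        return "specific"
    match = GENERIC.search(line)
    if match and shannon_entropy(match.group(1)) >= min_entropy:
        return "generic"
    return None
```

A prefixed token is matched with near certainty by a single pattern. A generic secret has no such signature, so a detector has to weigh context (the variable name) and randomness (the entropy), which is exactly where false positives and false negatives creep in.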
Ineffective for Private Repositories Made Public
Finally, an often overlooked type of leak occurs when a private or internal repository is intentionally or unintentionally made public. This is a sensitive event as it exposes the entire repository history where credentials could be discovered. Since it's disconnected from the git flow, only historical analysis of all the commits from all the branches of a repository can detect secret leaks.
It's important to note that push protection is ineffective if a private repository is made public.
Push Scans Have a Size Limit
GitHub's push protection effectively imposes a size limit: if scanning a push takes longer than a certain threshold, typically 5 seconds, the scan may time out and the push is not scanned. With millions of developers on the platform, GitHub opts not to block users during lengthy pushes to avoid overwhelming them. As a result, such pushes can go through without being checked by push protection.
Conclusion
Although push protection is a great step towards securing open-source repositories, it's important to acknowledge the gaps that remain uncovered. It's crucial to avoid assuming complete protection against the vast attack surface on GitHub at the click of a button.
A multi-layered approach to security is always beneficial, so we encourage combining push protection with protection at the pre-commit stage, for example by installing ggshield as a pre-commit hook. Additionally, history scanning is a separate field that requires Public Monitoring for proactive threat-hunting purposes. For a more comprehensive comparison between GitHub Advanced Security and GitGuardian's enterprise-level capabilities, check out this page.