Understanding secret sprawl and the attack surface

This is the first in a series of articles that will take a deep dive into secrets within source code:

How secrets like API keys and credentials sprawl
Why secrets in git can be especially problematic
How secrets detection algorithms work

In this article, we will look at the concept of secret sprawl, the unwanted distribution of secrets through multiple systems, and how we can prevent it.

What is a secret and how do we use them?

To really understand how secrets sprawl, we first need to understand what a secret is, and how we use secrets.

Secrets in software development
A secret can be any sensitive data that we want to keep private. When discussing secrets in the context of software development, secrets generally refer to digital authentication credentials that grant access to services, systems and data. These are most commonly API keys, usernames and passwords, or security certificates.

How secrets are used
Applications are no longer standalone monoliths. They now rely on thousands of independent building blocks: cloud infrastructure, databases, and SaaS components such as Stripe, Slack and HubSpot. This is a significant shift in software development. Secrets are what tie together these different building blocks of a single application by creating a secure connection between each component.

Building applications as clusters of independent services provides huge advantages when compared to the monolithic approach. This includes the ability to independently update services, rapidly scale applications and offload development work to dedicated external services. However, the trade-off is that we now need to manage hundreds, even thousands of secrets for each component and these secrets are the crown jewels of our organizations. Secrets grant access to the most sensitive systems.

Monolith architecture https://twitter.com/sebiwicb

The secret conundrum

Because these secrets tie together each component of an application, developers need access to these secrets to build, connect, deploy and test applications. This creates a dilemma because secrets are extremely sensitive, yet they need to be accessible to developers, applications and infrastructure.

Tackling this dilemma needs well-considered and often complex secrets management systems and policies. If not treated with extreme consideration and care, your secrets will easily sprawl through many different systems.

How secrets sprawl across networks

If a person wants to get from point A to point B, it is only natural that they choose the path of least resistance. Let’s face it: we are not going to pick a path full of obstacles if we can avoid it.

The same applies to handling secrets. Keeping secrets encrypted and tightly wrapped makes it harder for developers to both access and distribute them. This can lead us to take the path of least resistance, which may include hardcoding secrets into source code, distributing them through email or messaging systems like Slack, saving them directly into config files, or storing them inside internal wikis.
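
As an illustration of one lower-risk alternative to hardcoding, here is a minimal Python sketch that reads a secret from an environment variable at runtime. The function name `load_secret` and the variable name `PAYMENT_API_KEY` are hypothetical and chosen only for this example; how the variable gets set (a secrets manager, a deployment pipeline) is outside its scope.

```python
import os

# Anti-pattern (shown commented out): the secret is embedded in the source
# code and travels wherever the code goes.
# PAYMENT_API_KEY = "example-not-a-real-key"


def load_secret(name: str, environ=os.environ) -> str:
    """Return the secret stored in the environment variable `name`.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to exercise in tests.
    """
    value = environ.get(name)
    if not value:
        # Fail loudly instead of falling back to a hardcoded default.
        raise RuntimeError(
            f"Missing required secret: set the {name} environment variable"
        )
    return value
```

Calling `load_secret("PAYMENT_API_KEY")` raises an error when the variable is unset, so a missing secret is noticed at startup rather than papered over with a hardcoded fallback.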

The danger of this may not immediately be apparent as all these systems still have some level of access control, but once secrets start to enter different systems you lose both:

Control over where your secrets end up and who has access.
Visibility over where your secrets are.

Why is secret sprawl dangerous: the “attack surface”

Take the example of hardcoding secrets into source code. That code could be shared in a message on Slack, uploaded into a git repository, cloned onto multiple professional and personal workstations, forked into a different project, included in a package distributed through a package manager and could ultimately end up accessible to a malevolent actor.

Everyone who has access to those systems now has access to the secrets inside. More concerning, though, is that you have no visibility over where the secrets will ultimately end up, and you might not even know when they are compromised.

Even if secrets don’t end up in a public space on the internet (for example inside a public git repository), they should still be considered compromised if they exist within any insecure system. Consider credit card numbers stored in plain text: would you include them in a git repository? Share them on Slack? Secrets are just as sensitive, if not more so.

When secrets sprawl through multiple systems, they increase what is referred to as the ‘attack surface’: the number of points where an unauthorized user could gain access to your systems or data. In the case of secret sprawl, each time a secret enters another system, it creates another point where an attacker could gain access to your secrets.

Think of secrets as doors into your organization: the more doors there are, the greater the chance someone finds one. These doors can be used not only to access sensitive data or systems; they could also provide access to additional systems and be used to move laterally between them. In other words: one secret can unlock multiple doors in your organization.

What makes unauthorized access through secrets especially concerning is that once someone is authenticated through a secret, they act and appear to be a valid user. This makes it very hard to detect and allows an attacker to hide or squat inside systems undetected for long periods of time.

Preventing secret sprawl

There is no magic solution I can offer in this section but the prevention of secret sprawl centers around four connected elements:

Implementation of policies and best practices when handling secrets
Fine-grained visibility into systems and services
Enforcement of policies and practices
Secure storage of secrets

Implementing automated secrets detection and developer-first remediation.

Creating policies around the handling of secrets and using coding best practices are important steps in combatting secret sprawl, but these are useless if not enforced.

The reality is that most developers and organizations are operating blind, with no visibility into their systems, which results in no visibility around practices and policies. When you cannot see the problem, you cannot enforce the solution. This is why combatting secret sprawl starts with fine-grained visibility of your systems.

GitGuardian offers high-precision, high-recall secrets detection in systems such as git. It supports both historical and incremental scanning and alerts the developer and the application security team in real time, to make sure exposed credentials are properly revoked, secrets policies are followed and compliance standards are met.
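
To make the idea of automated detection concrete, the following is a deliberately simplified Python sketch. It is not GitGuardian's algorithm. It assumes two common heuristics: a regular expression that spots assignments to names containing words like key, secret, token or password, and a Shannon entropy threshold that filters out low-randomness values. The pattern, the `min_entropy` default of 3.5 and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical pattern: a name containing key/secret/token/password, then
# ':' or '=', then a quoted string of at least 16 characters.
ASSIGNMENT = re.compile(
    r"""\b[\w.-]*(?:key|secret|token|password)[\w.-]*\s*[:=]\s*["']([^"']{16,})["']""",
    re.IGNORECASE,
)


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidate_secrets(source: str, min_entropy: float = 3.5):
    """Return (line_number, value) pairs for values that look like secrets."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in ASSIGNMENT.finditer(line):
            candidate = match.group(1)
            # Random-looking strings have high entropy; repeated or
            # dictionary-like values tend to score lower and are skipped.
            if shannon_entropy(candidate) >= min_entropy:
                findings.append((line_number, candidate))
    return findings
```

A sketch like this will produce both false positives and false negatives; it only shows the general shape of scanning text for candidate secrets.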

GitGuardian's secrets detection solution

Store secrets securely with tight access control.

Once you gain visibility over your systems, you need to be able to store your secrets in a secure location and wrap them with tight access control. What this system needs to look like really differs between projects. What is important is that your secrets remain encrypted at rest and in transit.

For large projects, consider a ‘secrets as a service’ product like HashiCorp Vault, which not only securely stores your secrets with high-level access control but also enables dynamic secrets (short-lived credentials generated on demand, giving you greater control) and provides access logs. These systems are very resource-heavy to set up and operate. (Also see AWS Secrets Manager.)

Smaller teams may consider encrypting secrets and storing them within git, which makes distribution easy. You could use an open-source solution like git-secret. This requires few resources but introduces a master key that can be hard to distribute, and it provides no access logs.

Wrap up

Secrets like API keys, security certificates and credentials are highly sensitive because they provide access to sensitive data and systems. In the wrong hands, these secrets can allow an attacker to remain inside systems undetected.

Because secrets are so widely used in the modern software development life cycle, it is easy for them to sprawl through git repositories, messaging systems, infrastructure and workstations.

This increases the attack surface, making it easier for an attacker to discover secrets and even move laterally through systems. To prevent secret sprawl, developers should consider investing in education, using automated secrets detection products like GitGuardian and adopting secrets management systems like HashiCorp Vault.

The next blog will take a deep dive into secrets within git and why it is such a plague.