Machine Identity Management: Strengthen Secrets Security

TL;DR: Machine identity management is now critical as machine identities vastly outnumber human ones, driving secrets sprawl and risk. Hard-coded API keys, certificates, and tokens are frequently leaked and rarely rotated.
Effective security demands automated discovery, inventory, and frequent rotation of all machine credentials.
Adopting Zero Trust principles and leveraging integrated solutions like GitGuardian and CyberArk are essential to proactively reduce exposure and maintain operational resilience.

In 2025, GitGuardian released the State of Secrets Sprawl report. The findings speak for themselves: with over 24 million secrets detected in public GitHub repositories, it is clear that hard-coded plaintext credentials are a serious problem. Worse yet, the problem is growing year over year, up from 19 million found the previous year.

When we dig a little deeper into these numbers, one overwhelming fact springs out: specific secrets, the vast majority of which are API keys, outnumber generic secrets in our findings by a significant margin. This makes sense when you realize that API keys are used to authenticate specific services, devices, and workloads within our applications and pipelines to enable machine-to-machine communication. It is also very much in line with research from CyberArk, which found that machine identities outnumber human identities by a factor of 45 to one. This gap will only widen as we integrate more and more services into our codebases with ever-increasing velocity.

Specific vs Generic secrets in the State of Secrets Sprawl Report 2024 findings

Secrets sprawl is clearly a problem for both human and machine identities, so why should we call out this distinction?

Machine identities

GitGuardian will be leaning into the term "machine identities" moving forward as a way to distinguish this area of secrets sprawl, and its unique challenges, from human identities and credentials. Each is problematic, but each calls for a different approach. In standardizing this terminology, we are following the naming convention used by industry leaders in secrets management, such as CyberArk, and by the analyst firms who define the industry, such as Gartner. Gartner defines the term in their 2020 IAM Technologies Hype Cycle report as, "Simply put, a machine identity is a credential used by any endpoint (which could be an IoT device, a server, a container, or even a laptop) to establish its legitimacy on a network." This term covers all API access keys, certificates, public key infrastructure (PKI), and any other means of authenticating machine-to-machine communication.

Is a machine identity the same as a non-human Identity?

From a purely grammatical perspective, it must be a non-human identity if it is not a human identity. So why use the specific term "machine identity"? Well, practically speaking, a non-human could be a dog, a plant, or even a planet. When using the term "non-human", we must further qualify what we mean, while the term "machine identity" already has a widely accepted definition that narrows the scope to the secrets sprawl problem space.

For example, Venafi, a leading machine identity management platform, succinctly states, "The phrase “machine” often evokes images of a physical server or a tangible, robot-like device, but in the world of machine identity management, a machine can be anything that requires an identity to connect or communicate—from a physical device to a piece of code or even an API."

Types of Machine Identities and Their Security Implications

While the current discussion focuses on API keys as the primary form of machine identity, organizations must understand the full spectrum of machine identity types to implement comprehensive security controls. TLS certificates establish encrypted connections between machines, ensuring data transmitted over networks remains secure from interception. SSH keys authenticate machines for secure shell access, enabling secure command execution and file transfers between systems. Code signing certificates verify the authenticity and integrity of software, ensuring code hasn't been tampered with and originates from a trusted source. Cloud service identities manage access to cloud resources through service accounts and specialized tokens. Each type presents unique lifecycle management challenges. TLS certificates may have multi-year lifespans requiring careful renewal tracking, while cloud service tokens might rotate daily. Understanding these distinctions is crucial for managing machine identities effectively, as each type requires tailored security controls, monitoring approaches, and rotation strategies to prevent compromise and maintain operational continuity.

How did we get here?

Before we can talk about what to do about the issues of machine identities and secrets sprawl, it might be helpful to take a historical look at how we arrived at this point in the industry. In the early days of computer science, the only "entities" we had to worry about accessing our machines and our code were humans. In the days of ENIAC or early UNIX systems, a simple password and perhaps sturdy locks on the doors were really all you needed to ensure only the proper people could access a system. People love passwords, and have for thousands of years. Roman garrisons used "watchwords", which needed to be updated nightly, meaning we have been practicing manual password rotation for a couple of millennia now.

So when it came time to implement machine-to-machine authentication, ensuring that only trusted systems could recognize and communicate with one another, it was only natural that we would turn to our old friend the password, in the form of a long, hard-to-guess token, to get the job done. This system works okay until you remember the problem statement that started this article: we keep leaking these credentials into our code, and into places around our code like Jira, Slack, and Confluence, at an alarming rate.

Discovery and Inventory: The Foundation of Machine Identity Management

Before organizations can secure their machine identities, they must first discover and catalog every credential within their infrastructure, a challenge that grows exponentially with digital transformation initiatives. Machine identity management begins with comprehensive discovery across code repositories, configuration files, container images, and cloud environments where credentials often hide in plain sight. GitGuardian's secrets detection capabilities reveal that specific secrets, primarily API keys and service tokens, vastly outnumber generic credentials in real-world environments. This discovery process must extend beyond traditional IT assets to include ephemeral containers, serverless functions, and IoT devices that may contain embedded credentials. Organizations should implement automated discovery tools that continuously scan for new machine identities as they're created, ensuring the inventory remains current in dynamic environments. The inventory should capture critical metadata including credential type, expiration dates, associated services, and ownership information. Without this foundational visibility, automating machine identity management becomes impossible, leaving organizations vulnerable to the very secrets sprawl that threatens their security posture.
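To make the inventory metadata above concrete, here is a minimal sketch of what one inventory record could look like. The `MachineIdentity` class, its field names, and the 30-day review window are assumptions chosen for illustration, not a GitGuardian schema. Note that the record stores metadata only, never the secret value itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class MachineIdentity:
    """One inventory entry (hypothetical schema); metadata only, no secret value."""
    name: str
    credential_type: str            # e.g. "api_key", "tls_certificate", "ssh_key"
    service: str                    # the service or workload that uses the credential
    owner: str                      # team accountable for the credential
    expires_at: Optional[datetime]  # None means no expiry is recorded


def needs_attention(inventory, now, window=timedelta(days=30)):
    """Return entries with no owner, no recorded expiry, or an expiry inside `window`."""
    flagged = []
    for item in inventory:
        expiring = item.expires_at is not None and item.expires_at <= now + window
        if not item.owner or item.expires_at is None or expiring:
            flagged.append(item)
    return flagged


# Illustrative data only; a fixed "now" keeps the example deterministic.
NOW = datetime(2025, 1, 1, tzinfo=timezone.utc)
INVENTORY = [
    MachineIdentity("billing-api-key", "api_key", "billing", "payments-team", None),
    MachineIdentity("web-tls", "tls_certificate", "frontend", "platform-team",
                    NOW + timedelta(days=10)),
    MachineIdentity("ci-deploy-key", "ssh_key", "ci", "platform-team",
                    NOW + timedelta(days=200)),
]

for item in needs_attention(INVENTORY, NOW):
    print(item.name, item.credential_type)
```

In this sketch, a credential with no recorded expiry is treated as a finding in its own right, since a secret that never expires is exactly the kind of long-lived credential that makes a leak dangerous.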

Solving both human identity and machine identity sprawl

Now that we have a common vocabulary and understand the two areas of concern, human and machine, what are our next steps? Let's start with human identities. People need to be able to authenticate to gain access to systems to get their work done. Using phishing-resistant MFA, preferably hardware-based, at every juncture where a human uses a password is a solid approach. Even if a password is leaked, it is much harder to exploit and gives the user time to rotate the credential. While not a silver bullet, Microsoft believes this could stop up to 99.9% of fraudulent sign-ins. Even better, if there is a way to eliminate that password, such as with a passkey using FIDO2 or hardware-based biometrics for authentication, then we should probably move in that direction.

Dealing with machine resources requires a different approach, as we can't just turn on MFA for machines. We also can't disrupt these machine identities, as the business of the enterprise is to do business, and the connections must keep working for our systems to function and satisfy the availability leg of the CIA Triad. Similarly, we cannot devote endless resources and hours to this issue, as new vulnerabilities in the form of CVEs, misconfigurations, and licensing issues remain areas security teams also need to tackle.

In an ideal world, we could immediately move all of our systems to leverage short-lived certificates or JWTs that are issued at run time when needed and only live for the life of the request. Indeed, there are frameworks such as SPIFFE and its implementation, SPIRE, that can help organizations achieve this goal. While this is indeed a great approach, it has the real-world issues of developer adoption, development time and effort, and the overhead of running such services at scale.
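The idea of a credential that is issued at run time and expires shortly afterward can be sketched with nothing but the Python standard library. This is an illustration under stated assumptions, not how SPIFFE or SPIRE work internally: it signs a JWT-shaped token with a shared HMAC key (HS256), whereas real deployments typically use asymmetric keys and a vetted JWT library. The function names `issue_token` and `verify_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(key: bytes, signing_input: str) -> str:
    return _b64url(hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest())


def issue_token(key: bytes, workload: str, ttl_seconds: int, now: float = None) -> str:
    """Issue an HS256-signed token that expires `ttl_seconds` after `now`."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    claims = _b64url(json.dumps(
        {"sub": workload, "iat": int(now), "exp": int(now) + ttl_seconds}
    ).encode("utf-8"))
    signing_input = f"{header}.{claims}"
    return f"{signing_input}.{_sign(key, signing_input)}"


def verify_token(key: bytes, token: str, now: float = None):
    """Return the claims if the signature is valid and the token is unexpired, else None."""
    now = time.time() if now is None else now
    try:
        header, claims, signature = token.split(".")
        expected = _sign(key, f"{header}.{claims}")
        # Constant-time comparison; bytes avoid errors on non-ASCII input.
        if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
            return None
        payload = json.loads(_b64url_decode(claims))
    except ValueError:
        return None  # malformed token
    return payload if now < payload["exp"] else None


# The key is generated at run time here; it is never hard-coded.
signing_key = secrets.token_bytes(32)
token = issue_token(signing_key, "payments-worker", ttl_seconds=60, now=1_000)
print(verify_token(signing_key, token, now=1_030) is not None)  # within the TTL
print(verify_token(signing_key, token, now=1_061) is not None)  # after expiry
```

The point of the short TTL is that a leaked token stops being useful within seconds or minutes, without anyone having to notice the leak and rotate it.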

While we can dream up many such ideal scenarios, we need to address the current situation head-on. Developers will continue to use machine identities, which can be leaked and exploited by attackers. At the same time, we know that if a malicious actor gets their hands on a secret, they can only leverage it if it is still valid. We believe the best practical solution for any organization is to rotate secrets much more frequently.

Implementing Zero Trust Principles for Machine-to-Machine Authentication

Traditional perimeter-based security models fail in today's distributed architectures where machine identities facilitate constant communication between services, APIs, and cloud workloads. Machine identity management must embrace Zero Trust principles, treating every machine identity as potentially compromised and requiring continuous verification rather than one-time authentication. This approach demands implementing mutual TLS (mTLS) for service-to-service communication, ensuring both parties authenticate each other using certificates rather than relying on network location or static tokens. Organizations should deploy certificate-based authentication with short-lived credentials that automatically rotate, reducing the window of opportunity for attackers who compromise machine identities. API security protocols like OAuth 2.0 with the client credentials flow provide additional layers of verification for machine-to-machine interactions. Continuous monitoring becomes essential, tracking authentication patterns and flagging anomalous behavior that might indicate compromised credentials. By applying Zero Trust principles to vendor-facing and internal systems alike, organizations create resilient architectures where compromised credentials cannot easily facilitate lateral movement or unauthorized access to sensitive resources and data.
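As a rough sketch of the mTLS requirement described above, the following uses Python's standard `ssl` module to build a server-side context that refuses any client that does not present a certificate signed by a trusted CA. The function name and the file path parameters are hypothetical placeholders; the certificates themselves would come from your own PKI.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    cert_file: Optional[str] = None,       # hypothetical path to the server certificate
    key_file: Optional[str] = None,        # hypothetical path to the server private key
    client_ca_file: Optional[str] = None,  # hypothetical path to the CA that signs client certs
) -> ssl.SSLContext:
    """Return a server context that requires and verifies client certificates."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # This line turns ordinary TLS into mutual TLS: the handshake fails
    # unless the client presents a certificate that chains to a trusted CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if client_ca_file:
        context.load_verify_locations(cafile=client_ca_file)
    if cert_file and key_file:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = build_mtls_server_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
```

With real certificate paths supplied, the returned context can be passed to a server such as `http.server` or an ASGI/WSGI server that accepts an `SSLContext`. The client side mirrors this by loading its own certificate chain and verifying the server's.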

Automatically rotating secrets more frequently

One of the other stand-out findings from our State of Secrets Sprawl Report was the fact that of all the valid secrets we discovered in public, over 90% were still valid five days later. We believe this points to the fact that teams expect secrets to be long-lived and that the current manual approach to secrets rotation is hard. Further evidence of these conclusions can be found in breach reports involving companies such as Cloudflare.

In our Secret Management Maturity Model white paper, a clear differentiator for organizations in the Advanced and Expert categories is that they have adopted regular credential rotation policies. It is very unlikely these mature organizations are doing manual rotation, as that would be an overwhelming, time-consuming, and error-prone process, one that could spell disaster in our interconnected architectures.

Expert level of the Secrets Management Maturity Model

We need a way to automate the rotation process. The good news is that awesome tools are available, such as CyberArk's Conjur or AWS Secrets Manager, that make the process of auto-rotation pretty straightforward. Of course, this assumes all of your machine identities already live entirely within those systems.
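To show what automated rotation has to get right, here is a small in-memory sketch. It is not the Conjur or AWS Secrets Manager API; the `RotatingSecret` class and its `max_age` and `grace` parameters are assumptions made for illustration. The key idea is that the previous value stays valid for a short grace period after rotation, so consumers that have not yet fetched the new value are not cut off.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


def _same(a: str, b: str) -> bool:
    """Constant-time string comparison."""
    return secrets.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


@dataclass
class RotatingSecret:
    """Hypothetical in-memory secret that rotates once it exceeds `max_age`."""
    max_age: timedelta
    grace: timedelta
    current: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    issued_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    previous: Optional[str] = None
    previous_expires_at: Optional[datetime] = None

    def rotate_if_due(self, now: datetime) -> bool:
        """Rotate when the current value is older than `max_age`; return True if rotated."""
        if now - self.issued_at < self.max_age:
            return False
        self.previous = self.current
        self.previous_expires_at = now + self.grace
        self.current = secrets.token_urlsafe(32)
        self.issued_at = now
        return True

    def is_valid(self, candidate: str, now: datetime) -> bool:
        """Accept the current value, or the previous one while its grace period lasts."""
        if _same(candidate, self.current):
            return True
        return (
            self.previous is not None
            and now < self.previous_expires_at
            and _same(candidate, self.previous)
        )


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
secret = RotatingSecret(max_age=timedelta(days=1), grace=timedelta(hours=1), issued_at=start)
leaked = secret.current
secret.rotate_if_due(start + timedelta(days=1))
# Two hours after rotation, the leaked value no longer works.
print(secret.is_valid(leaked, start + timedelta(days=1, hours=2)))
```

A scheduler would call `rotate_if_due` periodically. The shorter `max_age` is, the smaller the window in which a leaked value is exploitable.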

Auto-rotation of secrets first means knowing all your machine identities

Now, we could ask for every developer and infrastructure owner to give security teams a list of all their credentials in plaintext for all their various workloads, services, and devices, but obviously, that is a terrible and highly problematic idea.

In all seriousness, what is needed is a scalable end-to-end solution that can help you systematically and automatically find all the plaintext credentials inside of your code base, leaked out onto GitHub publicly, or even found in the communication tools that surround your code. Good news! GitGuardian makes exactly this. This is the heart and soul of our Secrets Detection Platform.
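To give a feel for what automated discovery means at its simplest, the sketch below scans text line by line with a few regular expressions. It is a deliberately naive illustration, far simpler than GitGuardian's detection engine and not a description of how it works. The pattern set, the `scan_text` function, and the sample input are assumptions; the AWS-style key in the sample is the well-known documentation placeholder, not a real credential.

```python
import re

# A tiny, illustrative pattern set. Real detectors cover hundreds of
# credential types and validate candidates to reduce false positives.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_personal_access_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "generic_assignment": re.compile(
        r"(?i)(?:api[_-]?key|secret|token|password)\s*[:=]\s*['\"][^'\"\s]{12,}['\"]"
    ),
}


def scan_text(text, source="<memory>"):
    """Return (source, line_number, detector) for every line that matches a pattern.

    The secret value itself is deliberately left out of the findings.
    """
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for detector, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((source, line_number, detector))
    return findings


SAMPLE = '''\
import os
aws_key = "AKIAIOSFODNN7EXAMPLE"
db_password = os.environ["DB_PASSWORD"]
api_key = "not-a-real-key-1234567890"
'''

for finding in scan_text(SAMPLE, source="settings.py"):
    print(finding)
```

The third line of the sample reads its password from the environment, so it is not flagged; only the two hard-coded values are. Findings report a location and a detector name so they can feed an inventory without copying the secret into yet another system.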

Automating the discovery and auto-rotation of machine identities with Brimstone

GitGuardian has partnered with CyberArk to offer a unique solution for security teams to detect machine identity leaks and manage their remediation effectively. We call this project Brimstone. This innovative integration allows communication between the GitGuardian Secrets Detection platform and CyberArk's Conjur, automatically addressing leaked machine identities across various critical scenarios.

"Unknown" machine identities.
Known machine identities found in sources monitored by GitGuardian.
Publicly exposed machine identities on GitHub.com.

Known machine identities already exist in Conjur and need rotation or revocation, while unknown machine identities should be created there and then auto-rotated.
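The known-versus-unknown distinction can be sketched as a simple triage step. This is a hypothetical illustration, not the Brimstone implementation: it assumes the secrets manager can expose SHA-256 fingerprints of the secrets it holds, so a leaked value can be matched without ever being compared in plaintext. The `Action`, `fingerprint`, and `triage` names are invented for this example.

```python
import hashlib
from enum import Enum


class Action(Enum):
    ROTATE_OR_REVOKE = "already in the vault: rotate or revoke"
    ONBOARD_THEN_ROTATE = "not in the vault: create it there, then auto-rotate"


def fingerprint(secret_value: str) -> str:
    """Hash a secret so it can be matched without handling the plaintext again."""
    return hashlib.sha256(secret_value.encode("utf-8")).hexdigest()


def triage(leaked_fingerprint: str, vault_fingerprints: set) -> Action:
    """Decide how to remediate a leaked secret based on whether the vault knows it."""
    if leaked_fingerprint in vault_fingerprints:
        return Action.ROTATE_OR_REVOKE
    return Action.ONBOARD_THEN_ROTATE


# Illustrative values only.
vault = {fingerprint("value-managed-by-the-vault")}
print(triage(fingerprint("value-managed-by-the-vault"), vault).name)
print(triage(fingerprint("value-nobody-registered"), vault).name)
```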

If you are already a CyberArk Conjur user, reach out to us to schedule a time to discuss how you can take advantage of this integration.

GitGuardian can help no matter where you store your machine identities

While we are very proud of our advanced integration with CyberArk, you can reap the benefits of machine identity discovery no matter where you are on your secrets management maturity journey. Understanding the scope of your issue is the best first step any organization can take to begin fighting secrets sprawl and better securing machine identities. Thanks to GitGuardian's dashboard and API, teams can quickly get a handle on the problem of hard-coded secrets, for machine identities and human identities alike. And with ggshield, we can help you eliminate the issue at the root, on the developer's machine.

If you are struggling with machine identities, we invite you to schedule some time with us to explore what we can do together to make your organization safer while keeping your developers productive and happy. We are all in this together and we would be glad to work with you.

FAQ

What are machine identities, and how do they differ from human identities?

Machine identities are credentials—such as API keys, certificates, or tokens—used by non-human entities (servers, containers, applications, IoT devices) to authenticate and communicate securely. Unlike human identities, which represent individual users, machine identities enable machine-to-machine interactions and require distinct management practices due to their scale, automation, and lifecycle complexity.

Why is machine identity management critical for modern organizations?

Machine identity management is essential because machine identities vastly outnumber human identities and are frequently targeted in breaches. Without robust discovery, inventory, and rotation processes, organizations risk secrets sprawl, unauthorized access, and lateral movement by attackers. Effective management ensures operational continuity and compliance while reducing the attack surface.

What types of machine identities should organizations inventory and secure?

Organizations should inventory API keys, TLS/SSL certificates, SSH keys, code signing certificates, and cloud service identities (such as service accounts and tokens). Each type presents unique security challenges, including varying lifespans, renewal requirements, and usage patterns, necessitating tailored controls and monitoring strategies.

How can organizations automate the discovery and rotation of machine identities?

Automation begins with continuous scanning of code repositories, infrastructure, and cloud environments to discover all machine identities. Solutions like GitGuardian provide automated secrets detection, while integrations with secrets managers (e.g., CyberArk Conjur, AWS Secrets Manager) enable policy-driven, frequent rotation and revocation, minimizing the window of exposure for leaked credentials.

How do Zero Trust principles apply to machine identity management?

Zero Trust mandates that every machine identity is continuously verified, not implicitly trusted. This involves using mutual TLS, short-lived certificates, and continuous monitoring of authentication patterns. By treating all machine-to-machine communications as untrusted by default, organizations can prevent lateral movement and reduce the impact of compromised credentials.

What are the main challenges in scaling machine identity management across large, distributed environments?

Key challenges include discovering all credentials across diverse systems, maintaining an up-to-date inventory, automating rotation without service disruption, and integrating with multiple secrets management platforms. Fragmentation, legacy systems, and developer adoption further complicate centralized governance and automation efforts.

Securing Your Machine Identities Means Better Secrets Management

Dwayne McDaniel
