Secret leaks aren't confined to the developer's domain; they seep into infrastructure files like Terraform state, into build automation through logs and binary artifacts, and even into day-to-day communications in messaging systems.
This blog post provides a data-driven breakdown of where secrets have been discovered in recent years, with detailed examples highlighting the risks, moving from the developer's side of the boundary to the lesser-known infrastructure side.
Secrets in developer collaboration tools
At GitGuardian, our systems go deeper than just scanning code; we monitor the entire technology stack to detect secrets wherever they hide. Our data highlights the true extent of secret sprawl:
- For every 42 secrets found in source code, 1 is found in ticketing systems, such as Jira.
- For every 21 secrets found in source code, 1 is found in collaboration tools, such as Confluence.
- For every 9 secrets found in source code, 1 is found in messaging platforms, such as Slack.
These statistics illustrate that while the majority of leaks occur in code, a significant number still escapes into communication and project management tools. This highlights the urgent need for robust secret detection across all systems, not just in code repositories.
Source Code Repositories
Two miscreants snatched from the (...) GitHub code repo the keys needed to access its AWS S3 data stores. [0]
During the 2016 Uber data breach, attackers leveraged AWS credentials found in GitHub code repositories to access the personal data of 57 million Uber users and drivers.
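To make this kind of leak concrete, here is a minimal sketch of how a scanner might flag candidate AWS access key IDs in a file's contents. The pattern is a common heuristic (20 characters, starting with `AKIA` or `ASIA`), and the function name `find_aws_key_ids` is hypothetical; it is not a description of how GitGuardian's detection engine works.

```python
import re

# Assumption: AWS access key IDs are 20 uppercase alphanumeric characters
# starting with "AKIA" (long-term keys) or "ASIA" (temporary keys).
# This is a heuristic and will produce some false positives.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line_number, candidate_key_id) pairs found in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID.finditer(line):
            hits.append((lineno, match.group()))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
    for lineno, key_id in find_aws_key_ids(sample):
        print(f"line {lineno}: possible AWS access key ID {key_id}")
```

A key ID alone is not enough to authenticate, but it usually sits next to the matching secret key, so a hit is a strong signal that the surrounding lines deserve a closer look.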
Environment Variables
The attackers’ success relied on misconfigurations in victim organizations that inadvertently exposed their .env files. It did not result from vulnerabilities or misconfigurations in cloud providers’ services. [1]
Attackers targeted hundreds of thousands of companies' domains to detect and download exposed .env files. Some of these files contained AWS credentials that attackers used to gain access to the organizations' infrastructure. They then monetized these credentials in various ways, such as mining cryptocurrency or running phishing campaigns.
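As an illustration of what such a file gives away, the sketch below parses the text of a .env file and lists the variable names that look like credentials, without printing their values. The name pattern and the function `sensitive_env_names` are assumptions made for this example, and real .env parsers handle more syntax (multi-line values, inline comments) than this one does.

```python
import re

# Assumption: variable names containing these words usually hold credentials.
SENSITIVE_NAME = re.compile(
    r"(SECRET|TOKEN|PASSWORD|API_?KEY|ACCESS_KEY)", re.IGNORECASE
)


def sensitive_env_names(dotenv_text: str) -> list[str]:
    """Return names of non-empty variables whose name suggests a credential."""
    names = []
    for raw in dotenv_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        name, value = line.split("=", 1)
        name = name.strip()
        value = value.strip().strip("'\"")
        if value and SENSITIVE_NAME.search(name):
            names.append(name)
    return names


if __name__ == "__main__":
    example = "APP_ENV=production\nAWS_SECRET_ACCESS_KEY=abc123\n"
    print(sensitive_env_names(example))
```

Running a check like this against files in a web root before deployment is one way to notice that a .env file is about to be served publicly.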
Software Build Automation
Between 10-20% of Jenkins servers (...) were misconfigured. Almost all of the misconfigured instances also leaked one or more (...) sensitive information. [2]
Jenkins requires credentials to integrate with various external systems and services essential to the software development lifecycle. As a result, misconfigured Jenkins servers can expose highly sensitive assets, including core repositories, cloud infrastructure, and container registries.
Logs
(...) super-linter creates a log file with lots of details, including environment variables. CI/CD pipelines usually contain secrets loaded as environment variables — GitHub tokens included (...) [3]
Logs are essential for gathering insights into system behavior and diagnosing issues. However, they often inadvertently record sensitive information, such as API keys, passwords, or other secrets used by applications. When these logs are not properly secured, they can become a goldmine for attackers, exposing critical secrets that could lead to larger security incidents.
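One common mitigation is to redact secrets before a log line is written. The sketch below uses Python's standard `logging.Filter` to do that; the two patterns and the names `redact` and `RedactSecrets` are illustrative assumptions, and a production setup would need a much broader pattern set.

```python
import logging
import re

# Assumption: (pattern, replacement) pairs covering two common cases only.
REDACTIONS = [
    # Candidate AWS access key IDs.
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), "[REDACTED]"),
    # key=value pairs whose key suggests a credential; the key is kept.
    (re.compile(r"\b(password|token|api_key)=\S+", re.IGNORECASE), r"\1=[REDACTED]"),
]


def redact(message: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern, replacement in REDACTIONS:
        message = pattern.sub(replacement, message)
    return message


class RedactSecrets(logging.Filter):
    """Logging filter that rewrites each record's message in place."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as arguments are covered.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # Never drop the record, only sanitize it.


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(RedactSecrets())
    logger = logging.getLogger("app")
    logger.addHandler(handler)
    logger.warning("login failed for password=%s", "hunter2")
```

Attaching the filter to the handler rather than to a single logger means records propagated from child loggers are sanitized as well. Redaction is a safety net, not a substitute for keeping secrets out of log calls in the first place.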
Binaries
SolarWinds left hardcoded credentials in its Web Help Desk product that can be used by remote, unauthenticated attackers to log into vulnerable instances, access internal functionality, and modify sensitive data. [4]
Reverse engineering techniques allow for a deep understanding of how a program operates, and in some cases, can reconstruct code that closely resembles the original source. This can reveal the underlying logic, algorithms, and even credentials stored in a binary.
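Full reverse engineering is often unnecessary: hardcoded credentials are frequently visible as plain strings. The sketch below mimics the idea behind the Unix `strings` utility by extracting runs of printable ASCII from a binary blob and keeping those that look like credential assignments. The minimum run length, the hint pattern, and the function name `suspicious_strings` are assumptions chosen for illustration.

```python
import re

# Runs of at least 6 printable ASCII bytes (space through tilde).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")

# Assumption: a credential-like word followed by "=" or ":" is worth flagging.
CREDENTIAL_HINT = re.compile(
    rb"(password|passwd|secret|api_key|token)\s*[=:]", re.IGNORECASE
)


def suspicious_strings(blob: bytes) -> list[str]:
    """Return printable strings in the blob that look like credentials."""
    results = []
    for match in PRINTABLE_RUN.finditer(blob):
        candidate = match.group()
        if CREDENTIAL_HINT.search(candidate):
            results.append(candidate.decode("ascii"))
    return results


if __name__ == "__main__":
    # A fabricated byte sequence standing in for a compiled binary.
    fake_binary = b"\x00\x01ELF\x00admin_password=Sup3rS3cret\x00hello world\x00"
    print(suspicious_strings(fake_binary))
```

This only catches secrets stored as contiguous ASCII; obfuscated or wide-character strings need dedicated tooling.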
Ticketing System
Researchers enumerated a list of 812 subdomains and found 689 accessible Jira instances, and found 3,774 public dashboards, 244 projects, and 75,629 issues containing email addresses, URLs, and IP addresses (...) [5]
Users sometimes inadvertently store credentials or sensitive information in Jira comments or issue descriptions. This often happens because sharing information this way is quick and convenient, because users are unaware of the security risks, or because they lack training on proper data handling practices.
Messaging Systems
A hacker group called “NullBulge” says it stole more than a terabyte of Disney’s internal Slack messages and files from nearly 10,000 channels in an apparent protest over AI-generated art. The data allegedly includes every message and file from nearly 10,000 channels, including (...), login credentials (...) [6]
Messaging platforms are often seen as the fastest and most convenient way to share credentials. Unfortunately, this makes systems like Slack attractive targets for attackers, as demonstrated by the Disney leak carried out by the NullBulge group. Recent reports suggest that, after this breach, Disney transitioned to Microsoft Teams to take advantage of its end-to-end message encryption.
Data Storage
A folder in the bucket titled “Secure Store” contains not only configuration files for the Identity API, but also a plaintext document containing the master access key for Accenture’s account with Amazon Web Service’s Key Management Service, exposing an unknown number of credentials to malicious use. [7]
Storage services, like AWS S3 buckets or SharePoint, are commonly used for document storage but can also hold sensitive information, including credentials. Cybercriminals frequently exploit misconfigured buckets to access and expose private data, such as passports, national ID cards, and credit card details.
Disk Images
This research unveils a quarry of sensitive data stored in public AMIs. Digging through each AMI, we managed to collect 500 GB of credentials, private repositories, access keys and more. [8]
On AWS, AMIs (Amazon Machine Images) are disk images used to create virtual servers. Companies commonly use AMIs as standardized, reusable master disk images, ensuring all their virtual servers are consistently configured. However, AMIs may embed credentials, API keys, or private SSH keys, which can inadvertently be left publicly exposed.
Infrastructure as Code
In order for Terraform to know which resources are under its control and when to update and destroy them, it uses a state file named terraform.tfstate by default. However, Terraform state files contain all data in plain text, which may contain secrets. [9]
Terraform uses a state file, named terraform.tfstate by default, to track which resources it manages. However, these state files store all data in plain text, potentially including sensitive information like secrets. Storing these state files in public or shared repositories exposes secrets that attackers can discover and use.
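Because a state file is plain JSON, checking it for secrets takes very little code. The sketch below walks a parsed state document and reports the paths of attributes whose names suggest a credential. The key-name pattern, the function `find_sensitive_paths`, and the sample document (modeled loosely on the version 4 state layout of `resources`, `instances`, and `attributes`) are assumptions for illustration, not an official Terraform API.

```python
import json
import re

# Assumption: attribute names containing these words usually hold secrets.
SENSITIVE_KEY = re.compile(r"(password|secret|token|private_key)", re.IGNORECASE)


def find_sensitive_paths(node, path=""):
    """Recursively collect paths of non-empty string values with suspicious keys."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if isinstance(value, str) and value and SENSITIVE_KEY.search(key):
                hits.append(child)
            else:
                hits.extend(find_sensitive_paths(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_sensitive_paths(item, f"{path}[{index}]"))
    return hits


if __name__ == "__main__":
    # Fabricated, minimal state document used only for this example.
    state = json.loads(
        '{"version": 4, "resources": [{"type": "aws_db_instance", "name": "main",'
        ' "instances": [{"attributes": {"username": "admin", "password": "hunter2"}}]}]}'
    )
    for hit in find_sensitive_paths(state):
        print(f"possible secret at {hit}")
```

The function reports paths rather than values so that the scan output does not itself become another place where the secret is stored.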
Conclusion
In summary, the message is clear: secrets sprawl well beyond code. From code repositories in development environments to infrastructure tools like Terraform, to services like build systems and ticketing platforms, we have described 9 critical areas where secrets can leak. Detecting and fixing these leaks quickly remains critical to minimizing exposure and to acting before attackers turn them into larger security incidents.
At GitGuardian, we’ve expanded our detection coverage significantly, but our work isn’t done. We’re continuously developing solutions to protect a broader range of tools and minimize your organization’s attack surface.
Your feedback plays a crucial role in shaping our efforts. If your team is using any tools that could be vulnerable to secret leaks, we’d love to hear about them.
Finally, remember that following security best practices is the most effective way to prevent secret exposure. Stay proactive, stay secure!