GitGuardian at AppSec Village: Honeytokens for the blue team

You might have already heard that GitGuardian went to RSA Conference 2023. Aside from all the excitement of the sessions, meeting the thousands of people at our booth in the Expo Hall, and all the amazing networking afterparties, the GitGuardian team was also part of the RSA Sandbox experience. We were excited to sponsor this interactive part of the conference that let participants get first-hand experience with a number of innovative technologies.

Specifically, GitGuardian sponsored and was part of the AppSec Sandbox, one of 8 different sandboxes, which you might hear referred to as villages at events like BSides or other hacker conferences. These other sandboxes included the Adversary Sandbox, Dark Arts Sandbox, LockPick Sandbox, and the IoT Sandbox, among others.

Over the course of three different sessions on Tuesday, Wednesday, and Thursday of RSA, we led dozens of people through a blue team exercise that demonstrated the power of using honeytokens for leakage detection. We were delighted to help so many people add this concept to their toolkits for cybersecurity defense. Participants even got a nifty badge for participating.

Badgelife #RSA #RSAC2023 Visit the Sandbox area to learn more! pic.twitter.com/wF1ipRMgBD
— David Batz @davbatz@mastodon.social 🌻🇺🇦 (@DavBatz) April 26, 2023

What is the AppSec Sandbox?

The AppSec Sandbox at RSA is one of two events managed each year by AppSec Village, an organization dedicated to "AppSec-focused CTFs, contests that challenge your mind and your skillz, and more." The other major event they operate at is Black Hat.

Their mission is "to promote diverse voices and perspectives in an inclusive environment driven for and by the AppSec community to increase education and awareness of application security methods and practices."

Starting the last day of #RSAC2023 strong with a full house at @GitGuardian's "Hunt the Hacker" activity. @RSAConference in the #RSACSandox pic.twitter.com/RCu9zHjfHw
— AppSec Village (@AppSec_Village) April 27, 2023

The GitGuardian sandbox exercise

Our exercise simulated being on the 'blue team,' which NIST defines as a "group responsible for defending an enterprise's use of information systems by maintaining its security posture against attackers." Though our scenario was fictitious, it was based on real-world research on how security teams leverage cyber deception tools to find and stop leaks.

In our scenario, we were working for BigCorp, a made-up entity that had multiple repositories in a self-hosted GitLab instance. An attack script fired once a minute, and its log entries revealed that it was successfully getting data from some of our repos. Your mission was to find which repos were in play, identify who or what was causing the breach, and cut off access. After that, you needed to clean up the affected repos.

We provided participants with a CLI tool, written specifically for this exercise, that combined various real-world tools' functionality into one interface. Based on the help menu, each person found their way to accomplish the mission and save BigCorp.

Given that we had only so many available machines, we ran this as a group exercise, where we asked the crowd what we should do next based on the tools and information at every stage. While only one person entered commands in the terminal, the rest of each assembled blue team collaborated much the same way you might in a real team setting, sharing opinions and bouncing ideas off each other.

Honeytokens to find the leaks

The first command we found in our CLI toolkit was honeytoken. This command generated a decoy AWS key, just like you can do with GitGuardian Honeytoken. The key did not grant any access, but if used, would show up in the logs we were monitoring. Each group figured out that they needed to place the generated honeytoken credential into each of the five available private repos on GitLab.
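To make the idea of a decoy credential concrete, here is a minimal Python sketch that produces a string shaped like an AWS key pair. The function name `make_decoy_aws_key` is hypothetical and is not the exercise CLI or the GitGuardian Honeytoken API. Note the important limitation: a random string like this only becomes a honeytoken when something is watching for its use, which is the part a real honeytoken service provides.

```python
import secrets
import string

# Hypothetical helper for illustration only. The output merely LOOKS like an
# AWS key pair: it grants no access and, on its own, triggers no alerts.
KEY_ID_ALPHABET = string.ascii_uppercase + string.digits
SECRET_ALPHABET = string.ascii_letters + string.digits + "+/"


def make_decoy_aws_key() -> dict[str, str]:
    """Return a random key ID (20 chars) and secret (40 chars) in AWS-like shape."""
    key_id = "AKIA" + "".join(secrets.choice(KEY_ID_ALPHABET) for _ in range(16))
    secret = "".join(secrets.choice(SECRET_ALPHABET) for _ in range(40))
    return {"aws_access_key_id": key_id, "aws_secret_access_key": secret}


decoy = make_decoy_aws_key()
print(f"aws_access_key_id = {decoy['aws_access_key_id']}")
print(f"aws_secret_access_key = {decoy['aws_secret_access_key']}")
```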

Each blue team had slightly different ideas on where the best place to put this decoy credential might be. The best part was each one was a correct answer! With honeytokens, you can add them to brand new files with real-sounding names, like file.c or bucket.py, or add them to existing files, for example, as an extra available variable in a .yml file.

The reality is there are very few wrong places to put honeytokens. Attackers do not manually look through files for credentials in most attacks. Instead, attackers are using scanning tools to quickly find hardcoded credentials, so the file names, or even how the decoy credentials are inserted into the file, don't need to be too convincing. One exception would be .txt and .md files, as some scanning tools outright skip those since they most often just contain example keys or passwords and don't affect production systems.
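As a rough illustration of that one exception, the following Python sketch filters a list of candidate file paths and drops the .txt and .md files that some scanners skip. The function name and the sample paths are made up for this example.

```python
from pathlib import PurePosixPath

# File types that, per the discussion above, some scanning tools skip.
SKIPPED_SUFFIXES = {".txt", ".md"}


def good_honeytoken_locations(paths: list[str]) -> list[str]:
    """Keep only the paths an attacker's scanner is likely to inspect."""
    return [
        path for path in paths
        if PurePosixPath(path).suffix.lower() not in SKIPPED_SUFFIXES
    ]


# Hypothetical repository contents.
candidates = ["README.md", "config/deploy.yml", "bucket.py", "notes.txt", "file.c"]
print(good_honeytoken_locations(candidates))
# → ['config/deploy.yml', 'bucket.py', 'file.c']
```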

Automation for the win

Very quickly, each team realized that manually inserting each token was going to take quite a while. Fortunately, a quick look at the local directory uncovered a honeytoken-loop.py script that could generate the honeytokens, create a new file and then step through the insertion and git process needed to get these to all our repos in a matter of seconds. Thinking about honeytokens as one-off items can be effective when dealing with a single repo, but like with all other aspects of security, automation is your friend at scale. This was a driver behind making GitGuardian Honeytoken available through our API.
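The actual honeytoken-loop.py is not reproduced here. What follows is only a sketch, in Python, of what such a loop might look like, assuming each repository is already cloned locally and gets its own distinct decoy key. All function names are hypothetical, and the default `dry_run=True` means nothing is written or pushed unless you opt in.

```python
import subprocess
from pathlib import Path


def git_commands(filename: str, message: str) -> list[list[str]]:
    """Return the git invocations needed to publish one new file."""
    return [
        ["git", "add", filename],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def plant_honeytoken(repo_dir: Path, filename: str, key_id: str, secret: str,
                     dry_run: bool = True) -> list[list[str]]:
    """Write a decoy credentials file into repo_dir and push it.

    With dry_run=True (the default) nothing is written or executed;
    the planned git commands are only returned for inspection.
    """
    content = (
        f'AWS_ACCESS_KEY_ID = "{key_id}"\n'
        f'AWS_SECRET_ACCESS_KEY = "{secret}"\n'
    )
    commands = git_commands(filename, "Add storage settings")
    if not dry_run:
        (repo_dir / filename).write_text(content)
        for command in commands:
            subprocess.run(command, cwd=repo_dir, check=True)
    return commands


def plant_everywhere(tokens_by_repo: dict[Path, tuple[str, str]],
                     filename: str = "bucket.py",
                     dry_run: bool = True) -> None:
    """Plant one distinct decoy per repository so a hit identifies the repo."""
    for repo_dir, (key_id, secret) in tokens_by_repo.items():
        plant_honeytoken(repo_dir, filename, key_id, secret, dry_run=dry_run)
```

Using a different decoy in each repository is the design choice that matters: when a key is used, the key itself tells you where it was stolen from.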

Once the honeytokens were inserted, the next attack by our script triggered 3 of them. Just as important as which honeytokens were tripped, our logs also told us exactly which three repositories were compromised. This is one of the real powers of this tech: to quickly inform you of someone getting at your secrets, allowing you to cut dwell times down to minutes.
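The lookup from a tripped honeytoken back to a repository can be sketched in a few lines of Python. The log format, key IDs, and repository names below are invented for illustration; the only assumption that matters is that each repo received a different decoy key and that the key ID appears in the log entry.

```python
def compromised_repos(log_lines: list[str], token_to_repo: dict[str, str]) -> list[str]:
    """Return the repos whose decoy key ID shows up anywhere in the logs."""
    hit = set()
    for line in log_lines:
        for key_id, repo in token_to_repo.items():
            if key_id in line:
                hit.add(repo)
    return sorted(hit)


# Hypothetical mapping recorded when the decoys were planted.
planted = {
    "AKIAAAAAAAAAAAAAAAA1": "billing-api",
    "AKIAAAAAAAAAAAAAAAA2": "web-frontend",
    "AKIAAAAAAAAAAAAAAAA3": "data-pipeline",
}

# Hypothetical log entries in a made-up format.
logs = [
    "2023-04-26T10:15:02Z honeytoken used key=AKIAAAAAAAAAAAAAAAA1",
    "2023-04-26T10:15:03Z honeytoken used key=AKIAAAAAAAAAAAAAAAA3",
]

print(compromised_repos(logs, planted))
# → ['billing-api', 'data-pipeline']
```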

Removing the attacker's access

Now that the logs told us exactly which repos to focus on, the next step was finding who or what exactly might be causing the issue. Using our CLI, we were able to list the GitLab users in our org and immediately noticed two strange things. First, we had a user with very recent activity in two of the three repos. Second, the remaining compromised repo was not accessed by a person at all but instead by two access tokens, which are used to grant access to services and APIs.

Throughout the three days we ran this exercise, every team wanted to go after the suspicious human first. Checking the admin interface of GitLab, we quickly discovered that this suspicious person was marked as 'external,' adding to our hunch this might be our culprit. An external actor can gain access through any number of ways; perhaps they were a subcontractor or, as was the case with CircleCI, an attacker compromised a remote employee's access and granted themselves access. Every group made the call to block this suspicious character, and suddenly, our attack script was less successful on the next run; this external user was indeed part of our breach.

The access token situation took a little more thought. There were two tokens present. While your gut instinct might be to revoke 'all the things,' this is not really a great idea in the real world, as you do not want to break anything in production unnecessarily and have devs and ops mad at you. Fortunately, we were able to use timestamps to easily tell which access token was used when the honeytoken was triggered. Revoking just that one meant our attack script was no longer able to access our repos! Success for this part…but we are not quite done yet.
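The timestamp comparison can be sketched as follows in Python. The token names and times are hypothetical, and the one-minute window is an assumption taken from the fact that the attack script ran once a minute.

```python
from datetime import datetime, timedelta


def tokens_active_near(trigger_time: datetime,
                       token_last_used: dict[str, datetime],
                       window: timedelta = timedelta(minutes=1)) -> list[str]:
    """Return the tokens whose last use falls within `window` of the trigger."""
    return [
        name for name, used_at in token_last_used.items()
        if abs(trigger_time - used_at) <= window
    ]


# Hypothetical data: when the honeytoken fired and when each token was last used.
honeytoken_triggered = datetime(2023, 4, 26, 10, 15, 2)
last_used = {
    "ci-deploy": datetime(2023, 4, 26, 10, 15, 0),
    "backup-bot": datetime(2023, 4, 26, 8, 2, 11),
}

print(tokens_active_near(honeytoken_triggered, last_used))
# → ['ci-deploy']
```

Only the token that lines up with the trigger gets revoked; the other one keeps working for whatever legitimate service depends on it.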

Cleaning up our secrets

The last part of the exercise focused on the reality that even though we cut off access, the attacker(s) still had copies of all the code from those repositories. Who knows where they shared those files? The next step was to find out what exactly was in those files.

Rather than open up each file in each repo and read through the source code, something that would be impossible to do if hundreds of repos were at play, we can use known attacker standard operating procedures against them. Attackers scan for secrets in code they access because, at the end of the day, most bad actors are motivated by money. What they really want is your data to ransom or sell and/or your machine resources so they can mine cryptocurrency. The fastest way to acquire these resources is by finding the keys hardcoded in plaintext in your environment.

The blue teams used the provided local scanning tool and indeed found hardcoded credentials in the source code that had been compromised. Almost every single participant's first instinct was to immediately rotate those secrets.
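To give a feel for what such a scan does, here is a deliberately tiny Python sketch that looks for one well-known pattern, the shape of an AWS access key ID. This is not the exercise's scanning tool or ggshield; real scanners combine many detectors with additional checks. The sample source text is invented, and the key in it is the placeholder value AWS uses in its own documentation.

```python
import re

# AWS access key IDs are 20 characters and commonly start with AKIA (long-term)
# or ASIA (temporary). This single pattern is only an illustration.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched key ID) pairs found in the text."""
    return [
        (lineno, match.group(0))
        for lineno, line in enumerate(text.splitlines(), start=1)
        for match in AWS_KEY_ID.finditer(line)
    ]


# Hypothetical file contents.
source = 'bucket = "reports"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_aws_key_ids(source))
# → [(2, 'AKIAIOSFODNN7EXAMPLE')]
```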

Opening the affected files revealed a hint that the security team had left in code comments. Instead of just pasting in the new credential, a helpful message explained that all credentials should be accessed through the GPG-based vault tool BigCorp was using.

A quick note: we chose to build a demo vault tool for this sandbox exercise for the sake of simplicity, but it was very heavily influenced by HashiCorp's Vault, where secrets are properly encrypted while stored and are accessed at runtime by calling the credentials programmatically. This method never exposes your credentials anywhere publicly. It is one of many such solutions on the market, among other offerings from Doppler and Akeyless.

Once the credentials were properly rotated and the code modified to call the vault rather than hardcode them in plaintext, we could celebrate!
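The code change itself is small. The interface of the demo vault tool is not described here, so the Python sketch below uses environment variables as a stand-in for "fetch the secret at runtime"; the function name `get_secret` and the variable name are hypothetical. The point is only that the source file no longer contains the credential.

```python
import os

# Before (what the scan found): a credential pasted into the source file.
# AWS_SECRET_ACCESS_KEY = "...hardcoded value..."


def get_secret(name: str) -> str:
    """Fetch a secret at runtime instead of reading it from source code.

    The environment is a stand-in here; a vault client call would go in its place.
    """
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(f"secret {name!r} is not available at runtime") from None


# After: the code asks for the secret by name when it runs.
# aws_secret = get_secret("AWS_SECRET_ACCESS_KEY")
```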

In about 20 minutes, we went from not knowing which repos were affected to kicking out the attackers. We also took the needed steps to ensure any exfiltrated secrets were rendered useless to anyone who possessed them. And we even fixed our code to make sure future attackers would not find anything they could exploit!

Honeytokens for intrusion detection

While this was a sandbox exercise, it reflects how important a role honeytokens can play in your security posture in the real world. Elsewhere at RSA, in the keynote "The State of Cybersecurity – Year in Review," Mandiant/Google Cloud CEO Kevin Mandia included honeytokens as a must-have for detecting intruders when other products don't stop them. He referred to them as "account bait," which you can see in the linked video and in the tweet below. We could not agree more.

Google just said to make sure and have Honeytokens in place for your stack security. @Gitguardian makes that easy. Come talk to us at our booth on the far north end of the Expo hall at #RSAC #RSAC2023 pic.twitter.com/mmUAIKm5WH
— @mcdwayne@mastodon.social Dwayne McDaniel (@McDwayne) April 26, 2023

If you are already a GitGuardian customer, you can get access to Honeytokens right now by finding the new Honeytoken icon in your dashboard or by contacting us for access. If you still need to sign up for GitGuardian Secrets Detection, you can sign up now for free to get a handle on where hardcoded credentials appear in your code. Your account will also give you access to ggshield, our CLI that can quickly find secrets in your local repos and help you keep secrets out of your code by catching them when you try to commit them.

No matter what team you work on in your company, GitGuardian is here to help you stay safe and now, with GitGuardian Honeytoken, to help you fight back against attackers faster!