Note - This article was originally published on The Hacker News on February 23, 2023
A few years ago, a Washington-based real estate developer received a document link from First American – a financial services company in the real estate industry – relating to a deal he was working on. Everything about the document was perfectly fine and normal.
The odd part, he told a reporter, was that if he changed a single digit in the URL, suddenly, he could see somebody else's document. Change it again, a different document. With no technical tools or expertise, the developer could retrieve FirstAm records dating back to 2003 – 885 million in total, many containing the kinds of sensitive data disclosed in real estate dealings, like bank details, social security numbers, and of course, names and addresses.
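The flaw described here is often called an insecure direct object reference: the server returns whatever record the ID in the URL points to, without checking who is asking. The following is a minimal sketch of that pattern and of the missing check, not a reconstruction of First American's system. The `Document` type, the `DOCUMENTS` store, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: int
    owner: str
    body: str


# Hypothetical in-memory store keyed by sequential, guessable IDs.
DOCUMENTS = {
    1001: Document(1001, "alice", "Closing statement for Alice"),
    1002: Document(1002, "bob", "Wire instructions for Bob"),
}


def fetch_insecure(doc_id: int) -> Document:
    # Vulnerable pattern: any caller who guesses an ID gets the record.
    return DOCUMENTS[doc_id]


def fetch_with_owner_check(doc_id: int, requester: str) -> Document:
    # Safer pattern: confirm the requester owns the record before returning it.
    # Missing and forbidden documents raise the same error, so the response
    # does not reveal which IDs exist.
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc.owner != requester:
        raise PermissionError("document not available to this user")
    return doc
```

With the first function, changing 1001 to 1002 is enough to read another customer's file. With the second, the same request fails unless the requester is the owner.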
That nearly a billion records could leak from so simple a web vulnerability seemed shocking. Yet even more severe consequences befall financial services companies every week. Verizon, in its most recent Data Breach Investigations Report, revealed that finance is the single most targeted industry worldwide when it comes to basic web application attacks. And according to Statista, successful breaches cost these companies an average of around six million dollars apiece. The IMF has estimated that industry-wide losses from cyberattacks "could reach a few hundred billion dollars a year, eroding bank profits and potentially threatening financial stability."
In response, executives are allocating millions more every year to sophisticated defense systems – XDR, SOCs, AI tools, and more. But while companies fortify against APTs and mature cybercriminal operations, security holes as rudimentary as FirstAm's remain rampant across the industry.
There's one category of vulnerability, in particular, that rarely comes up in boardroom discussions. Once you start looking, though, you'll find it nearly everywhere. And it's far easier for hackers to discover and pounce on this kind of error than to exploit a zero-day, craft a deepfake, or run a spear-phishing campaign.
A Vulnerability Everybody's Overlooking
In 2019, three researchers from North Carolina State University tested a hypothesis commonly understood but not often discussed in cybersecurity.
GitHub and other source code repositories, the story goes, have caused a boom for the software industry. They allow talented developers to collaborate around the world by donating, taking, and combining code into newer, better software built faster than ever before. To let these different pieces of code work together, developers use credentials – secret keys, tokens, and so on. These connecting joints allow one piece of software to open its door to another. To prevent attackers from getting through the same way, the credentials are protected behind a veil of security.
Or are they?
Between October 31, 2017, and April 20, 2018, the NCSU researchers analyzed over two billion files from over four million GitHub repositories, representing around 13 percent of everything on the site. Contained in those samples were nearly 600,000 API and cryptographic keys – secrets embedded right in the source code for anybody to see. Over 200,000 of those keys were unique, and they were spread across more than 100,000 repos in all.
Though the study accumulated data over six months, a few days – even a few hours – would have sufficed to make the point. The researchers highlighted how thousands of new secrets leaked every day of their study.
Recent research has not only supported their findings but also taken them a step further. For example, in the 2022 calendar year alone, GitGuardian identified ten million secrets published to GitHub – more than five for every 1,000 commits.
At this point, one might wonder whether secret credentials contained ("hardcoded") in source code are really so bad if they're so common. Safety in numbers, right?
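For readers who have not seen one, a hardcoded credential is simply a secret written as a literal in the source. The sketch below contrasts it with one common alternative, reading the secret from the environment at runtime. The variable name `PAYMENTS_API_KEY`, the function `load_api_key`, and the placeholder value are assumptions made for illustration.

```python
import os

# Anti-pattern: the secret travels with every clone, fork, and backup of the
# repository, and stays in version history even after it is deleted.
HARDCODED_API_KEY = "example-not-a-real-key"


def load_api_key(env_var: str = "PAYMENTS_API_KEY") -> str:
    # Alternative: keep the secret out of the source and read it at runtime.
    # Failing loudly when it is absent avoids silently running without it.
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key
```

An environment variable is only one option. A dedicated secrets manager serves the same purpose, and either way the key never lands in a commit.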
The Danger of Hardcoded Credentials
Hardcoded credentials seem like a theoretical vulnerability until they make their way into a live application.
Last fall, Symantec identified nearly 2,000 mobile apps exposing secrets. Over three-quarters leaked AWS tokens, enabling outside parties to access private cloud services, and nearly half leaked tokens that further enabled "full access to numerous, often millions, of private files."
To be clear, these were legitimate, public applications used around the world today. Take the five banking apps Symantec found, all using the same third-party SDK for digital identity authentication. Identification data is some of the most sensitive information apps possess, but this SDK leaked cloud credentials that "could expose private authentication data and keys belonging to every banking and financial app using the SDK." It didn't end there, since "users' biometric digital fingerprints used for authentication, along with users' personal data (names, dates of birth, etc.), were exposed in the cloud." In all, the five banking apps leaked over 300,000 of their users' biometric fingerprints.
If these banks have escaped compromise, they're lucky. Similar leaks have taken out even bigger fish before.
Like Uber. You'd imagine that only highly organized and talented cyber adversaries could breach a technology company of Uber's standing. In 2022, however, a 17-year-old managed to do it all on his own. After some light social engineering led him into the company's internal network, he located a PowerShell script containing admin-level credentials for Uber's privileged access management system. That's all he needed to then compromise all sorts of downstream tools and services used by the company, from its AWS account to its Google Drive, Slack, employee dashboards, and code repos.
This might have been a more remarkable story had it not been for the other time Uber lost secrets to hackers in a 2016 private repo breach that exposed data belonging to over 50 million customers and seven million drivers. Or the other time they did it, through a public repo, in 2014, revealing the personal information of 100,000 drivers along the way.
What to Do
Finance is the single most targeted sector for cyber attackers worldwide. And every researcher who dredges up thousands of vulnerable apps or millions of vulnerable repos demonstrates just how simple it would be for attackers to identify hardcoded credentials in the code essential to running any modern company in this industry.
But just as easily as the bad guys could do it, so too could the good. Both AWS and GitHub themselves attempt, as best they can, to monitor for leaky credentials on their platforms. Clearly, those efforts aren't enough on their own, which is where a cybersecurity vendor steps in.
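To give a sense of how low the bar is on both sides, here is a minimal sketch of a source-code secret scanner. It assumes just two illustrative rules: the widely documented `AKIA` prefix format of AWS access key IDs, and a generic "name = quoted string" pattern. The names `RULES`, `Finding`, and `scan_text` are hypothetical, and production scanners rely on far larger rule sets plus extra checks to cut false positives.

```python
import re
from typing import Iterator, NamedTuple


class Finding(NamedTuple):
    line_no: int
    rule: str
    match: str


# Illustrative rules only; real tools ship hundreds of provider-specific ones.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?i)(?:api_?key|secret|token|password)\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
}


def scan_text(text: str) -> Iterator[Finding]:
    # Check every line against every rule and report the first hit per rule.
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            found = pattern.search(line)
            if found:
                yield Finding(line_no, rule, found.group(0))
```

Run over each file in a commit, for example from a pre-commit hook or a CI job, a check like this can flag a key before it ever reaches a shared repository.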