Protecting Your Software Supply Chain: Understanding Typosquatting and Dependency Confusion Attacks
Keshav Malik
Keshav is a full-time Security Engineer who loves to build and break stuff. He is constantly looking for new and exciting technologies and enjoys working with diverse technologies in his spare time. He loves music and plays badminton whenever the opportunity presents itself.
Software supply chain attacks occur when hackers manipulate the software development process to achieve their goals. Attackers often successfully breach a third-party service to implant harmful code into legitimate software or updates. This malicious code is then spread to unsuspecting end users during routine software installation or updates.
While these types of compromises are not new, the modularization of software supply chains and the increasing dependence on open-source software in development activities have propelled them into the spotlight in recent years. High-profile incidents such as Log4Shell and Codecov have underscored the critical importance of securing software supply chains, particularly the open-source ecosystem. The ecosystem's openness, which lets anyone publish and consume packages, is a key attraction for those with harmful intentions.
Additionally, the corporate world's growing embrace of open-source software over the past ten years further enhances its appeal. It's estimated that over 85% of enterprises now utilize some form of open-source software. As a result, it's no surprise that open source has become a prime target for malicious actors, regardless of their motives.
This article aims to demystify two types of attacks that exploit open-source package repositories like npm, PyPI, Maven, and others: typosquatting and dependency confusion. We want to equip engineering managers and security practitioners with knowledge of these attacks so they can secure their infrastructure with informed vigilance and proactive measures, ultimately contributing to a safer digital universe for us all.
Tip: Dive into our in-depth software supply chain security guide to gain a robust understanding of the field.
Understanding Supply Chain Attacks
Supply chain attacks are complex and can take different forms. By targeting various elements within a supply chain, attackers can exploit vulnerabilities to gain entry to valuable data, intellectual property, or sensitive systems. Below, we explore these attacks step-by-step and present real-world examples to illustrate each stage.
The Anatomy of a Supply Chain Attack
Supply chain attacks typically follow these stages:
- Target Identification: Attackers identify weak links in a supply chain, such as third-party providers with inadequate protection or commonly utilized open-source packages that can be leveraged as an attack vector.
- Exploitation: Through careful reconnaissance, attackers find vulnerabilities in the software, hardware, or human elements of the chosen component.
- Infiltration: Attackers use the identified vulnerabilities to penetrate the target systems and embed malicious code.
- Propagation: The attack can spread laterally through interconnected elements of the supply chain, giving attackers further reach and control.
- Execution: Attackers then carry out their primary objective, whether data theft, system disruption, or some other malicious purpose.
Understanding Real-World Examples
Looking at these stages at play in real-life settings can provide invaluable insights:
Codecov Breach (2021):
- Target Selection: The attackers chose Codecov, a code coverage tool used in continuous integration by many organizations worldwide, as their strategic target.
- Infiltration & Propagation: By exploiting an error in Codecov's Docker image creation process, the attackers obtained a credential that let them modify the Bash Uploader script, poisoning an artifact used by downstream users.
- Execution: The modified script silently transmitted the environment variables found in customers' Continuous Integration environments, including secrets, to a remote server. This incident underscores the necessity of keeping Git repositories free of secrets and using only non-production credentials in CI environments whenever possible.
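Because the Codecov incident revolved around CI environment variables being sent to a remote server, one complementary habit is to hand third-party steps only the variables they need. The following is a minimal sketch of that idea, not a description of how Codecov or any CI product works; the allowlist contents and the helper names `minimal_env` and `run_untrusted_step` are assumptions made for illustration.

```python
import os
import subprocess

# Hypothetical allowlist: only the variables the third-party step really needs.
ALLOWED_ENV = {"PATH", "HOME", "LANG"}


def minimal_env(source=None, allowed=ALLOWED_ENV):
    """Return a copy of the environment restricted to allowlisted names."""
    source = os.environ if source is None else source
    return {key: value for key, value in source.items() if key in allowed}


def run_untrusted_step(cmd, env=None):
    """Run a third-party command without exposing the full CI environment."""
    return subprocess.run(
        cmd,
        env=minimal_env(env),  # secrets outside the allowlist are never passed
        capture_output=True,
        text=True,
        check=False,
    )
```

With this approach, a compromised uploader script that dumps its environment would only see the allowlisted variables rather than every secret configured for the pipeline.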
Event Stream Breach:
- Target Identification: The attacker identified "event-stream", a popular npm package, as an attractive target.
- Infiltration and Propagation: After gaining publishing rights to event-stream from its original maintainer, the attacker added a malicious module (flatmap-stream) as a dependency, which targeted downstream users.
- Execution: The breach demonstrated how attackers could utilize compromised packages within popular open-source libraries as delivery vectors for malware distribution.
Supply chain attacks have surged significantly, underscoring the sobering reality that no entity is invincible. The modular architecture that forms the backbone of application development today provides an ideal platform for attackers. It allows them to target not just one but hundreds, even thousands of organizations simultaneously. This paradigm shift has entirely flipped the economic model for these attackers. Now, they can allocate a larger pool of resources towards breaching the supply chain rather than focusing on a single company.
Let's now look in detail at two attack tactics that seek to exploit the popularity of certain open-source packages.
Focus on Two Types of Supply Chain Attacks
Typosquatting
Typosquatting (also referred to as package typosquatting in software supply chains) is a tactic capitalizing on typographical mistakes made by developers when pulling packages from public software registries. These attacks have vast and potentially severe ramifications that emphasize the necessity of strong validation mechanisms within package registries.
How does it work?
Package typosquatting takes advantage of typos, spelling mistakes, or any other kind of human error when naming or requesting packages. Here's a detailed look at how it unfolds:
- Malicious Package Publication: Attackers publish malicious packages in public software registries like npm or PyPI, using names that are slight misspellings or variations of popular legitimate packages, aiming to trap developers or automated systems that mistakenly request the wrong package.
- Mimicking Legitimate Packages: Most of the time, the malicious packages only differ by a few lines of code from the legitimate packages they are impersonating. They expose similar functionality and identical APIs, making it harder for developers to realize the mistake immediately.
- Automated Systems Trap: To get deployed, the malicious package relies on typing errors by engineers. Even if it is installed on only a single workstation, it can try to steal credentials or other sensitive data for later use. The worst case, though, is when the misspelled package name finds its way into an automated script: every time the package gets pulled is a new opportunity to compromise the build or deployment process. It also means the attacker can progressively enhance the malicious package to increase its capabilities.
- Execution of Malicious Activities: Once the malicious package is integrated into a software project, it can perform a range of harmful actions, including stealing sensitive data, inserting backdoors, or facilitating further compromises within the network.
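Since the attack depends on a requested name being a near miss of a legitimate one, a simple pre-install check can flag names that closely resemble, but do not equal, a package the team actually uses. The sketch below uses the standard library's `difflib.SequenceMatcher`; the `KNOWN_PACKAGES` list, the `closest_known` helper, and the 0.8 threshold are illustrative assumptions, and a real check would need tuning to limit false positives.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of packages the team actually intends to use.
KNOWN_PACKAGES = {"requests", "numpy", "pandas", "urllib3"}


def closest_known(name, known=KNOWN_PACKAGES, threshold=0.8):
    """Return the known package that `name` suspiciously resembles, or None.

    An exact match returns None: there is nothing to flag in that case.
    """
    name = name.lower()
    if name in known:
        return None
    best, best_score = None, 0.0
    for candidate in sorted(known):  # sorted for deterministic tie-breaking
        score = SequenceMatcher(None, name, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


if __name__ == "__main__":
    for requested in ("requests", "reqeusts", "flask"):
        lookalike = closest_known(requested)
        if lookalike:
            print(f"{requested!r} looks like a typo of {lookalike!r}")
```

A check like this could run in a pre-commit hook or CI step over the names in a manifest file, stopping the build before the misspelled package is ever pulled.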
Impact: The Potential Damage
Typosquatting can cause widespread harm. Let’s understand a few security issues posed by typosquatting attacks.
- Data Breach: Malicious packages could contain code designed to exfiltrate sensitive information, from personal records to proprietary source code, leading to data breaches with devastating implications.
- System Compromise: Malicious packages may install backdoors in otherwise isolated environments, opening the way to further exploits and compromises.
- Reputational Damage: Organizations affected by package typosquatting could experience irreparable reputational harm as it undermines trust in their ability to provide secure software environments.
Prevention and Mitigation
As a consumer of open-source dependencies, there are several steps you can take to prevent downloading malicious packages:
- Verify the package source: Always download packages from trusted sources. Stick to official package repositories or well-known sources that have a good reputation. Avoid downloading packages from unfamiliar or suspicious websites.
- Community feedback and reviews: Check for community feedback, reviews, and ratings of the package before downloading. Look for any reports of suspicious behavior or security issues. Engage with the open-source community to stay informed about potential risks.
- Double-check package names: Pay close attention to the spelling and formatting of package names. Typosquatting attacks often involve slight variations in package names that can be easily overlooked. Verify the package name against the official documentation or repository.
- Verify package integrity: Many package managers provide mechanisms to verify package integrity using checksums or digital signatures. Use these features to ensure that the downloaded package has not been tampered with. Compare the checksum or signature with the official source.
- Stay updated: Keep your package manager and dependencies up to date. Developers often release security patches and bug fixes to address vulnerabilities. Regularly update your packages to ensure you have the latest, secure versions.
- Use package manager security features: Some package managers offer additional security features, such as vulnerability scanning or package verification. Enable these features to add an extra layer of protection against malicious packages.
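As an illustration of the integrity check mentioned above, the following sketch computes a SHA-256 digest of a downloaded archive and compares it with the digest published by the official source. The function names are hypothetical, and the expected digest is assumed to come from a trusted channel. Package managers can do this for you (pip, for instance, has a hash-checking mode enabled with `--require-hashes`), so treat this as a picture of the mechanism rather than a replacement for those features.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True only if the file's digest equals the published one."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

If `matches_published_digest` returns False, the archive differs from what the publisher released and should not be installed.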
Dependency Confusion Attacks
A dependency confusion attack targets organizations or developers who use both public and private package repositories.
How does it work?
Dependency confusion occurs when attackers take advantage of how package managers resolve a package name that can be served by both public and private repositories, in some configurations preferring the public copy. Here's how it works:
- Target Selection: Organizations often use private package repositories to store proprietary or internal packages. Nothing reserves these internal names in popular public repositories like npm, PyPI, or Maven. Attackers who learn the names can publish malicious packages under the same names as the private dependencies used by their target, but hosted in public repositories.
- Infiltration: When developers use package managers like npm, pip, or Maven to install dependencies, the package manager is expected to resolve internal names from the private repository. But if the package is not found there, or the package manager is misconfigured (for example, so that it consults both sources and takes the highest version), it falls back to the public repository. This leads to the unintentional installation of the attacker's package.
- Execution: Once the attacker's package is installed, it can execute malicious actions such as data exfiltration, remote code execution, or backdoor installation, compromising the security of the system.
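The resolution step is where the confusion happens, and a toy model can make it concrete. The sketch below is a deliberately simplified model, not the algorithm of any particular package manager: it assumes a resolver that gathers candidates from every configured index and takes the highest version, and contrasts it with one that prefers the private index. The `Candidate` type and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    index: str      # which index served this release: "private" or "public"
    version: tuple  # simplified version as a tuple of ints, e.g. (1, 4, 2)


def naive_pick(candidates):
    """Pick the highest version, ignoring which index it came from."""
    return max(candidates, key=lambda c: c.version)


def private_first_pick(candidates, private_index="private"):
    """Use the private index whenever it serves the package at all."""
    private = [c for c in candidates if c.index == private_index]
    pool = private or candidates  # fall back only if nothing is private
    return max(pool, key=lambda c: c.version)


candidates = [
    Candidate("private", (1, 4, 2)),  # the real internal release
    Candidate("public", (99, 0, 0)),  # attacker's upload, inflated version
]

if __name__ == "__main__":
    print(naive_pick(candidates).index)          # prints "public"
    print(private_first_pick(candidates).index)  # prints "private"
```

The attacker only needs to publish a higher version number than the internal package for the naive strategy to select the public copy, which is why index priority and scoping matter more than the version itself.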
Real-World Example: A Startling Discovery
Researcher Alex Birsan made a startling discovery regarding dependency confusion in 2021. He found a way to target various high-profile companies by uploading packages to public repositories under the same names as private dependencies used by these businesses, so that their build systems pulled his public packages instead of the internal ones.
- Target: Various high-profile companies.
- Attack: Birsan discovered that he could upload malicious packages to public repositories with the same names as private dependencies used by these companies.
- Impact: By doing so, he demonstrated how an attacker could potentially infiltrate and compromise various corporate systems.
Preventing Dependency Confusion
To prevent dependency confusion attacks from occurring, organizations and developers can take several measures:
- Secure Package Management: Implement strong access controls and authentication mechanisms for private package repositories. Use package locking, pinning, or explicit versioning to ensure trusted installations. Regularly review and update package manager configurations.
- Package Naming and Scanning: Use unique and non-guessable names for internal packages. Scan public repositories for similar or typosquatting packages. Utilize tools for vulnerability scanning and package verification.
- Developer Education and Awareness: Educate developers about dependency confusion risks. Encourage secure coding practices, package source verification, and avoiding blind trust in package managers. Ask developers to report suspicious behavior promptly.
- Continuous Monitoring and Updates: Regularly monitor package dependencies and review sources and integrity. Stay updated with security advisories and vulnerability reports. Keep package managers and dependencies up to date.
- Secure Development Lifecycle: Incorporate security practices into the development lifecycle. Conduct code reviews, static analysis, and penetration testing. Follow secure coding guidelines and perform regular security assessments and audits.
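To illustrate the pinning advice in the list above, here is a small sketch that scans the text of a pip-style requirements file and reports entries that are not pinned to an exact version with `==`. The `unpinned_requirements` helper and the sample file contents are assumptions for illustration; the parsing is intentionally simple and does not handle every requirement syntax (for example, direct URL references).

```python
import re

# A name, optional extras in brackets, then an exact "==" version pin.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and option lines such as --index-url
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    sample = "requests==2.31.0\nnumpy>=1.20\nflask\n"
    for entry in unpinned_requirements(sample):
        print(f"not pinned: {entry}")
```

Running a check like this in CI makes unpinned entries visible before they reach a build, so a surprise high-version upload on a public index cannot silently replace the version the team reviewed.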
Wrap-up
Typosquatting and dependency confusion are two common types of supply chain attacks that exploit open-source package repositories.
Typosquatting involves attackers publishing malicious packages with names that are slightly misspelled or variations of popular legitimate packages, tricking developers into downloading them.
On the other hand, dependency confusion occurs when attackers publish malicious packages under the same names as private dependencies used by organizations, leading developers and build systems to unknowingly install them from public repositories.
To prevent these attacks, it is important to always:
- verify the package source
- check community feedback and reviews
- double-check package names
- verify package integrity with checksums
- stay updated with security patches
- and use package manager security features.
In addition, organizations should take several proactive steps to enhance the security of their software supply chains and protect against typosquatting and dependency confusion attacks. For example, they should establish robust access controls and authentication mechanisms for private package repositories, ensuring that only trusted individuals have access. It is also crucial to educate developers about the risks associated with these types of attacks and promote secure coding practices, such as verifying package sources and avoiding blind trust in package managers. Using unique and non-guessable names for internal packages is also a good strategy, making it more difficult for attackers to mimic them.
Lastly, organizations should integrate security practices into their development lifecycle, conducting code reviews, static analysis, and penetration testing, while also following secure coding guidelines and performing regular security assessments and audits. By adopting these proactive measures, organizations can fortify their software supply chains and create a safer environment for their development activities.