In December 2023, cybersecurity agencies from five nations (the US, UK, Canada, Australia, and New Zealand) jointly published a document called "The Case for Memory Safe Roadmaps." While memory safe programming languages are not GitGuardian's usual topic of discussion, memory safety is an important security issue that's worth understanding.
First, a quick explanation of memory safe vs. memory unsafe programming languages. In memory unsafe languages such as C and C++, the developer is responsible for manually allocating and deallocating memory, which can lead to leaks, dangling pointers, and other bugs. And without automatic bounds checking, programs written in these languages are more vulnerable to buffer overflows and other exploits.
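To make the contrast concrete, here's a minimal sketch in Python, one of the memory safe languages. It isn't taken from the agencies' document; the buffer contents and the read_at helper are made up for illustration. It shows the runtime checking bounds on every access and keeping an object alive for as long as something still refers to it.

```python
# Illustrative sketch only: the buffer and the read_at helper are hypothetical.

def read_at(buffer, index):
    """Return buffer[index], or None if the index is past the end."""
    try:
        return buffer[index]
    except IndexError:
        # The runtime checked the bounds and refused the access,
        # instead of handing back whatever sits past the end of the buffer.
        return None


buffer = bytearray(b"secret")  # 6 bytes, so valid indexes are 0 through 5

in_bounds = read_at(buffer, 0)       # the byte value of "s"
out_of_bounds = read_at(buffer, 64)  # None: the read never happens

# There is no manual free(). A second reference keeps the object alive,
# so removing the first name doesn't leave a dangling pointer behind.
alias = buffer
del buffer
still_valid = bytes(alias)
```

In C++, the equivalent out-of-bounds read through a raw pointer or a vector's operator[] is undefined behavior: it may crash, or it may quietly return data that was never meant to be read.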
Operating systems, device drivers, embedded software, and more are often written in C or C++ to give developers very precise control, get "close to the metal," and run as fast and lean as possible.
Memory safe languages include some of the most popular programming languages in the world: Python, Java, C#, Go, Rust, and Swift. JavaScript, which powers most websites on the front end and also runs on the back end via Node.js, is a mixed bag when it comes to memory safety. It depends on the runtime engine and environment. In the browser especially, poor management of DOM objects can create memory leaks.
Given the speed and tuning abilities of a language like C++, why are all these security agencies recommending moving away from it?
Memory issues are a major area of vulnerability
That may seem like restating the obvious, but about two-thirds of the vulnerabilities identified in memory unsafe languages are related to memory issues. That figure comes from a blog post by the U.S. Cybersecurity and Infrastructure Security Agency (CISA) urging developers to adopt memory safe programming languages.
In real-world numbers, CISA cites Microsoft as stating that around 70% of its CVEs relate to memory issues. The same goes for Google with the Chromium project, which underlies not just the Chrome browser but also Microsoft's Edge, Opera, and more. Mozilla, developer of the Firefox browser, is quoted as stating that 94% of its critical/high-rated vulnerabilities were memory related.
Memory safe programming languages are more than "good enough"
In 2022, the Linux kernel officially began supporting kernel modules written in Rust. That's not minor. Linux runs on just about anything these days and is the base kernel for all Android devices, including smartphones, tablets, smart TVs, cars… Someone who runs Linux on a laptop or server might not think of Android as a Linux operating system, because it lacks most of the utilities and features of a traditional distribution, but the kernel is Linux.
Both Rust and Go have been engineered to provide near-C++ speeds, and the three are the subject of a lot of discussion around performance. While one may beat the others in a specific benchmark, when multiple benchmark tests are taken into account, it's a toss-up, with no language winning all of them. That does not mean you should dump C++ immediately and unconditionally, but it's important to understand why you need it and whether it will be superior enough for your specific purposes to justify taking on its risks.
HuggingFace's tokenizers AI library is written in Rust with bindings for both Python and JavaScript. Python is popular for AI because it's easy to learn. While developers are writing AI code in Python, thanks to libraries like tokenizers, Python is more like a supervisor assigning the hardest work to the hardest workers (the libraries), which allows for very high performance.
Should you switch to a memory safe language?
If you're using C++ and are considering adjusting your roadmap to adopt a memory safe language, you'll have to consider multiple factors:
- Which language is best suited to your existing and planned projects.
- What tradeoffs you'll have to make.
- Whether to port existing projects to the new language or just use it for new modules and new projects.
- The cost of getting your developers up to speed on the new language.
- Providing your developers with the right productivity and security tools, such as software composition analysis (SCA) tools that will help you validate the third-party dependencies you pull from package registries like PyPI (Python) or npm (Node.js).
The one thing you won't have to worry about is whether GitGuardian will still work for you. We can detect hard-coded secrets in almost every coding or markup language.
Memory safety is an important consideration, because the lack of it in languages like C++ is a big source of vulnerabilities. Continuing with memory-unsafe languages won't necessarily introduce new bugs, but it increases the likelihood that they'll occur (or that they're already there, undiscovered). Memory-safe languages won't guarantee you write error-free code, but with less time spent worrying about memory issues, you'll have more bandwidth to deal with other security concerns… like sprawling secrets.