Chicago is famous for many reasons, including the Bears, a distinctive style of hot dog, and of course, for giving the world skyscrapers. PHP is also known for legendary architecture, being the underlying language for 77.5% of the web via frameworks like Laravel, Drupal, and WordPress. Community members from all over the world, representing all those frameworks and more, got together for php[tek] 2023.
This was the 15th annual edition of the convention, where PHP users shared knowledge and best practices for leveraging the language that came to define the internet over the last 28 years. There was a real sense of community at the event, summarized succinctly in the day one keynote, "Let Go of Ownership," from Tim Lytle. He encouraged us to think of our code and the community not as things we own but as things we are entrusted to take care of over time. He said we should think in terms of stewardship, a word that sums the subject up nicely.
Over the three days of the event, speakers told their stories about working with PHP and the opportunities it has afforded them. They also dove into some highly technical topics, even showing how PHP itself is compiled. Multiple speakers also covered security and customer data compliance. Here are just a few highlights from the event.
API security is critical
In a hotel, you present your credit card and another form of ID to the front desk to prove you are who you say you are. They check that you are authentic and expected. They then issue you a hotel key card to get into your room, the gym, and any other restricted areas. The benefit of the key card is that you do not need to keep re-proving who you are with your complete ID and credit card. The key cards also automatically expire and are easily replaceable.
In OAuth language, the front desk is the OAuth Authorization Server. The key card is your Access Token. Your room and all the other areas where you are allowed access with your key card are the system Resources.
This model achieves the main goals of OAuth:
- Delegation - Sharing access without sharing credentials.
- Scoping and Expiration - Granting limited access for a short amount of time.
- Separating policy decisions from enforcement mechanisms.
One crucial point that Keith noted is that OAuth itself does not specify how you do the authentication, just authorization. Authentication, often abbreviated as AuthN, verifies you are who you say you are. This is commonly achieved through opening a web browser and having you log in through another trusted service like GitHub or Google, relying on OpenID Connect. Authorization, abbreviated as AuthZ, is concerned with whether you are allowed to perform an action or access a resource.
You end up with a three-step security process where you prove who you are, AuthN, then get approval to reach certain resources, AuthZ, before finally accessing those resources by using the token the process provides.
Attackers commonly target each of these steps and the connections required throughout the process. It is vital to think through security at each of these vectors. This starts by always using HTTPS to prevent man-in-the-middle attacks. It is also important to scope any tokens appropriately, only allowing authorization for the resources required to complete the work. Tokens also need to be short-lived; the shorter the time to live, the better.
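To make scoping and expiration concrete, here is a minimal sketch of the hotel analogy in Python. It is not an OAuth implementation and does not come from the talk; the `AuthorizationServer` class, the scope names, and the five-minute default lifetime are all assumptions chosen for illustration.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessToken:
    value: str          # opaque random string handed to the client (the key card)
    scopes: frozenset   # resources this token is allowed to reach
    expires_at: float   # Unix time after which the token is rejected


class AuthorizationServer:
    """Hypothetical 'front desk' that issues and checks tokens in memory."""

    def __init__(self, ttl_seconds=300):
        self.ttl_seconds = ttl_seconds
        self._issued = {}

    def issue(self, scopes, now=None):
        # Called only after the user has authenticated (AuthN) elsewhere.
        now = time.time() if now is None else now
        token = AccessToken(
            value=secrets.token_urlsafe(32),
            scopes=frozenset(scopes),
            expires_at=now + self.ttl_seconds,
        )
        self._issued[token.value] = token
        return token

    def is_allowed(self, token_value, required_scope, now=None):
        now = time.time() if now is None else now
        token = self._issued.get(token_value)
        if token is None:
            return False  # unknown or forged token
        if now >= token.expires_at:
            return False  # short-lived tokens stop working on their own
        return required_scope in token.scopes  # only what was granted
```

A token issued with `server.issue({"room:412", "gym"})` would open the gym but not the pool, and nothing at all once `ttl_seconds` has passed.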
Keep your Webhooks safe as well
Keith also echoed a lot of these same lessons about security in his other talk at the event, "Webhooks: Lessons (Un)learned." Keith was responsible for the initial research that became the website webhooks.fyi. While investigating webhooks, he realized that every company does them slightly differently, but there are some underlying security concerns that we all need to be aware of.
It is vital to secure the payload itself. There are a number of ways to accomplish this, from shared secrets or OAuth to much more secure methods like keyed-hash message authentication codes, HMAC, or Mutual Transport Layer Security, mTLS. It is also important to protect against 'replay attacks' by using timestamps. We are proud to say that GitGuardian Custom Webhooks make use of HMAC and timestamps to keep our customers safe.
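As a rough illustration of how HMAC and timestamps work together, the sketch below signs a payload and verifies it on receipt using Python's standard library. The message layout (`<timestamp>.<body>`), the five-minute tolerance, and the function names are assumptions for this example; every provider defines its own scheme, so check your provider's documentation rather than treating this as anyone's actual format.

```python
import hashlib
import hmac
import time


def sign_payload(secret: bytes, timestamp: int, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over '<timestamp>.<body>' (assumed layout)."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_webhook(secret, timestamp, body, signature, now=None,
                   tolerance_seconds=300):
    """Accept the webhook only if it is recent and the signature matches."""
    now = time.time() if now is None else now
    # Reject old (or far-future) messages to limit replay attacks.
    if abs(now - timestamp) > tolerance_seconds:
        return False
    expected = sign_payload(secret, timestamp, body)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

Because the timestamp is part of the signed message, an attacker cannot replay an old payload with a fresh timestamp without also knowing the shared secret.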
Back on the topic of APIs, Tim Bond talked about external threats in his session "Attackers want your data, and they're getting it from your API." He said APIs are everywhere, including, in the broadest sense, the front of your website.
The first step to securing your API is limiting the responses to only the data absolutely needed to make the app work. HTTPS should always be enforced, echoing what Keith said earlier in the event. He also encouraged using "certificate pinning," where you only accept specific, pre-approved certificates. If possible, he suggests enforcing dynamic integrity checking, as you can do through the Google Play store.
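One simple way to limit responses is an explicit allow-list of fields, so new columns in the database never leak by default. The sketch below assumes a hypothetical user record and field names; it is an illustration of the idea, not code shown in the session.

```python
# Hypothetical allow-list: only these fields ever leave the API.
PUBLIC_USER_FIELDS = ("id", "display_name")


def to_public_user(record: dict) -> dict:
    """Copy only allow-listed fields from a stored record into the response."""
    return {field: record[field] for field in PUBLIC_USER_FIELDS if field in record}
```

Building the response from an allow-list, rather than deleting known-sensitive keys, means a field added later stays private until someone deliberately exposes it.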
One way you can discourage attackers is by rate limiting. Hackers will often try to enumerate endpoints, especially around user IDs. Someone looking up `user/123`, `user/124`, then `user/125` in rapid calls is likely someone up to no good. Shutting them down should not interfere with legitimate business. Further, he suggested using Universally Unique Identifiers, UUIDs, so instead of sequential user numbers, each user is assigned a long random identifier that is unrelated to other user IDs. For example, instead of `user/123`, making them `user/SINFKLDFDF51F` will make it harder for an attacker to guess what other user IDs could be.
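Both ideas can be sketched in a few lines of Python. The sliding-window limiter below is one of several possible rate-limiting strategies and was not prescribed in the talk; the class name, the limits, and the in-memory storage are assumptions, and a production system would typically keep these counters in a shared store.

```python
import time
import uuid
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client_id -> timestamps of recent calls

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Forget calls that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent calls: likely enumeration
        hits.append(now)
        return True


def new_user_id() -> str:
    """Random version 4 UUID, so IDs reveal nothing about neighbouring users."""
    return str(uuid.uuid4())
```

Random identifiers make guessing harder, but they are not an access control on their own; each request still needs an authorization check.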
Data privacy is the law
Data privacy laws are always evolving, and it can be tricky to keep up to date with the latest news. That is why we were all glad for the session "Data Privacy in Software Development" by Jana Sloane, an attorney at Microsoft. She was quick to state that this session was not giving legal advice but was intended to point us in the right direction to know how to talk to internal legal teams. Having those conversations early in the development lifecycle can help keep everyone compliant and safe.
Jana gave us a brief overview of today's data privacy landscape. In the US, every state has implemented its own framework. In the EU, it is a little clearer, thanks to legislation like GDPR, but she said there is a lot of case law being worked out right now, so talking to legal teams earlier in the process can help you stay ahead of what is on the horizon. In addition to government regulation, software developers need to be aware of any contractual obligations their company must comply with. For example, ensuring your new feature or product will still fall within SOC 2 compliance is important so there are no surprises when you try to launch.
When thinking about access management, who can get our data, we need to ensure data is:
1. Necessary and proper - We are only collecting what is truly needed for the application to work.
2. Accessed by proper personnel - There is a clear log and authorization policy in place for anyone or any service that can obtain the data.
3. Used correctly - If you say exactly what you will use the data for in the terms of service, you must limit the use to only those purposes.
4. Retained accurately - Properly storing data means encrypting the data properly and thinking through geolocation issues, only storing it in places allowed by data sovereignty law.
Lastly, you should have a clear policy for how long you are allowed to keep user data. It should not be forever. Your policy should also allow users to request that their data be deleted at any time. Any time you want to use the data for a new or different reason, you need to inform the customer and have them opt in to the new use, letting them opt out of the system if they choose.
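In code, these rules often end up as small, explicit checks. The sketch below is purely illustrative: the record layout, the field names, and the one-year retention period are placeholders, and the real values are exactly what the conversation with your legal team should settle.

```python
from datetime import datetime, timedelta, timezone

# Placeholder retention period; the real number is a legal decision.
RETENTION = timedelta(days=365)


def may_use(record: dict, purpose: str) -> bool:
    """Data may only be used for purposes the user has opted in to."""
    return purpose in record["consented_purposes"]


def purge(records: list, now: datetime, retention: timedelta = RETENTION) -> list:
    """Keep only records still inside the retention window and not flagged for deletion."""
    return [
        r for r in records
        if not r["deletion_requested"] and now - r["collected_at"] <= retention
    ]
```

Running something like `purge` on a schedule turns "it should not be forever" from a policy statement into behavior that can be tested.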
Scanning for better code
Scott Keck-Warren began his session "Reducing Bugs With Static Code Analysis" by telling the story of breaking live production websites when he tried to fix bugs on the live server. He quickly learned that there needed to be a way to test his fix before it got to the production machine.
His team moved to manual code analysis, which was a step up from breaking production but was slow and error-prone. Human beings were still too involved in the process. His team moved next to dynamic testing, which is much more reliable overall but takes a while to run. What they finally found that was both fast and reliable was a form of source code analysis, or SCA, called static code analysis. This allows the code to be analyzed without needing to build or run it and can save a lot of time and resources.
He found PHP-specific tools like PHPStan and PHP_CodeSniffer were a good fit for their needs, given the codebase was mostly PHP. He also is a fan and user of Rector, a tool that "instantly upgrades and refactors the PHP code of your application."
What made these tools truly successful for his org was consistent use, through automation. His favorite way of automating testing is through git hooks. We love git hooks at GitGuardian, as that is how you can leverage ggshield to prevent yourself from committing secrets.
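As one possible shape for that automation, here is a sketch of a pre-commit hook written in Python that runs PHPStan on staged PHP files. It is not Scott's setup: the `vendor/bin/phpstan` path assumes PHPStan was installed with Composer in the project, and the helper names are made up for this example.

```python
import subprocess


def staged_php_files(git_output: str) -> list:
    """Pick the PHP files out of `git diff --cached --name-only` output."""
    return [line for line in git_output.splitlines() if line.endswith(".php")]


def main() -> int:
    # List files that are added, copied, or modified in the index.
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    files = staged_php_files(result.stdout)
    if not files:
        return 0  # nothing to analyze, let the commit proceed
    # Assumed install location; a non-zero exit code blocks the commit.
    return subprocess.run(["vendor/bin/phpstan", "analyse", *files]).returncode

# When saved as .git/hooks/pre-commit, end the script with:
#     raise SystemExit(main())
```

Analyzing only the staged files keeps the hook fast, which matters because a slow hook is the first one developers learn to skip.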
We are also big believers in source code analysis, especially for security. This is why we have officially partnered with Snyk to help our users, and the world, strengthen developer security through SCA. While the tools Scott cited are excellent for debugging PHP code for functionality, Snyk can help any developer deliver more secure code no matter what language your company relies on.
Good software requires a SOLID foundation
When you think of approaches to building software, you might think of Agile, Waterfall, or even DevOps. However, there is a concept underneath all those approaches which deals with how to think about the code itself. Cori Lint covered this in her talk, "Building a SOLID Foundation."
The SOLID framework was introduced to the world in a 2000 paper from Robert C. Martin defining best practices for Object-oriented Programming, OOP. OOP is the predominant approach of modern software languages and frameworks.
SOLID stands for:
- Single Responsibility Principle - each component should have only one responsibility.
- Open-Closed Principle - OOP entities should be open for extension but closed for modification.
- Liskov Substitution Principle - objects of a superclass should be replaceable with objects of its subclasses without breaking the application.
- Interface Segregation Principle - a class shouldn’t have to implement members that it does not need.
- Dependency Inversion Principle - High-level modules should not depend on low-level modules. Both should depend on abstractions.
Cori gave multiple examples of these principles, including a `PlayInstrument` class. One can imagine a single class for playing instruments that implements methods for every kind of instrument, such as `toot()` and `pressKey()`.
Let's imagine we try to use `PlayInstrument` to play a violin. Violins can't `toot()` or `pressKey()`. Thus this class violates the Interface Segregation Principle, and we should find a better approach. You could do this by replacing the generic `PlayInstrument` class with new classes, one for wind instruments and one for string instruments, and perhaps others for percussion. These new classes would be simpler and reusable, making the program ultimately more resilient and easier to implement in code.
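The split might look something like the following Python sketch. It is a reconstruction of the idea rather than Cori's code: only `toot()` is taken from the example above, while `bow()`, the class names, and the return strings are invented for illustration.

```python
from abc import ABC, abstractmethod


class WindInstrument(ABC):
    """Small interface: only what wind instruments can actually do."""

    @abstractmethod
    def toot(self) -> str: ...


class StringInstrument(ABC):
    """Separate interface for strings; bow() is a hypothetical method."""

    @abstractmethod
    def bow(self) -> str: ...


class Flute(WindInstrument):
    def toot(self) -> str:
        return "flute toots"


class Violin(StringInstrument):
    def bow(self) -> str:
        return "violin sings"


def play_strings(instrument: StringInstrument) -> str:
    # Callers depend only on the interface they need.
    return instrument.bow()
```

With the interfaces segregated, `Violin` no longer has to carry a `toot()` method it could only implement by raising an error.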
A community supporting multiple communities
PHP is at the heart of the internet, taking the form of many frameworks and serving as the language behind many services. Just as the code is widespread and used in diverse ways, the community itself varies from security experts focused on APIs, to traditional website builders, to microservice architects. It is truly a global community, as we had folks from all over the world attend php[tek].
No matter where you are on the planet or what particular focus you have in your day-to-day work, security surely lies at the heart of it. We are proud to support developers, DevOps, and security teams as they work to make their code more secure by keeping their secrets secret. If you are not sure where your secrets are right now in your PHP, or any other code, sign up to get started for free for secrets detection and start automating the prevention process with ggshield.