Pasadena, California, is famous for the Rose Bowl Parade and its many incredible views of the San Gabriel mountains. Science fans probably know it best as the headquarters for NASA's Jet Propulsion Laboratory. It is also home to the largest community-run open-source and free software conference in North America, The Southern California Linux Expo. This year, 2024, marked the 21st installment, or as they refer to it SCaLE 21x

SCaLE draws thousands of participants and more than 200 speakers who gave over 270 sessions. This four-day-long event offered multiple workshops, CTFs, and even a career day. It is so large that it fills two entire buildings of the Pasadena Convention Center. SCaLE actually shared part of the convention center with the television show "America's Got Talent," filming there for a couple of days. It made for an interesting mix of folks in the area, but all were there in good spirits.

It would be impossible to write about every session and workshop. For example, summarizing John Willis's AI workshop would take an entire book, which, fortunately, he is writing. It would also be hard to summarize Akuity's workshop, which helped over 100 people get some hands-on experience with ArgoCD and GitOps.

Here are just a few highlights from SCaLE 21x. 

A conference of conferences

What really made SCaLE 21x so massive is that multiple other community events leverage the conference space by co-locating their events. Each of these smaller community events had its own keynote speaker and schedule. Some of these sessions were only one day, but some spanned several days of the event. These co-located events included:

These diverse organizations sharing the same space meant there was a lot of community crossover during hallway conversations and networking events. For example, this was my first exposure to the NixOS community, which is an amazing community working to solve dependency management in an interesting, declarative way.

UpSCaLE added some real fun to the program by featuring Forrest Brazeal, the musical DevOps performer who delivered live renditions of his internet hits "The Re-Org Rag" and "168 AWS Services in 2 minutes."

Forrest Brazeal performing at UpSCALE

Understanding the risks that come with change

Legendary analyst Michael Coté delivered two talks, a very short "State of DevOps" and "We Fear Change" at DevOpsDays LA. In the first talk, he quickly shared that he sees the state of DevOps as "fantastic" overall as it is so widespread. The next big challenge is defining our evolutionary goals and staying positive about DevOps itself.

For the longer portion of his keynote presentation, he addressed why we are confronted with change in our organizations so often and why so many of us resist it. One of the major drivers of change is the turnover of upper management. On average, a company gets a new CIO every 3.7 years—in some cases, much more often. Each new leader is brought in specifically because they have a vision of how to do things better, more efficiently, and with a better ROI, which likely means some major changes or a re-org.

From the view of upper management, it is a high-risk proposition to "do nothing" and just let the company run as it has since companies need to grow and innovate continually. However, the rest of the staff often have very different incentives and motivations than upper management. From a staff worker's position, doing things the same way brings stability and predictability to the job. To them, the higher-risk move is to re-org or try a radically new approach. Once viewed from this perspective it is easy to see why your team might not be as excited as the new CISO about a security process overhaul.

Michael advocated for change to be worker-led and for management to better align changes with business outcomes. He said if a high-level manager did the day-to-day work of most roles, they would approve changes much faster, as they would understand what is really needed.

Those same managers need to communicate their real business goals when they request change. For example, pushing for more frequent deployments without giving a reason is not going to be well-received by people needing to push code. However, if framed in context, such as more frequent deployments are going to increase in-app orders by 65%, it should better get everyone on board with the real goals. If we all get on the same page about the need for change and the desired high-level outcomes, we are less likely to fear change. 

Day 2 of #SCaLE21x kicks off with #DevOpsDays LA keynote from Michael Coté Presenting both 1. The State of DevOps 2024 And 2. We Fear Change

Finding a balance in communication styles

In one of the ignite talks at DevOpsDays LA, Paul Tevis, executive coach at LetsImprove, presented some of the core concepts of Westrum's Generative Culture in his session "Three Heuristics for Fostering High-Trust, Generative Culture." Sociologist Ron Westrum defined a 'generative culture' as a performance-oriented organization focused on a mission. It is evidenced by high cooperation, shared risks, cross-functional collaboration being encouraged, and failures leading to inquiry for root causes instead of individual blame. 

Paul said that no matter the organization, information mostly still flows in a one-to-one fashion, with conversations needed to get meaningful work done. He introduced us to some simple heuristics, or rules of thumb, to help us all grow a generative culture while having these discussions. 

  1. Clear communications. Conversations can be too vague, where one party is not forthcoming with any details, or too rigid, where one party wants to dictate exactly how everything needs to be done. If we focus on being clear with our requests while leaving ourselves open to feedback on how it could be done, we are going to be able to do more together. This means using phrases like "What we need to know is..." or "What I need from you is…" more often.
  2. Being curious. If the other person in a conversation never asks for your perspective or feelings, then they might seem not to be interested in what you have to say. On the other hand, sometimes people can ask so many questions it feels like an interrogation. What we need is a curiosity about the other party without overdoing it. Using phrases like "What did you notice about..?" or "What do you recommend?" is a sign you are on a good path.
  3. Feeling connected. If someone is only concerned with the outcome and saving their own time, then they might say things like, "Bring me solutions, not problems." On the other end of this spectrum, they might be too invested or 'glued' to your feelings, overly concerned that anything they say might be upsetting. We do need to acknowledge each other's feelings but also about shared outcomes. Asking, "How can I help?" or saying things like, "Thank you for bringing this up; it seems like it is important to you." show the right amount of empathy for a healthy connection.
Three Heuristics for Fostering High-Trust, Generative Culture by Paul Tevis

Learning from chaos scenarios

You can learn a lot from running through your incident response playbook. Mandi Walls, DevOps Advocate at PagerDuty, in her session "Plan for Unplanned Work: Game Days with Chaos Engineering," shared some of the best practices her team has developed over the years, including why many teams tend to avoid them: nobody wants to break more things and have to respond to more incidents. 

She said these incident response exercises needed to be created with a specific purpose in mind. You can experiment with meaning, but ultimately, you want to prepare for when things break while also learning how to build more resilient systems. It is important to set goals, such as 'have the new person restore from an outage with the latest backup' or 'test the system reboot process to ensure it is still valid." 

While you can't predict the timing of a real-world event, you should avoid surprising your teams or customers with an incident exercise. Make sure you establish a clear communication plan for any possible disruptions. She also reminded us that tests can be very short if properly organized and communicated. She showed one example where the "game-day" activity started at 1:07 p.m. and was completely finished by 1:34 p.m.

Just as important as running these game-day scenarios is communicating the outcomes beyond the individuals who were involved. Each exercise needs a post-mortem phase where the learnings are collected and then properly shared among the appropriate teams. She suggested regular "lunch and learns" to share the findings and gather feedback. 

Plan for Unplanned Work: Game Days with Chaos Engineering - Mandi Walls from PagerDuty

Safely evaluating open source components

"According to recent research, 96% of all modern codebases contain some open-source components." This fact, shared by Katherine Druckman, Open Source Evangelist at Intel, did not really surprise us, especially at a Linux community event. However, many of us were taken aback at the rate new open source components are being released, with over 4 million releases of just npm packages in 2023 alone. Those staggering numbers mean it is harder than ever to keep up with security in our codebases. Throughout her talk "Secure Consumption of Open Source Software: Evaluating, Utilizing, and Contributing Safely," she shared some great tips for picking secure packages and some amazing open source tools offered by the Open Secure Software Foundation, the OpenSSF.

She asked us: "Which is more secure, a project with no CVEs or a project with dozens of known vulnerabilities?" The one with no known CVEs is very suspect, as it likely means it is suffering from a false negative problem, and no one is checking for vulnerabilities. If vulns are known, then they can also be patched. 

When evaluating software, the first things Katherine suggested we look for are if the project still has an active maintainer and when the last commit was made. She also looks at the issue queue to see if improvements are still being made. She said a very healthy sign for a project is a contributor guide. If the project has one, then it is much more likely they will have active community engagement.

While she did discuss multiple tools and guides, the one that garnered the most reactions and questions during her talk was the OpenSSF Scorecard. This open-source project can help you evaluate package security quickly across 19 different factors, such as "Does the project practice code review before code is merged?" and "Is the project at least 90 days old, and maintained?" The OpenSSF Scorecard is used by projects like TensorFlow, Angular, and Flutter to ensure their official releases are as secure as possible.

"Secure Consumption of Open Source Software: Evaluating, Utilizing, and Contributing Safely" from Katherine Druckman, Open Source Evangelist at Intel

The future of machine identities is not more API keys

In his session "Solving ‘secret zero,’ why you should care about SPIFFE!" Mattias Gees, Director of Tech at Venafi, laid out a problem we are all too familiar with at GitGuardian: If you rely on API keys to manage machine-to-machine authentication, you are one leak from an attacker gaining access to your systems. 

Machine identities outnumber human identities by around 45 to 1 in most organizations. While Oauth and OIDC have helped us a great deal on the human side of authentication, we, as an industry, are still struggling to secure machine and workload identities in a scalable way. 

He reminded us that this is a very old problem, even going back to Aristotle, who said: "An entity without an identity cannot exist because it would be nothing. To exist is to exist as something, and that means to exist with a particular identity." For workloads in Kubernetes, we need to know what each service is and how we can confirm this is true. That is the motivation behind SPIFFE, the Secure Production Identity Framework for Everyone.

Created as a standard at Google in 2015, SPIFFE is a way to achieve zero-trust security for machine entities across applications and multi-cloud deployments. SPIFFE is a set of principles, and those principles are implemented as SPIRE, the SPIFFE Runtime Environment, an open source API-based toolchain for establishing trust between software systems across a wide variety of hosting platforms.

Mattias walked us through the 5 components of SPIFFEE:

  1. A SPIFFE ID - A common protocol and path that is unique to each service. For example, it might look like spiffe://venafi.com/dc1/node10/frontend/webserver
  2. 2. SPIFFE SVID - Like a passport, this is a signed cryptographical key, such as a x.509 certificate or a JWT, which has a short lifespan and is passed through a mTLS secured connection.
  3. SPIFFE Workload API - This is the server-side 'border control' that requests an SVID. This is implemented through a SPIFFEE agent running on each server.
  4. SPIFFE workload attestation -  An asynchronous way to validate any certificate delivered by an SPIFFE API request. It requires a central service running that can be accessed by all the server instances.
  5. 5. SPIFFE federation - An advanced way for multiple SPIFFE servers to communicate across domains and networks. 

Ideally, any new project could just adopt SPIRE, and you would never need to create API keys. In reality, the adoption process is going to be incremental, and you will need to live with some workload identities in SPIRE and some with other traditional approaches. The good news is that many secrets managers can work with SPIFFE, such as Vault by Hashicorp and CyberArk's Conjur, meaning the next time you need to rotate a credential, you could start implementing SPIRE to eliminate secrets and secrets sprawl. 

"Solving ‘secret zero’, why you should care about SPIFFE!" from Mattias Gees, Director of Tech at Venafi at #Scale21x

Much more than just Linux at SCaLE 21x

While The Southern California Linux Expo literally contains the word "Linux," SCaLE is really about all things open source and the folks who make those projects a reality. I spent most of my time in the security track, where I was able to premiere my newest talk, "Security In An IaC Defined World," and at DevOpsDays LA, but there was so much more to do and see. It was also an event with many familiar faces I have known from the Drupal, WordPress, and PHP communities, as well as multiple advocates from partners like Datadog and Doppler

If this event isn't already on your calendar, be on the lookout for the announcement for SCaLE 22x soon, and hopefully, we will see you there.