Why the Facebook outage and Twitch breach matter to business leaders
Josh Stella
October 14, 2021
Editor's note
This blog originally appeared on fugue.co. Fugue joined Snyk in 2022 and is a key component of Snyk IaC.
This month, Facebook and Twitch both suffered serious damage at their own hands, and every executive needs to understand what happened and how incidents like these can be prevented.
At Facebook, a network configuration change took the service down for hours, and WhatsApp and Instagram along with it, resulting in tens of millions of dollars in lost revenue and leaving millions of users unable to access these services.
At Twitch, an interactive live streaming service owned by Amazon, a server misconfiguration gave a hacker access to a trove of sensitive data, including information on users and source code for yet-to-be-released applications, which the hacker posted on the internet.
In the past year alone, 36% of companies suffered a serious cloud security leak or breach due to cloud misconfiguration.
We’ve seen enterprise cloud customers fall victim to their own preventable configuration mistakes many times before. What’s notable here is that Facebook and Twitch are essentially customers of their own cloud platforms. Given how much complexity the cloud providers have pushed to their customers, it’s no surprise that these incidents keep happening: not because people are bad at cloud security, but because it’s really hard to get good at it. Let’s explore that.
Why Cloud Risk is Configuration Risk
The cloud attack surface is configuration, not the network. Configuration is essentially how you’ve designed and built your infrastructure. The word “configuration” can feel like a small detail, but in the cloud, configuration is a big deal. A mistake here can create vulnerabilities and break applications. A single misconfiguration can have a huge blast radius in terms of system downtime or a data breach — and the resulting loss of revenue and customer trust.
Take a car, for example. A car has an engine, a transmission, wheels, etc. All of these components have configurations, some of which are related to safety and regulated by law. People and machines inspected the car’s configurations before it rolled off the line, and the owner may have changed them since. A safety inspector flags configuration violations because bad configuration can cause a breakdown or an accident.
In terms of scale and complexity, an enterprise cloud environment is more like an aircraft carrier. It can contain hundreds of thousands of resources, each involving dozens of configurations. Cloud engineering teams are making dozens, or even hundreds, of configuration changes every day. To return to the car analogy, this is like swapping in a new transmission while driving down the highway at 70 mph, without slowing down.
Clouds Change Constantly, and Every Change Brings Risk
The cloud is the most secure computing platform humans have ever produced — if you build it correctly and ensure changes don’t introduce vulnerabilities. That’s the hard part.
The cloud’s constant state of change is central to what it offers the modern enterprise: speed and agility. Companies operating in the cloud generally realize a faster time to market than those operating in a data center. But all that change brings great risk. Humans are making configuration decisions every day and then changing them the next. How informed are those decisions when it comes to security?
Unfortunately, the answer is “not enough.” This is not meant to disparage software engineers. We ask a lot from them, and they produce great things for us. But humans are terrible at keeping thousands of data points — and thousands more rules — in our heads. No human can possess full knowledge of a cloud-based system and the security implications each change will bring. But full knowledge of your cloud environment — and denying your adversaries that knowledge — is essential to keeping it secure.
As cloud environments grow bigger and more complex, this problem will only get worse.
21st Century Armchair Hacking
The good news is that cloud security teams are becoming more aware of this challenge. The bad news is that we’re way behind the hackers, who have gotten very efficient at acquiring the knowledge they need to exploit cloud systems. They use automation to scan the internet looking for cloud misconfigurations they can use to access an environment. Once in, they leverage additional mistakes to discover resources, move laterally, and extract data without detection.
Twitch didn’t become aware of its breach until its data started showing up on the internet, and a single server misconfiguration enabled the hacker to breach data well beyond the domain of that one server. The same thing happened to Capital One in 2019, and they’re widely recognized as being among the best at cloud security.
What Business and Security Leaders Can Do Today
Every business and security leader operating in the cloud needs to be paying attention and asking questions. You can be far more secure in the cloud than in a data center and certainly more competitive. But just because you can be more secure in the cloud doesn’t mean you are today. It’s safe to assume you aren’t safe.
Here are five essential steps:
1. Know the State of Your Cloud Environment
You need to understand what your current cloud security posture is. Ask your cloud security team for a report on the state of your cloud environment as it is today — not six months ago or whenever your last audit occurred. This report should give you a complete picture of how infrastructure is configured and a list of vulnerabilities by severity. If they can’t provide this to you by noon, they don’t possess this knowledge. That should scare you, and acquiring this knowledge must be your team’s first priority. Everything in your cloud environment is knowable, so get your team on it.
2. Think Preventively About Cloud Security
Once you know where you stand, it’s time to start thinking differently about cloud security. Your security team might still be focused on things like intrusion detection and network monitoring to catch attackers, but that’s not how cloud security works. Once a hacker has gained access to your environment, it’s too late. Cloud breaches happen in minutes, and traditional security tools offer little or no help. Cloud security is about preventing misconfiguration vulnerabilities from happening in the first place.
3. Apply Policy as Code as a Force Multiplier
The only way to prevent cloud misconfiguration is to bake security automation into every aspect of cloud operations. Doing this requires Policy as Code, which enables your team to express security and compliance rules in a programming language that an application can use to check the correctness of configurations. Instead of humans inconsistently reviewing things and enforcing rules, Policy as Code empowers all cloud stakeholders to operate securely without any ambiguity or disagreement on what the rules are and how they should be applied.
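To make the idea of rules expressed in code concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the resource fields (`public_read`, `encrypted`, `port`, `source`), the rule IDs, and the severities are all hypothetical, and real teams often use a dedicated policy language such as Open Policy Agent's Rego rather than a general-purpose one.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical shape: each cloud resource is a plain dict of settings.
Resource = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str
    description: str
    applies_to: str                      # resource "type" this rule covers
    check: Callable[[Resource], bool]    # returns True when compliant


# Illustrative rules only; a real rule set would come from your own policies.
RULES = [
    Rule("STORAGE-001", "high",
         "Storage buckets must not be publicly readable",
         "storage_bucket",
         lambda r: not r.get("public_read", False)),
    Rule("STORAGE-002", "medium",
         "Storage buckets must encrypt data at rest",
         "storage_bucket",
         lambda r: r.get("encrypted", False)),
    Rule("NET-001", "high",
         "Firewall rules must not open SSH (port 22) to the whole internet",
         "firewall_rule",
         lambda r: not (r.get("port") == 22 and r.get("source") == "0.0.0.0/0")),
]


def evaluate(resources: List[Resource], rules: List[Rule] = RULES) -> List[Dict[str, str]]:
    """Check every resource against every applicable rule; return violations."""
    violations = []
    for res in resources:
        for rule in rules:
            if rule.applies_to == res.get("type") and not rule.check(res):
                violations.append({
                    "resource": res.get("name", "<unnamed>"),
                    "rule_id": rule.rule_id,
                    "severity": rule.severity,
                    "description": rule.description,
                })
    return violations


# Made-up configuration data for demonstration.
EXAMPLE_RESOURCES = [
    {"type": "storage_bucket", "name": "customer-exports", "public_read": True, "encrypted": True},
    {"type": "storage_bucket", "name": "build-artifacts", "public_read": False, "encrypted": False},
    {"type": "firewall_rule", "name": "admin-ssh", "port": 22, "source": "0.0.0.0/0"},
    {"type": "firewall_rule", "name": "office-ssh", "port": 22, "source": "203.0.113.0/24"},
]
```

Calling `evaluate(EXAMPLE_RESOURCES)` flags three of the four resources. The point of the design is that the rule list is the single, unambiguous statement of policy: the same code gives the same answer for every team, every time.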
4. Align Cloud Stakeholders
In this model, your security team becomes a tool vendor to your application developers and cloud engineers. Engineers developing cloud systems automatically check their work against policy and make corrections quickly and easily, before they build anything. Automated guardrails prevent the deployment of dangerous cloud vulnerabilities. And security teams continuously monitor your environment to catch any misconfigurations that slip through — before the bad guys can find them.
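An automated guardrail of this kind can be sketched as a small gate that a deployment pipeline runs before anything is built. This is a simplified illustration under stated assumptions: the violation records (dicts with `resource`, `severity`, and `description` keys) and the choice of which severities block a deployment are hypothetical, and a real pipeline would receive the violations from its policy checker.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Assumption: only these severities stop a deployment; lower ones are reported.
BLOCKING_SEVERITIES: FrozenSet[str] = frozenset({"critical", "high"})

Violation = Dict[str, str]


def gate(violations: Iterable[Violation],
         blocking: FrozenSet[str] = BLOCKING_SEVERITIES) -> Tuple[bool, List[Violation]]:
    """Return (allowed, blockers): allowed is False if any violation blocks."""
    blockers = [v for v in violations if v["severity"] in blocking]
    return (len(blockers) == 0, blockers)


def exit_code(violations: Iterable[Violation]) -> int:
    """Map the gate decision to a process exit code a CI system can act on."""
    allowed, blockers = gate(violations)
    for v in blockers:
        print(f"BLOCKED: {v['resource']}: {v['description']} ({v['severity']})")
    return 0 if allowed else 1
```

A pipeline step would pass `exit_code(...)` to `sys.exit`, so a non-zero result stops the deployment, while lower-severity findings are left for the security team's continuous monitoring to follow up.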
5. Shift Culture to Security-Everywhere
Transforming how your organization does cloud security will require some new hiring, training, and processes, as well as changes to your culture. These aren’t easy things to do, but until you do them, you will remain at great risk. If you focus on automation and Policy as Code, there’s no need to hire a virtual army of engineers, which is good news considering the bidding wars currently being waged over them. Automation and Policy as Code help your security team invest in more challenging problem areas and help your application teams deliver secure innovation to the market faster.