November 2, 2017
If there is one thing security as a whole could improve, it’s its reputation for slowing organizations down.
One of the biggest culprits behind this reputation is ‘triaging’: the process of validating whether a security alert actually affects your organization, sizing up the estimated impact, and figuring out how to resolve it.
In this article, we will examine why triaging can quickly become a bottleneck for organizations and make the case that we should all be striving to skip triaging and focus on fixing vulnerabilities.
An example triaging workflow
Let’s try to envision how a triaging workflow for an incoming security vulnerability alert might look.
Take, for example, the recent RCE (Remote Code Execution) vulnerability in Apache Struts that was ultimately exploited in the Equifax breach. One can imagine how a company of that size might address such a vulnerability:
A member of the security team gets an alert that the vulnerability exists in the first place. This assumes she is subscribed to the relevant mailing lists or uses tools that would alert her to this vulnerability (and that she actually reads those alerts)!
She would then ask one of the engineers to produce a report of all the applications in the company’s portfolio that use Apache Struts. This is a non-trivial undertaking in itself, as inventory and composition management is very challenging, especially for companies that have been in business for a long time. It is even harder for companies that have grown through mergers and acquisitions and absorbed multiple teams using multiple tech stacks over the years. Nevertheless, let’s assume the organization was able to produce a report of 100 projects that use Apache Struts.
She would then file 100 Jira tickets with severity ‘critical’ and activate the company’s all-hands-on-deck policy to get the issue addressed immediately.
100 engineers from across the org would need to be tracked down and assigned to fix the vulnerability.
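To make the inventory step concrete, here is a minimal sketch of the kind of report the engineer would be asked to produce. It assumes the hard part is already done, namely that every project's dependencies have been collected into one structure. The project names, the `INVENTORY` layout, the `projects_needing_fix` helper, and the version numbers are all hypothetical placeholders for illustration, not the affected range from the real advisory.

```python
# Hypothetical inventory: project name -> {package name: version tuple}.
# In practice, building this mapping is the difficult part.
INVENTORY = {
    "billing-portal": {"struts2-core": (2, 3, 31)},
    "hr-intranet": {"struts2-core": (2, 5, 13), "commons-io": (2, 5)},
    "search-api": {"jackson-databind": (2, 9, 1)},
}


def projects_needing_fix(inventory, package, fixed_in):
    """Return the projects that depend on `package` at a version below `fixed_in`.

    Assumes a single "fixed in" threshold; a real advisory may list
    several affected version ranges.
    """
    return sorted(
        project
        for project, deps in inventory.items()
        if package in deps and deps[package] < fixed_in
    )


# Placeholder threshold, chosen only so the example has one match.
affected = projects_needing_fix(INVENTORY, "struts2-core", (2, 3, 32))
print(affected)  # ['billing-portal']
```

The query itself is a few lines. What makes the real task expensive is keeping the inventory complete and current across every team and tech stack.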
Now let’s try to imagine how things look from the developer’s perspective. Most likely the developer is working on some high-value feature and wants to crank it out quickly to hit a top business milestone or deadline. Nevertheless, this is an important security alert, so the developer will have to address it. Here are the steps they would need to take:
Read about the particular vulnerability.
Read up on remote code execution and become knowledgeable about that class of vulnerability.
Try to figure out if the vulnerability impacts their particular application. Perhaps try to locate the actual exploit and confirm whether it is applicable or not.
Research what the possible fixes are for this vulnerability.
Apply the fixes.
Confirm they removed the vulnerability.
That’s a lot of work, and some of it requires real security expertise. The entire triaging workflow could take days, weeks, or even months, depending on how heavyweight your company’s internal workflows are. And that’s not good.
Can we build software to help with triaging?
Probably! With some code instrumentation and machine learning, we could build a system that tells you whether any code paths use the vulnerable method in the libraries carrying the vulnerabilities. If done accurately (i.e., with few false positives and false negatives), that would remove some of the triaging steps for developers. However, even assuming we prove the vulnerability is exploitable in our context, the developer would still be tasked with figuring out the remediation!
The more dangerous case is often when a tool suggests that, in the current context, no data flow exists that would allow exploitation. In these cases, many organizations suppress or ignore the alert, and this is where the danger lies.
The fact that the code has no vulnerable data flows right now doesn’t mean it won’t tomorrow. A method that isn’t called today might be called after the next developer commit. The next developer in line will likely have no way of knowing that a vulnerable method is hiding in the library they are using, or that the alert for it was suppressed because that method wasn’t called until now. Building your security posture on the invariant that the method isn’t called today is very close to reckless.
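A small sketch shows how fragile that invariant is. It models an application as a call graph and checks, with a plain breadth-first search, whether a vulnerable library method is reachable from the application's entry points. The function names, the graph contents, and the `reachable` helper are hypothetical. A real analysis tool would also have to build the graph and deal with reflection, dynamic dispatch, and similar complications.

```python
from collections import deque


def reachable(call_graph, entry_points, target):
    """Breadth-first search: can any entry point reach `target`?"""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        if fn == target:
            return True
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


VULNERABLE = "lib.parse_upload"  # hypothetical vulnerable library method

# Today: nothing in the application calls the vulnerable method.
today = {
    "app.handle_request": ["app.render", "lib.safe_helper"],
    "app.render": [],
}

# Tomorrow: one commit adds a single call from app.render.
tomorrow = {**today, "app.render": ["lib.parse_upload"]}

print(reachable(today, ["app.handle_request"], VULNERABLE))     # False
print(reachable(tomorrow, ["app.handle_request"], VULNERABLE))  # True
```

The library version is identical in both graphs. Only one edge changed, and a suppressed alert would never be raised again for it.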
Looking at the big picture, triaging is just a precursor to fixing vulnerabilities. It exists because fixing vulnerabilities is perceived as a hard, large undertaking. But what if, instead of using software to help with triage, we bypassed all that and used software to automate the fix?
Snyk alert pull requests FTW!
When you add your projects to Snyk, we keep an accurate and continuous inventory of all your dependencies. That’s why, when an important vulnerability like the Apache Struts RCE is disclosed, we can alert you in real time about its existence. More importantly, we can tell you exactly which of your applications are carrying the vulnerability. When you connect Snyk to your source code manager (like GitHub, GitLab, or Bitbucket), we can do even better than that and send a pull request with the fix itself directly to your affected repos.
All you need to do to remove the vulnerability from your code is approve the pull request. Remediating the vulnerability is the right action to take, regardless of whether your data flows execute that code today.
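To give a feel for what such a fix amounts to, here is a minimal sketch of the change a fix pull request might carry: one dependency pin bumped to a patched version, shown as a unified diff. This is an illustration only, not Snyk's implementation. The `name==version` manifest format, the `deps.txt` file name, the `propose_fix` helper, and the version numbers are assumptions made for the example.

```python
import difflib


def propose_fix(manifest, package, fixed_version):
    """Return the manifest text with `package` pinned to `fixed_version`.

    Assumes a hypothetical one-pin-per-line `name==version` format.
    """
    patched = []
    for line in manifest.splitlines():
        name, sep, _old_version = line.partition("==")
        if sep and name.strip() == package:
            line = f"{package}=={fixed_version}"
        patched.append(line)
    return "\n".join(patched) + "\n"


# Placeholder manifest and versions, for illustration only.
manifest = "struts2-core==2.3.31\ncommons-io==2.5\n"
patched = propose_fix(manifest, "struts2-core", "2.3.32")

# The diff is what a reviewer would see in the pull request.
diff = "".join(
    difflib.unified_diff(
        manifest.splitlines(keepends=True),
        patched.splitlines(keepends=True),
        fromfile="a/deps.txt",
        tofile="b/deps.txt",
    )
)
print(diff)
```

The reviewer's job shrinks to reading a one-line diff and letting the project's tests run against it, rather than researching the vulnerability from scratch.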
Not only can the entire triaging flow be replaced with a single click on the pull request’s “accept” button, but the amount of security expertise the organization needs in order to execute fixes also goes down. Security made simple and fast: that’s pretty cool.