Guide to Software Composition Analysis (SCA)

2020 was a watershed year for open source. Digital transformation, already gaining momentum before COVID-19 hit, suddenly accelerated. More and more companies became software companies, and with this shift, usage of open source peaked. Why?

Simply put, open source enables development teams to deliver value more rapidly and more frequently, thus enabling their companies to better compete in their respective markets. 

This growing reliance exposes companies to both security and legal risk stemming from the open source dependencies used to build applications. These dependencies may contain known vulnerabilities which, if exploited by malicious parties, can result in significant economic loss. The open source licenses attached to these dependencies dictate usage terms which, if violated, can also result in hefty fines and reputational damage.

In comes Software Composition Analysis (SCA), an application security testing method that helps development and security teams to successfully manage and mitigate this risk. 

What is Software Composition Analysis (SCA)?

Software Composition Analysis (SCA) is an application security methodology for managing open source components. Using SCA, development teams can quickly track and analyze any open-source component brought into a project. SCA tools can discover all related components, their supporting libraries, and their direct and indirect dependencies. SCA tools can also detect software licenses, deprecated dependencies, as well as vulnerabilities and potential exploits. The scanning process generates a bill of materials (BOM), providing a complete inventory of a project’s software assets.

SCA in itself is not new, but the growing adoption of open source over the past few years has made it a key pillar of application security programs. As a result, SCA tools have proliferated. But not all SCA solutions are created equal. Modern software development practices, including the notion of DevSecOps, require that an SCA tool be developer-first: it should provide development teams with developer-friendly tooling, on the one hand, and give security teams the ability to guide developers so they can embrace security throughout the SDLC, on the other.

Figure: SCA tools should enable developer teams to embrace security throughout the software development lifecycle (SDLC)

Why use a Software Composition Analysis tool?

Open source components are becoming major building blocks in software across practically every vertical. SCA helps keep track of the open source components used by your applications, which is critical from both a productivity and a security standpoint.

The cost of a breach

Gartner estimates that more than 70% of applications contain flaws stemming from the use of open source. As the case of Equifax shows, exploitation of these flaws can have disastrous consequences for an organization.

The vulnerability that led to this famous breach, a flaw in a very popular open source Java library by the name of Apache Struts, had been known since February 14, 2017. Apache released a fix three weeks later, and only one day passed before an exploit was made available. An additional two weeks passed between then and when attacks began to peak. If Equifax had found and fixed the issue within that window between the release and the attacks, they would have been protected. With the benefit of hindsight, we now know that they failed to do so. The cost for Equifax was high: a huge lawsuit and a subsequent unprecedented settlement, as well as a substantial hit to the brand’s reputation and credibility, which cannot be overstated.

The Equifax breach was a watershed moment for the security industry, and for application security in particular, as it highlighted the importance of having controls in place to manage the risk introduced by the open source packages that developers pull in. The breach also demonstrated the need for speed: time windows are short, and organizations need to be able to find and fix vulnerabilities in the open source packages they are using quickly and repeatedly.

This is exactly where SCA comes into the picture.  

Why is Software Composition Analysis (SCA) important?

More and more, modern applications are composed of open source code. It has been estimated that open source code makes up as much as 90 percent of the code in applications. Of course, applications are not only composed of open source. In fact, one of the challenges facing organizations trying to secure their code base is that applications are assembled from different building blocks, all of which need to be secured to effectively manage and mitigate risk.

Open source in 2021

Software is eating the world and open source is eating software. It is hard to overestimate the role open source is playing in driving digital transformation. Together with cloud and DevOps, open source is one of the key factors helping companies to digitize their services and to leverage their technology to better compete in today’s highly competitive market.

How does open source help? Well, building applications from scratch consumes time and resources. Using open source packages that provide the exact same functionality helps reduce these costs. Open source by its very nature is highly flexible and can be easily customized if necessary. Backed by the community, open source is often safer as it is vetted more intensely. Open source is of course free and also helps organizations avoid vendor lock-in.

All these benefits translate into increased efficiency and explain the high adoption rate of open source across organizations looking to speed up time to market. In a recent study by Tidelift, 68% of respondents pointed to saving money and development time as the top reason their organization encourages the use of open source for application development. 48% cited increased efficiency of application development and maintenance as the reason. Open source usage was peaking well before COVID-19, but the pandemic accelerated adoption rates. Gartner now estimates that 90% of organizations rely on open source in their applications today.

Looking forward into 2021, there is little reason to believe this trend will slow down. Snyk’s 2020 State of Open Source Security report showed that open source ecosystems continue to expand, led by npm, which grew over 33% in 2019 and spanned over 1,300,000 packages at the time the report was published. If GitHub’s 2020 State of the Octoverse report is any indication, we can expect more open source contributions driving more open source projects. The report shows “an increase in developer connection and camaraderie through open source”, demonstrated by faster overall merge rates for pull requests in open source projects and a 25% uptick in open source project creation.

Modern software supply chains

Open source is just one piece of the puzzle comprising the modern, cloud native application. Applications today are more assembled than they are built. In addition to open source packages, they are assembled from proprietary code, containers, and infrastructure as code, to name just a few of the building blocks used as part of this new software supply chain, each of which is a potential entry point for malicious actors.

A vulnerability exploited in one part of the supply chain can be used to infect the entire application, thus expanding the attack surface requiring protection. In the case of the Octopus Scanner malware, for example, GitHub discovered malware designed to enumerate and backdoor projects built with Apache’s open source NetBeans IDE. The method of attack here (affecting the supply chain by abusing the build process and causing its resulting artifacts to spread, with affected projects likely to get cloned, forked and used by many different systems) is what made this attack interesting but, sadly, not unique. The recent SolarWinds attack, this time targeting proprietary software, further demonstrates the growing risk the modern software supply chain poses for organizations.

Open does not mean secure

Open source projects are considered to be safer to use. After all, when there’s an entire community involved in maintaining and developing a project, issues are identified and fixed more quickly. This includes bugs, of course, but also security vulnerabilities. Having said that, this does not mean that open source is without risk. In fact, one may argue that the very reason open source code is often considered more secure is also a chink in the armor.

By definition, open source projects are public and visible to all, malicious actors included. Any vulnerability discovered and fixed in them is implicitly exposed for attackers to find. The more popular the open source project, the more attractive it is as a target, as the impact of an attack is wider. Going back to the Equifax breach mentioned above as an example, the open source package used for the attack, the Apache Struts Java library, is used by a huge number of applications, making the attack notorious for its wide blast radius.

Of course, organizations consuming open source do so “at their own risk”, as there is no vendor to notify them about flaws, or a signed contract that lets them shed the responsibility. The responsibility for keeping these components secure sits entirely with the consumer. 

5 Software Composition Analysis (SCA) challenges

As defined above, SCA is an umbrella term for application security methodologies and tools that scan applications, typically during development, to map the open source components being used in an application and subsequently identify the security vulnerabilities and software license issues they introduce. To successfully manage and mitigate the risk posed by these open source components, organizations deploying SCA methodologies and tools face a series of challenges related to the way in which open source is leveraged to build modern, cloud native applications.

1. Obscured visibility

The manner in which open source code is embedded into an application’s code base poses a huge visibility challenge. A developer might directly include a number of open source packages in their code, but those packages, in turn, rely on additional open source packages that the developer did not necessarily know about. These indirect, or transitive, dependencies can go several layers deep, making it extremely challenging to gain end-to-end visibility into what open source is actually being used by an application.

Exacerbating this challenge is the fact that the vast majority of security vulnerabilities are actually found in these transitive dependencies. The Snyk 2020 State of Open Source Security report found that an overwhelming 86% of Node.js vulnerabilities are discovered in transitive dependencies. Similar numbers were found for Java and Ruby. What this means is that the vast majority of security vulnerabilities in applications are usually going to be found in open source code that developers are not even aware they are using in the first place.
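
To make the idea of transitive dependencies concrete, here is a minimal Python sketch that walks a dependency graph and separates the packages a developer added directly from those pulled in indirectly. The graph and all package names are invented for illustration; a real SCA tool would build the graph from the project's manifest and lock files.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine", "body-parser"],
    "http-client": ["url-utils"],
    "template-engine": ["string-escape"],
    "body-parser": [],
    "url-utils": [],
    "string-escape": [],
}


def dependency_depths(graph, root):
    """Breadth-first walk returning {package: depth}; depth 1 is a direct dependency."""
    depths = {}
    queue = deque((dep, 1) for dep in graph.get(root, []))
    while queue:
        package, depth = queue.popleft()
        if package in depths:
            continue  # already reached through a path that is at least as short
        depths[package] = depth
        for child in graph.get(package, []):
            queue.append((child, depth + 1))
    return depths


depths = dependency_depths(DEPENDENCY_GRAPH, "my-app")
direct = sorted(p for p, d in depths.items() if d == 1)
transitive = sorted(p for p, d in depths.items() if d > 1)
print("direct:", direct)
print("transitive:", transitive)
```

In this toy graph the developer chose two packages but ships six, and the deepest one sits three levels down.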

Cloud native applications leverage open source in another way that can pose a visibility challenge for organizations, as one or more layers building up a container. Container images can consist of various open source components which also need to be identified and tested for vulnerabilities. The abstraction layer that containers provide developers with, an advantage from a development perspective, is also a weakness from a security perspective. 

2. Understanding the dependency logic

To accurately identify the dependencies an application is using, as well as the vulnerabilities they introduce, a deep understanding of how each ecosystem handles dependencies is required. Package resolution during installation, lock files, and development dependencies are all examples of factors that affect how vulnerabilities in open source packages are identified and that determine subsequent remediation steps. It is important that an SCA solution understands these nuances to avoid creating too much noise with false positives.
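
As a rough illustration of why these nuances matter, the Python sketch below checks the exact versions recorded in a lock file against a list of advisories, and skips development-only packages by default. The lock-file entries, the advisory format and the `fixed_in` field are all assumptions made for this example, and versions are assumed to be plain dotted numbers; real ecosystems have richer version schemes.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical lock-file entries: the exact versions actually installed.
LOCKED = [
    {"name": "string-escape", "version": "1.2.1", "dev": False},
    {"name": "url-utils", "version": "2.10.0", "dev": False},
    {"name": "test-runner", "version": "0.9.0", "dev": True},
]

# Hypothetical advisories: versions below `fixed_in` are affected.
ADVISORIES = [
    {"name": "string-escape", "fixed_in": "1.2.3"},
    {"name": "url-utils", "fixed_in": "2.9.0"},
    {"name": "test-runner", "fixed_in": "1.0.0"},
]


def find_vulnerable(locked, advisories, include_dev=False):
    """Return names of locked packages whose installed version predates the fix."""
    fixed = {a["name"]: parse_version(a["fixed_in"]) for a in advisories}
    hits = []
    for entry in locked:
        if entry["dev"] and not include_dev:
            continue  # development-only packages do not ship to production
        fixed_version = fixed.get(entry["name"])
        if fixed_version and parse_version(entry["version"]) < fixed_version:
            hits.append(entry["name"])
    return hits


production_hits = find_vulnerable(LOCKED, ADVISORIES)
all_hits = find_vulnerable(LOCKED, ADVISORIES, include_dev=True)
print(production_hits)
print(all_hits)
```

Note that a naive string comparison would flag url-utils 2.10.0 as older than 2.9.0, which is exactly the kind of false positive the text warns about.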

3. Drowning in vulnerabilities

Visibility into vulnerabilities and the risk they pose to the organization is obscured by the sheer number of vulnerabilities identified. Referring to the Snyk 2020 State of Open Source Security report again, “the overall number of vulnerabilities reported across all ecosystems increased in 2019 after having shown a decrease in 2018.” The Snyk Intel vulnerability database added more than 10,000 vulnerabilities, also reflecting this continuous rise in the number of vulnerabilities.

What does this mean for organizations? Ultimately, these rising trends trickle into their vulnerability backlogs, that is, the lists of identified vulnerabilities requiring attention, which will often consist of thousands of issues. Given the limited resources development and security teams have at their disposal, it is extremely difficult to prioritize efforts without the right security skillset or tools that have advanced security expertise embedded into them. CVSS-based severity scoring is the common method for assessing risk and prioritizing efforts, but it has a few inherent weaknesses that make it difficult to use.

4. Find me a vulnerability database

Information on known vulnerabilities is scattered across various data sources. The National Vulnerability Database (NVD) is commonly used for receiving updates on vulnerabilities, but a substantial amount of security intelligence is available in other sources such as issue trackers, online forums, security newsletters, and more. NVD might also not add vulnerabilities in a timely enough fashion: 92% of the JavaScript vulnerabilities in NVD, for example, were added to Snyk beforehand. This lag can be crucial considering the need to keep exposure windows as short as possible. Knowing about a vulnerability in time can make all the difference.

5. The need for speed

With developers moving at the speed of light, security teams are finding it hard to catch up. Pressed to deliver code more rapidly and more frequently, developers are increasingly adopting open source. Security teams, short on manpower and resources, have traditionally tried to put security checks in place at various stages of the software development lifecycle, but this has actually resulted in slowing down development. In other cases, perhaps more detrimental to an organization’s overall application security program, these checks end up getting bypassed or ignored.

This has given rise to the notion of DevSecOps and Shifting Left in the security model—moving responsibility for security into the development teams to ensure minimum disruption to development workflows while also ensuring security. A new breed of SCA solutions was designed with this principle in mind, enabling the implementation of open source security testing early on in the development process. A developer-first approach, such as the one employed by Snyk, complements shift-left by ensuring developer adoption. More on that later.

How to choose a Software Composition Analysis (SCA) tool

SCA tools are not created equal and come in many different shapes and forms. With a market full of different vendors, it’s easy to be overwhelmed. Based on the challenges described above, below is a list of key requirements any organization should consider when deciding what solution to deploy.

1. Developer-friendliness

The days when security simply handed over a list of vulnerabilities for developers to fix are long gone. That type of siloed approach is no longer tenable. DevSecOps calls for developers to take more ownership of security, but that cannot be achieved if the tools they use work against them instead of for them.

Developer adoption is key. If an SCA tool is too difficult to use or hampers development, it will not be used by developers and will not be of much use. An SCA tool, therefore, needs to:

  • be intuitive and easy to set up and use – Developers will appreciate tools that work the same way their existing tools work.
  • integrate easily with existing development workflows – A tool that is just a click or two away from integrating into a Git-based workflow, or that can be easily plugged into an IDE or CLI, will go a long way in driving developer adoption.
  • provide automated and actionable advice – An SCA tool that not only surfaces issues but also guides developers on the path to remediation with automated and actionable fix advice empowers developers to take action. Snyk’s automated fix pull requests are a good example of this.

2. Ecosystem support & integrations

Without the ability to cover the languages being used to build your applications or fit into your development environment, an SCA tool is not going to be very helpful, right? This sounds like a pretty basic requirement but should not be taken for granted. Some SCA solutions might provide full language coverage but will not provide a Jenkins plugin to enable you to easily add application security testing as a step in your build process. Others might not provide IDE plugins which enable you to shift security far left in the SDLC.

For language support, verify that an SCA tool is able to provide security coverage for the core programming languages and frameworks used to build your applications. While this naturally varies from team to team, cloud native applications will most likely require coverage for: Java, JavaScript, Python, Go, .NET and Ruby.

Tip! Some of the languages in your technology stack might not need to be prioritized initially. Try evaluating an SCA tool based on the support it provides for the core languages you are using.  

For integrations, it is not only about breadth and being able to integrate across the SDLC, but also about depth. Be sure that the SCA tool provides an integration that is both easy to set up and delivers the results you expect. We will discuss APIs later, but the availability of a robust API is a big advantage here.

3. Dependency analysis

As reported in the Snyk 2020 State of Open Source Security report, 80% of vulnerabilities in open source packages are identified in transitive dependencies! 

This means that the vast majority of vulnerabilities in your code base are introduced by dependencies you had no idea you were using in the first place. These transitive dependencies can go several layers deep, making it extremely challenging to gain end-to-end visibility into what open source is actually being used by an application. 

Not only that, to accurately identify the dependencies an application is using, as well as the vulnerabilities they introduce, a deep understanding of how each ecosystem handles dependencies is required. Package resolution during installation, lock files, development dependencies—all these are examples of factors that affect how vulnerabilities in open source packages are identified and will determine subsequent remediation steps. 

When evaluating an SCA tool, it is important to verify that it can accurately interpret all the dependencies in an application to provide practitioners with full visibility.

4. Vulnerability detection

An SCA tool must be able to accurately detect whether an open source package contains vulnerabilities or not. This depends, as discussed above, on the ability of the tool to understand the dependency logic, but just as importantly, on the security data the tool relies on. 

It is here that the differences between the SCA tools come more to light. Some SCA tools will rely solely on public databases, such as NVD. Others might augment public databases with additional publicly available vulnerability information. Some SCA tools—Snyk included—maintain a security team that combines multiple sources into one vulnerability database, continuously updated and enriched using various advanced analysis processes. Even there, there are nuances from solution to solution, pertaining to the quality of the database and the accuracy and comprehensiveness of the intelligence it provides. 

Generally speaking, when evaluating the vulnerability data an SCA tool provides, we recommend considering the following:

  • Accuracy – False positives are unavoidable, but a high rate of false positives will waste resources and hamper developer adoption.
  • Comprehensiveness – Relying on NVD alone is not enough. Be sure to select an SCA tool that combines public, private and internal security data sources.
  • Timeliness – The speed at which newly disclosed vulnerabilities are added into a database is critical. Can you afford not to identify when a package you are using contains a zero-day vulnerability?
  • Actionability – Select an SCA tool that provides rich and contextual information on vulnerabilities to help development teams take action.

5. Prioritization

The number of vulnerabilities in open source components is constantly on the rise, with thousands of new vulnerabilities disclosed every year. SCA tools will often identify hundreds if not thousands of vulnerabilities that quickly pile up into backlogs that can easily overwhelm teams. 

Figure: Snyk’s vulnerability prioritization offers a priority score and exploit maturity.

Since you cannot, realistically, fix all the vulnerabilities on the list, you need to decide which vulnerabilities offer the best return for time invested. These decisions will have a major impact on your effort to manage and reduce risk. Bad prioritization, leading to time wasted on false positives, can cause friction and reduce developer trust, which, as noted above, is critical for DevSecOps and scaling security.

At the very least, make sure that an SCA tool can help you prioritize well by:

  • going beyond CVSS scoring for risk assessment – CVSS is a good starting point but that’s about it. It lacks context and is often difficult to understand and use.
  • providing deep application-level and business-level context on vulnerabilities – Context is king. Can the SCA tool inform you whether a vulnerability is reachable or not? Can it tell you whether it is exploitable? Does it help you understand whether the vulnerability is in a component being used in production and by a mission-critical application?
  • automating prioritization – At scale, manually prioritizing is simply not possible. SCA tools that help you automate processes across projects and teams with policies should be considered over those that don’t.
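
The following Python sketch shows one way context could be layered on top of CVSS when ranking findings. It is purely illustrative: the `Finding` fields, the weights and the formula are assumptions made for this example and do not represent Snyk's priority score or any other vendor's algorithm.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    package: str
    cvss: float            # CVSS base score, 0.0 to 10.0
    reachable: bool        # is the vulnerable code called by the application?
    exploit_mature: bool   # is a working exploit publicly known?
    in_production: bool    # does the component ship in a production service?


def priority_score(finding):
    """Illustrative score: CVSS as the base, scaled by context (weights are arbitrary)."""
    score = finding.cvss * 10            # 0 to 100 from CVSS alone
    if not finding.reachable:
        score *= 0.5                     # unreachable code is less urgent
    if finding.exploit_mature:
        score *= 1.5                     # a known exploit raises urgency
    if not finding.in_production:
        score *= 0.5                     # non-production components matter less
    return round(min(score, 100.0), 1)


findings = [
    Finding("string-escape", cvss=9.8, reachable=False, exploit_mature=False, in_production=True),
    Finding("url-utils", cvss=6.5, reachable=True, exploit_mature=True, in_production=True),
    Finding("test-runner", cvss=7.5, reachable=True, exploit_mature=False, in_production=False),
]

ranked = sorted(findings, key=priority_score, reverse=True)
for finding in ranked:
    print(finding.package, priority_score(finding))
```

With these made-up weights, the finding with the highest CVSS score no longer tops the list, because it is not reachable while a lower-scored one is reachable and has a mature exploit.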

6. Remediation

The vast majority of SCA tools provide the ability to identify security vulnerabilities in open source packages. Some go beyond this to support the next logical step, the remediation of vulnerabilities, but remediation capabilities vary from tool to tool.

Consider remediation advice as an example. It is one thing to suggest an upgrade for a dependency to a version fixing the vulnerability in question. It is another to calculate the minimal upgrade path so as not to risk breakage. And it is entirely a different thing to automatically trigger a pull request when a new vulnerability is identified with the recommended fix. 
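
As a simplified sketch of what calculating a minimal upgrade might look like, the Python function below picks the lowest published version that contains the fix and flags whether the move crosses a major version. It assumes a single `fixed_in` version and plain dotted version numbers; the function names and data are hypothetical, and real tools also have to account for fixes backported to several release lines.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def minimal_fix_version(current, available, fixed_in):
    """Return the lowest available version containing the fix, or None if there is none."""
    current_v = parse_version(current)
    fixed_v = parse_version(fixed_in)
    if current_v >= fixed_v:
        return current  # already on a fixed version, nothing to upgrade
    candidates = [v for v in available if parse_version(v) >= fixed_v]
    if not candidates:
        return None
    return min(candidates, key=parse_version)


def is_major_upgrade(current, target):
    """A change in the first version component is more likely to break callers."""
    return parse_version(target)[0] > parse_version(current)[0]


# Hypothetical list of published versions for one package.
AVAILABLE = ["1.2.1", "1.2.2", "1.2.3", "1.3.0", "2.0.0"]

target = minimal_fix_version("1.2.1", AVAILABLE, fixed_in="1.2.3")
print("upgrade to:", target)
print("major upgrade:", is_major_upgrade("1.2.1", target))
```

Choosing 1.2.3 over 2.0.0 here fixes the issue while staying within the same major version, which is the intent behind a minimal upgrade path.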

When evaluating, dive deeper into the remediation advice the SCA tool provides around a vulnerability and the workflows it supports to drive actionability. Is there enough information available for understanding where and how to apply a fix? Are automated workflows available? 

7. Governance & control

Does the SCA tool give you the control you need to govern the use of open source in your applications? At the very least, ensure the SCA tool provides granular policies for defining and automatically enforcing the security and compliance guidelines accepted by your organization.
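
A policy of this kind can be thought of as data plus a small evaluation step. The Python sketch below is one hypothetical shape for it: the policy keys, the component records and the choice of denied licenses are all invented for illustration, and each organization would define its own rules.

```python
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}

# Hypothetical organization policy.
POLICY = {
    "max_allowed_severity": "medium",  # anything more severe is a violation
    "denied_licenses": {"GPL-3.0-only", "AGPL-3.0-only"},
}

# Hypothetical scan results for three components.
COMPONENTS = [
    {"name": "string-escape", "license": "MIT", "severities": ["high"]},
    {"name": "url-utils", "license": "Apache-2.0", "severities": ["low"]},
    {"name": "pdf-render", "license": "AGPL-3.0-only", "severities": []},
]


def evaluate_policy(components, policy):
    """Return a list of human-readable policy violations."""
    limit = SEVERITY_RANK[policy["max_allowed_severity"]]
    violations = []
    for component in components:
        if component["license"] in policy["denied_licenses"]:
            violations.append(
                f"{component['name']}: license {component['license']} is not allowed"
            )
        for severity in component["severities"]:
            if SEVERITY_RANK[severity] > limit:
                violations.append(
                    f"{component['name']}: {severity} severity vulnerability"
                )
    return violations


violations = evaluate_policy(COMPONENTS, POLICY)
for violation in violations:
    print(violation)
```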

8. Reporting

Being able to keep track over time of the various open source packages being used across the organization, including the open source licenses they carry, is important for various reasons and to different business stakeholders. For example, security leaders will want to measure the success of SCA processes over time by answering how many vulnerabilities were identified and how many were remediated. Compliance and legal offices will most likely be interested in generating a BoM report: an inventory of all the open source dependencies and licenses that impact the compliance posture of the organization. Verify that an SCA tool provides you with the oversight needed to track your posture over time and enables you to generate, and share, a BoM report on your open source inventory.
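
To give a feel for the two kinds of report mentioned here, the Python sketch below groups a component inventory by license and computes a simple remediation rate. The inventory, the field names and the numbers are made up for this example; real BoM reports typically follow a standard format rather than an ad hoc dictionary.

```python
from collections import defaultdict

# Hypothetical component inventory produced by a scan.
COMPONENTS = [
    {"name": "web-framework", "version": "4.1.0", "license": "MIT"},
    {"name": "string-escape", "version": "1.2.1", "license": "MIT"},
    {"name": "url-utils", "version": "2.10.0", "license": "Apache-2.0"},
]


def bom_by_license(components):
    """Group 'name@version' entries under their license, sorted for stable output."""
    grouped = defaultdict(list)
    for component in components:
        grouped[component["license"]].append(
            f"{component['name']}@{component['version']}"
        )
    return {license_id: sorted(items) for license_id, items in sorted(grouped.items())}


def remediation_rate(identified, remediated):
    """Percentage of identified vulnerabilities that have been remediated."""
    if identified == 0:
        return 0.0
    return round(100 * remediated / identified, 1)


bom = bom_by_license(COMPONENTS)
for license_id, items in bom.items():
    print(license_id, items)
print("remediated:", remediation_rate(identified=40, remediated=30), "%")
```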

9. Automation & extensibility

The larger you grow, the more challenging it is to perform all the manual operations involved in SCA processes. The ability to automate tasks such as adding new projects and users to be tested, or scanning new builds as part of your CI/CD pipelines, drives efficiency but also helps reduce friction with existing development workflows—a key ingredient for successful DevSecOps. 

One key requirement to consider is the existence of a robust API that enables the automation, customization and integration of SCA processes into your existing workflows and systems. Do you already have systems set up for security monitoring? An API should enable you to integrate results into these systems. Are you using build tools as part of your continuous delivery process? Make sure the SCA tool provides plugins for automating security testing as part of the process.
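
As an example of the kind of automation an API or machine-readable output makes possible, the Python sketch below turns scan results into a pass/fail decision for a build step. The JSON shape, the field names and the `fail_on` threshold are assumptions made for illustration and do not correspond to any specific tool's output.

```python
import json

SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def gate_exit_code(results_json, fail_on="high"):
    """Return 1 if any finding is at or above `fail_on`, else 0.

    CI systems conventionally treat a nonzero exit code as a failed step.
    """
    threshold = SEVERITY_RANK[fail_on]
    findings = json.loads(results_json)["vulnerabilities"]
    blocking = [f for f in findings if SEVERITY_RANK[f["severity"]] >= threshold]
    for finding in blocking:
        print(f"BLOCKING: {finding['package']} ({finding['severity']})")
    return 1 if blocking else 0


# Hypothetical scan output; the JSON shape is invented for this example.
SCAN_OUTPUT = json.dumps({
    "vulnerabilities": [
        {"package": "string-escape", "severity": "critical"},
        {"package": "url-utils", "severity": "low"},
    ]
})

exit_code = gate_exit_code(SCAN_OUTPUT)
print("exit code:", exit_code)
```

In a pipeline, a wrapper script would pass this value to `sys.exit` so that the build stops when blocking findings are present.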

10. Cloud native application security

Modern applications are composed of multiple components that all need to be scanned and secured to provide end-to-end security coverage. In addition to open source dependencies, containers, infrastructure as code and proprietary code are the building blocks used to assemble applications today.

For container security, an SCA tool should at the very least be able to scan container images for security vulnerabilities and integrate into the workflows, tools and systems used to build, test and run them. More advanced solutions provide remediation steps as well. Snyk, for example, identifies the container base image and recommends upgrades that will reduce vulnerabilities. It also surfaces the container build commands and dependencies that introduce vulnerabilities, to simplify and speed up container remediation.

Software Composition Analysis (SCA) best practices

Selecting an SCA tool that answers the key requirements listed in the previous section is a great first step in successfully managing and mitigating the risk posed by the open source components used by applications. However, the manner in which the tool is implemented can also have a big impact, so here are a few guidelines to consider when deploying SCA.

Enable 

At the end of the day, developers are the ones applying the fix to an identified issue, and so they are the key to a successful deployment of an SCA methodology or tool. But they cannot be expected to assume full responsibility without help. A developer-first approach, such as the one Snyk advocates for and provides in its solution, empowers developers to take more ownership of security in two key ways: by providing them with developer-friendly tooling, and by giving them continued guidance and support from the security team. It is not enough to hand developers an SCA tool; an organization should be committed to enabling developers to use it through training and by continuously monitoring results to drive improvement.

Shift left

The old model by which developers are presented with a list of issues before a new build goes into production is no longer an option. Organizations are looking to speed up delivery pipelines, not slow them down. Identifying a vulnerability late in the software development lifecycle is simply too costly, and so the earlier you can deploy SCA in the process, the better. This starts with the individual developer and their local development environment: deploy SCA as early as the Integrated Development Environment (IDE), and use CLI tooling, if provided by the SCA tool, to surface and fix issues early on.

Automate

Automation offers a number of benefits, first and foremost, enabling organizations to speed up processes that would otherwise take up too much time. The larger the organization, the more acute this is. There are various ways in which SCA can be automated, and as already mentioned, a robust API is a key requirement for facilitating this. Automated testing of applications as part of the CI/CD process is the most common best practice. For governance and control, automated policies are a good way of automatically enforcing accepted security and legal boundaries. Snyk also provides automated remediation workflows, automatically opening fix and upgrade pull requests in SCMs such as GitHub and Bitbucket.

Prioritize

Organizations cannot fix all the vulnerabilities identified in their applications. Nor should they. Some of the issues found are likely not urgent or important. Some are false positives. To maximize the effectiveness of SCA, organizations should establish a prioritization strategy. Based on the capabilities provided by the SCA tool in use, organizations can decide to focus efforts on high-risk issues first, or only on those issues that have an available fix. Whatever security risk management approach is selected, it should be communicated properly to practitioners and supported with automated policies for standardization across projects and teams.

The future of Software Composition Analysis (SCA)

According to the 2020 Modern Application Development Security report by ESG, less than half of development teams currently utilize open source security testing tools. Only 38% of organizations are using SCA as an application security methodology. While this statistic is also validated by the Gartner SCA Market Guide cited above, this same document also reported a 40% increase in the number of end-user inquiries on SCA.

Given the growing adoption of open source, together with the publicity of recent breaches and cyber attacks, this interest will likely rise in 2021. The role open source is playing in fueling digital transformation is becoming increasingly apparent and there is little to no reason to assume that these trends will change any time soon. 

Organizations are using open source to help them better compete in their respective markets while at the same time there is a growing understanding that they must control this usage by managing and mitigating the accompanying risks. Only SCA tools that answer the key requirements listed above will help organizations successfully achieve this goal.


About Snyk Open Source

Snyk Open Source helps organizations like Salesforce, Google and Facebook enhance application security by enabling development teams to automatically find, prioritize and fix security vulnerabilities and license issues in their open source dependencies and containers early in, and across, the SDLC.  Unlike other security solutions in the market, Snyk Open Source is a developer-friendly tool that integrates seamlessly into development workflows, providing automated remediation and actionable security insight to help organizations identify and mitigate risk efficiently. 
