Enhancing security testing for Go projects using DepGraphs
August 26, 2020
We’re happy to announce vastly improved performance of security testing for Go projects via the Snyk CLI, in some cases cutting scan time by more than 90%! This improvement, soon to be introduced for additional languages, was made possible by changes to our scanning method that enable Snyk to handle huge projects, even ones the size of Kubernetes (more on this below).
Dependencies, dependencies, and more dependencies
Applications and projects come in different shapes and forms, but they all have one common denominator: they include open source dependencies. As a member of Snyk’s Language team, I can also confidently claim that projects that start small, with only a handful of dependencies, quickly grow into larger projects with tens, if not hundreds, of direct and transitive dependencies.
The number of dependencies in a project can have a direct impact on scan performance. When some of our users complained of sluggish performance when scanning or monitoring their Go projects, we started to look for a long-lasting solution that would scale together with our users and their projects. The solution we eventually chose was to migrate our application’s data structure from dependencyTrees to dependencyGraphs.
DepTrees vs. DepGraphs: enhancing security testing
When Snyk scans a project’s manifest file, it builds a dependencyTree that lists all the open source dependencies the project uses, both direct and transitive. While good enough for small projects, this method proved problematic for larger ones: the dependencyTrees became too large and ended up with a costly memory footprint.
For the sake of illustration and perspective, let’s take a look at a basic example, using a dependencyTree to handle a small Go project.
Basic example of using a dependency tree to handle a Go project
The resolved dependencies for this small Go project, using dependencyTree, look as follows:
Even with only three direct open source dependencies, the transitive dependencies they pull in result in an 11MB dependencyTree. Imagine a project with more than 100 direct dependencies. The resulting dependencyTree would be huge, and would most likely slow down scans or even crash a CLI scan with an OutOfMemory error.
The data model provided by dependencyGraph solves this issue. With the help of vertices and edges, we no longer have to include the same dependency multiple times: each package is stored once as a vertex, and every "depends on" relationship is an edge pointing to it. This results in huge memory savings, as seen in the image below:
For the same basic project we saw before, we are now looking at a 51KB dependencyGraph. That is more than 200 times smaller than the 11MB tree!
Boosting performance for... Kubernetes security scanning
Migrating from dependencyTree to dependencyGraph had a major impact, eliminating the sluggishness users had previously experienced with Go projects.
One interesting success story was that of a project some of our readers may have heard of before—Kubernetes.
We had encountered some difficulties when attempting to scan the Kubernetes project for vulnerabilities using Snyk, a direct result of the previous dependencyTree data model and the sheer size of the project.
The new dependencyGraph helped solve these issues. Testing the main Kubernetes repository via the Snyk CLI is now fast and results in a 1.1MB file, a totally acceptable size for such a large and multi-layered project.
What’s next?
Understanding that performance at scale is a key concern for our users, we are committed to continuously improving our language and ecosystem support. The migration to dependencyGraph is one such example, and the feedback from our users following this change has been extremely positive.
As mentioned, these changes have recently been applied to the Snyk CLI for Go projects. Java users will be pleased to know that we’ve also migrated Java Gradle projects to dependencyGraph, again in the CLI only.
The even better news is that we are planning to speed up the migration to the new data structure: first for additional ecosystems via the CLI (npm, yarn, maven, sbt), and then outside the CLI, for our Git-based integrations.
More news on this soon, so stay tuned and, even more important, stay secure!