Tackling the new npm@3 dependency tree
Until recently, Snyk’s CLI tool only supported npm@2. That changed when we released a new version of the CLI with full support for the new npm@3 directory structures.
We wanted to share some of the technical challenges involved and the new tooling that came out of the process.
What’s different about npm@3
With npm@2, node dependencies would be installed into the node_modules directory of each respective node package. For example, the directories for the request module look like this:
request/node_modules
├── aws-sign2
├── aws4
│   └── node_modules
│       └── lru-cache
│           └── test
├── bl
│   ├── node_modules
│   │   └── readable-stream
│   │       ├── doc
│   │       │   └── wg-meetings
│   │       └── node_modules
│   │           ├── core-util-is
...snipped
As you can see from the snippet above, node_modules already appears a number of times. This isn’t so bad in itself, but it can lead to a lot of duplication. Popular utilities like lodash can appear in a project many, many times!
npm@2 does do some work to de-duplicate this, but npm@3 goes much further and flattens this directory structure as far as it can in a bid to remove the duplication. The same request package looks like this with npm@3:
request/node_modules
├── ansi-regex
├── ansi-styles
├── asn1
│   └── lib
├── assert-plus
├── async
│   └── lib
├── aws-sign2
├── aws4
├── bl
│   └── test
...snipped
As you can see, it’s completely different, but importantly the way node requires modules is not affected at all.
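To see why flattening is safe, here is a minimal sketch of the directories Node searches for a bare require(), nearest first. The function name lookupDirs is hypothetical, and the sketch ignores NODE_PATH and the global folders. It only illustrates the walk up the directory tree that lets a package installed higher up still be found.

```javascript
const path = require('path');

// Candidate node_modules directories checked for a bare require(),
// nearest first. Simplified: ignores NODE_PATH and global folders.
function lookupDirs(fromDir) {
  const dirs = [];
  let dir = path.posix.resolve(fromDir);
  while (true) {
    // Node never looks inside node_modules/node_modules.
    if (path.posix.basename(dir) !== 'node_modules') {
      dirs.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return dirs;
}
```

For a file inside /app/node_modules/request, the nested request/node_modules is tried first and /app/node_modules second, so a dependency resolves the same way whether it is installed nested or one level up.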
How does this affect Snyk?
To start with, the CLI package walking logic needed a complete rewrite. Originally Snyk would walk your node_modules directory, then iterate through each sub-directory and build up a tree representation of your packages. Relatively simple, really.
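As a rough illustration of that original approach (not Snyk’s actual code), a naive walker can simply mirror the directory nesting. The function name walk is hypothetical, and the sketch skips details such as scoped packages and reading package.json.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively read <dir>/node_modules and nest each package's own
// node_modules beneath it, mirroring the layout on disk.
// Simplified: scoped packages (@scope/name) are treated as one entry.
function walk(dir) {
  const modulesDir = path.join(dir, 'node_modules');
  const tree = {};
  if (!fs.existsSync(modulesDir)) return tree;
  for (const name of fs.readdirSync(modulesDir).sort()) {
    const pkgDir = path.join(modulesDir, name);
    // Skip hidden entries such as .bin, and anything that is not a directory.
    if (name.startsWith('.') || !fs.statSync(pkgDir).isDirectory()) continue;
    tree[name] = walk(pkgDir);
  }
  return tree;
}
```

With an npm@2 layout the nesting of the result matches the dependency relationships. With a flat layout the same walker returns a flat object, which is why directory nesting alone is no longer enough.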
Except now, the flat directory structure with npm@3 does not represent your package relationships at all. For example, the async package in the npm@3 listing above is actually a dependency of the form-data package, which in turn is a dependency of request. But you can’t see that from the file tree.
So Snyk’s package resolution has been completely rewritten and extracted out into a standalone module called snyk-resolve-deps (open source under an Apache 2 license).
This module is used inside the snyk CLI tool but can also be installed as a standalone CLI tool: installing with npm install -g snyk-resolve-deps gives a utility called snyk-resolve.
snyk-resolve-deps works in two stages. It first passes through the entire directory structure, building up the physical tree. This physical tree is then passed to the next stage, which creates a logical tree: the structure that represents where packages can be loaded from.
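The two stages can be sketched with an in-memory model. This is an illustration under stated assumptions, not the snyk-resolve-deps implementation or API: the Map shape, the simplified deps lists, and the names locate and logicalTree are all hypothetical.

```javascript
// Hypothetical physical trees: install path -> package info.
// `flat` mimics an npm@3 layout, `nested` an npm@2 layout.
const flat = new Map([
  ['', { name: 'request', deps: ['form-data'] }],
  ['node_modules/form-data', { name: 'form-data', deps: ['async'] }],
  ['node_modules/async', { name: 'async', deps: [] }],
]);
const nested = new Map([
  ['', { name: 'request', deps: ['form-data'] }],
  ['node_modules/form-data', { name: 'form-data', deps: ['async'] }],
  ['node_modules/form-data/node_modules/async', { name: 'async', deps: [] }],
]);

// Find where `dep` would load from when required by the package at
// `fromPath`: nearest node_modules first, then each ancestor's.
function locate(physical, fromPath, dep) {
  let dir = fromPath;
  while (true) {
    const candidate = (dir ? dir + '/' : '') + 'node_modules/' + dep;
    if (physical.has(candidate)) return candidate;
    if (dir === '') return null; // not installed anywhere reachable
    dir = dir.slice(0, dir.lastIndexOf('node_modules/')).replace(/\/$/, '');
  }
}

// Build the logical tree by following each package's declared dependencies.
function logicalTree(physical, fromPath = '', seen = []) {
  const pkg = physical.get(fromPath);
  const node = { name: pkg.name, dependencies: {} };
  for (const dep of pkg.deps) {
    const at = locate(physical, fromPath, dep);
    // Skip missing packages and guard against circular dependencies.
    if (at === null || seen.includes(at)) continue;
    node.dependencies[dep] = logicalTree(physical, at, seen.concat(at));
  }
  return node;
}
```

In this sketch, logicalTree(flat) and logicalTree(nested) return the same structure (request, then form-data, then async), because the relationships come from each package’s declared dependencies rather than from where the directories happen to sit.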
This means that both npm@2 and npm@3 directory structures are supported, and both produce a logical tree looking like this:
❯ snyk-resolve
<name>@<version>
├── <name>@<version>
├─┬ <name>@<version>
│ └── <name>@<version>
├─┬ <name>@<version>
│ └─┬ <name>@<version>
│   ├── <name>@<version>
...snipped
Ultimately, this means Snyk can now support your npm@3-installed projects just as well as npm@2 ones. We’re able to find the correct paths for patching and to report all the vulnerable paths accurately.
How is this different from ‘npm ls’?
If you’re familiar with npm’s tools, you will have heard of npm ls, which is useful for viewing these trees.
The big difference between snyk-resolve and npm ls is that our tool shows the complete logical tree, including all the ways through which a package entered your project.
If we look at how request is included in the npm code (on the 2.x branch), npm ls tells us one story: what’s available on disk.
Our own method of resolving tells a different story. This doesn’t mean that npm ls is wrong; it’s that we want to know exactly what is loading request, and our own snyk-resolve-deps can give us that.
The logical tree shows that the request module is depended upon by many more packages than you might initially think. If that package had a vulnerability, the vulnerable paths are now clearly known to Snyk.
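Listing every route to a package is a small recursive search once a logical tree exists. The sketch below assumes a hypothetical node shape of { name, dependencies } and made-up package names (app, lib-a, lib-b, lib-c). It is not the snyk-resolve-deps API.

```javascript
// Collect every dependency chain in a logical tree that ends at `target`.
function pathsTo(node, target, trail = []) {
  const here = trail.concat(node.name);
  const found = node.name === target ? [here] : [];
  return Object.values(node.dependencies).reduce(
    (all, child) => all.concat(pathsTo(child, target, here)),
    found
  );
}

// Hypothetical project in which `request` arrives by three routes.
const leaf = () => ({ name: 'request', dependencies: {} });
const tree = {
  name: 'app',
  dependencies: {
    request: leaf(),
    'lib-a': { name: 'lib-a', dependencies: { request: leaf() } },
    'lib-b': {
      name: 'lib-b',
      dependencies: {
        'lib-c': { name: 'lib-c', dependencies: { request: leaf() } },
      },
    },
  },
};

for (const p of pathsTo(tree, 'request')) console.log(p.join(' > '));
// app > request
// app > lib-a > request
// app > lib-b > lib-c > request
```

Each printed chain is one vulnerable path if request were the affected package.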
The package snyk-resolve-deps is available today, under an Apache 2 open source license. Once installed globally, it’s available under the alias of snyk-resolve.
The CLI utility has a number of filters and flags, including a --disk view (which reports very similarly to npm ls), and --filter X and --count X to filter and find occurrences of a specific dependency. You can find out more from the utility’s help output.