Tackling the new npm@3 dependency tree

Until recently, Snyk’s CLI tool only supported npm@2. That changed with snyk@1.9.0, which added full support for the new npm@3 directory structure.

We wanted to share some of the technical challenges involved and the new tooling that came out of the process.

What’s different about npm@3

With npm@2, a package’s dependencies were installed into that package’s own node_modules directory. For example, the directories for the request module look like this:

  request/node_modules
  ├── aws-sign2
  ├── aws4
  │   └── node_modules
  │       └── lru-cache
  │           └── test
  ├── bl
  │   ├── node_modules
  │   │   └── readable-stream
  │   │       ├── doc
  │   │       │   └── wg-meetings
  │   │       └── node_modules
  │   │           ├── core-util-is
  ...snipped

As the snippet above shows, node_modules directories already appear a number of times. This isn’t so bad in itself, but it can lead to a lot of duplication. Popular utilities like lodash can appear in a project many, many times!

npm@2 does some work to de-duplicate this, but npm@3 flattens the directory structure as far as it can in a bid to remove the duplication. The same request package looks like this with npm@3:

  request/node_modules
  ├── ansi-regex
  ├── ansi-styles
  ├── asn1
  │   └── lib
  ├── assert-plus
  ├── async
  │   └── lib
  ├── aws-sign2
  ├── aws4
  ├── bl
  │   └── test
  ...snipped

As you can see, it’s completely different, but importantly the way node requires modules is not affected at all.
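
The reason the flat layout still works is that Node searches for a module in the requiring file’s own node_modules directory and then in each parent directory’s node_modules, so a package hoisted to a higher level is still found. Here is a minimal sketch of that lookup order, assuming POSIX-style paths; lookupPaths is a hypothetical helper name and the /app prefix is made up for illustration:

```javascript
// Sketch of the lookup order Node uses for a bare require such as
// require('async'): check node_modules in the requiring file's directory,
// then in each parent directory, up to the filesystem root.
// POSIX-style paths only; `lookupPaths` is a hypothetical helper name.
function lookupPaths(fromDir) {
  const parts = fromDir.split('/').filter(Boolean);
  const paths = [];
  for (let i = parts.length; i >= 0; i--) {
    // Node never looks inside node_modules/node_modules, so skip those.
    if (i > 0 && parts[i - 1] === 'node_modules') continue;
    paths.push('/' + parts.slice(0, i).concat('node_modules').join('/'));
  }
  return paths;
}

// Where would a require() inside form-data look?
console.log(lookupPaths('/app/node_modules/request/node_modules/form-data'));
```

With npm@2 the async package would be found in the first directory on that list; with npm@3 it is found one step further up. The calling code cannot tell the difference.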

How does this affect Snyk?

To start with, the CLI package walking logic needed a complete rewrite. Originally Snyk would walk your node_modules directory, then iterate through each sub-directory and build up a tree representation of your packages. Relatively simple really.

Except now, the flat directory structure produced by npm@3 does not represent your package relationships at all. For example, the async package in the npm@3 listing above is actually a dependency of the form-data package, which in turn is a dependency of request. But you can’t see that from the file tree.

So Snyk’s package resolution has been completely rewritten and extracted out into a standalone module called snyk-resolve-deps (open source under an Apache 2 license).

This module is used inside the snyk CLI tool, but it can also be installed as a standalone CLI tool: npm install -g snyk-resolve-deps gives you a utility called snyk-resolve.

snyk-resolve-deps works in two stages. It first passes through the entire directory structure, building up the physical tree. This physical tree is then passed to the next stage, which creates a logical tree: the structure that represents where packages can be loaded from.

This means that both npm@2 and npm@3 directory structures are supported, and both produce a virtual tree that looks like this:

  ❯ snyk-resolve
  request@2.69.1
  ├── aws-sign2@0.6.0
  ├─┬ aws4@1.2.1
  │ └── lru-cache@2.7.3
  ├─┬ bl@1.0.2
  │ └─┬ readable-stream@2.0.5
  │   ├── core-util-is@1.0.2
  ...snipped

Ultimately this means Snyk can now support your npm@3 installed projects just as well as npm@2 ones. We’re able to find the correct paths for patching and to report all the vulnerable paths accurately.
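
The two-stage idea described above, a physical tree followed by a logical tree, can be sketched in a few lines. This is an illustration under simplifying assumptions, not the actual snyk-resolve-deps implementation: the physical tree is built in memory rather than read from disk, versions are ignored, and the names pkg, locate and logicalTree are hypothetical.

```javascript
// Physical tree: what is on disk. Each node carries the dependency names
// declared in its package.json (`deps`) and the packages installed in its
// own node_modules directory (`installed`). Versions are omitted.
function pkg(name, deps = [], installed = []) {
  const node = { name, deps, installed, parent: null };
  for (const child of installed) child.parent = node;
  return node;
}

// Find where `name` would be loaded from, starting at `node`: look in its
// own node_modules first, then in each ancestor's, as Node's require does.
function locate(node, name) {
  for (let cur = node; cur; cur = cur.parent) {
    const hit = cur.installed.find((p) => p.name === name);
    if (hit) return hit;
  }
  return null; // declared but not installed
}

// Logical tree: follow declared dependencies rather than directories.
// `seen` guards against dependency cycles along the current path.
function logicalTree(node, seen = new Set()) {
  if (seen.has(node)) return { name: node.name, dependencies: {}, cyclic: true };
  const next = new Set(seen).add(node);
  const dependencies = {};
  for (const name of node.deps) {
    const target = locate(node, name);
    dependencies[name] = target ? logicalTree(target, next) : { name, missing: true };
  }
  return { name: node.name, dependencies };
}

// npm@3-style flat layout: async sits beside form-data on disk...
const flat = pkg('request', ['form-data'], [
  pkg('form-data', ['async']),
  pkg('async'),
]);
// ...while an npm@2-style layout nests it under form-data.
const nested = pkg('request', ['form-data'], [
  pkg('form-data', ['async'], [pkg('async')]),
]);

// Both layouts resolve to the same logical tree.
console.log(JSON.stringify(logicalTree(flat), null, 2));
console.log(JSON.stringify(logicalTree(flat)) === JSON.stringify(logicalTree(nested)));
```

In both layouts async ends up under form-data in the logical tree, even though the flat layout stores it as a sibling on disk.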

How is this different from ‘npm ls’?

If you’re familiar with npm’s tools, you will have heard of npm ls, which is useful for viewing these trees.

The big difference between snyk-resolve-deps and npm ls is that our tool shows the complete logical tree, including every route by which a package entered your project.

If we look at how request is included in the npm code (on the 2.x branch), we can see that npm ls tells us one story (of what’s available on disk):

[Image: npm ls output]

Our own method of resolving tells a different story. This doesn’t mean that npm ls is wrong; rather, we want to know exactly what is loading request, and snyk-resolve-deps can give us that:

[Image: npm ls output]

As you can see, the request module is depended upon by many more packages than you might initially think. If that package had a vulnerability, Snyk would now know every vulnerable path.
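
To make "vulnerable paths" concrete, here is a small sketch that lists every route from the project root to a given package in a logical tree. It assumes a tree shaped as { name, dependencies }, which is an illustrative format and not necessarily what snyk-resolve-deps emits. The pathsTo helper and the my-app and example-client package names are hypothetical.

```javascript
// Collect every dependency path from the root that ends at `target`.
// `tree` is a logical tree shaped as { name, dependencies: { [name]: subtree } }.
function pathsTo(tree, target, trail = []) {
  const here = trail.concat(tree.name);
  // The root itself does not count as a path to the target.
  const found = tree.name === target && trail.length > 0 ? [here] : [];
  for (const child of Object.values(tree.dependencies || {})) {
    found.push(...pathsTo(child, target, here));
  }
  return found;
}

// Hypothetical project: request is both a direct dependency and a
// dependency of another (made-up) package.
const project = {
  name: 'my-app',
  dependencies: {
    request: { name: 'request', dependencies: {} },
    'example-client': {
      name: 'example-client',
      dependencies: { request: { name: 'request', dependencies: {} } },
    },
  },
};

for (const p of pathsTo(project, 'request')) console.log(p.join(' > '));
// my-app > request
// my-app > example-client > request
```

Each printed line is one way the package entered the project, which is the information needed to report or patch every affected path.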

snyk-resolve-deps

The package snyk-resolve-deps is available today, under an Apache 2 open source license. Once installed globally, it’s available under the alias of snyk-resolve.

The CLI utility has a number of filters and flags, including a --disk view (which reports much like npm ls) and --filter X and --count X to filter and find occurrences of a specific dependency. You can find out more with snyk-resolve --help.


You can npm install -g snyk today and anonymously test any package or GitHub repo. With a free account, you can also start monitoring your projects for vulnerabilities.
