July 22, 20210 mins read
We’ve been asked to provide a comparison of scan times between Snyk Code and two common SAST tools: LGTM and SonarQube. For our research, we made several assumptions, but we’ve shared the details in order to be transparent.
Static Application Security Testing (SAST) can only be developer-friendly when it provides near real-time feedback and does not delay your development processes. Snyk Code is up to 106 times faster than LGTM. On average, Snyk Code is 5x times faster than SonarQube or 14x times faster than LGTM. In summary, Snyk Code proves to be one of the fastest semantic scanning engines on the market.
As scanners, we have the Community Edition of SonarQube which is a broadly used open source static analysis tool. It runs locally, so we needed to provide a quite decent PC. As previous tests using the free SonarCloud edition showed: SonarQube on a good PC is faster than free SonarCloud, so it is not unfair to use the local engine instead of the cloud version. We have to take one of the existing developer machines (details below). The second contestant is LGTM which originates from a company called Semmle which was acquired by GitHub. LGTM uses a deep semantic code search based on CodeQL. We used the LGTM SaaS offering.
Finally, Snyk Code is Snyk’s SAST solution. It is based on the former DeepCode scan engine, now with several months of additional development time within Snyk under its belt. Aside from being developer-friendly and highly accurate, one of Snyk Code’s design goals is to be extremely fast. It uses a proprietary constraint engine to achieve this. We selected this field as (1) the licenses allow us to run and compare, (2) these are semantic engines and not linters (like ESlint), (3) these are common and widely used engines, and (4) we have both a locally running and a SaaS in the field.
Speed is essential when a SAST solution wants to be developer-friendly. It’s widely known that developers are most efficient when security issues are identified during the development process so they can be addressed before the code gets checked in. The code is fresh in their minds. Snyk Code provides IDE plugins that embed seamlessly into the developer workflow. On top of that, Snyk Code provides easy to understand data flow diagrams and extensive explanations, including examples of fixes used in open source libraries with the same context. But for the purposes of this test, we focused on speed.
The way we tested is we ran a scan for each of the repositories — SonarQube locally, LGTM and Snyk Code as SaaS. As mentioned above, this led to our repo selection as we did not want to make scan times dependent on network bandwidth.
We also selected real-life open source repositories and not benchmarks as the quest here is to simulate what real developers do. Using benchmarks to compare SAST tools has its own problems, but in this case it would not help our cause.
Below are the general statistical values (rounded) in seconds of scan time. (n=48, rounded to integer):
Here are the results as a scatterplot (lower is better, logarithmic scale, done with R and ggplot2):
The logarithmic axis over-emphasizes the spread in the lower field. The standard deviation of Snyk Code is actually much smaller than the one of SonarQube for example.
Below is a box plot of the values (lower is better, we used a logarithmic scale to fit all values in the graph, done with R and ggplot2, median and average are shown):
The plot clearly shows that in the majority of cases, Snyk is dramatically faster than either of the other two. The median of Snyk is 6.7x (SonarQube) up to 16.4x (LGTM) times faster, which shows that the results do not rely on some extremely good outliers but instead are general ones.
In the above diagram, the spread of values in the LGTM column is noteworthy. It roughly lays between 2 minutes (somehow ok) and more than 17 minutes (not acceptable). Developers will not wait for the results taking several minutes during their workday. Luckily, the median of LGTM is around 3 minutes.
In comparison to the first speed test we did (that time as DeepCode, now Snyk Code) we noticed a lot of performance improvement because we optimized loading to the cloud service significantly in the past months. Also, this time, SonarQube was faster, probably because we used a stronger PC with higher grades of parallelism compared to the first test.
Another interesting observation is that SaaS versus locally-installed engines are not actually that different from a speed perspective.
Assumptions and limitations
We have chosen SonarQube (Community Edition 8.9.1) and LGTM as the license allows us these comparisons and they are broadly used.
We forced a full scan and did not use a differential scanning mechanism. For example, in the case of Snyk Code in IDE plugins, after an initial scan the IDE plugin does differential scans and saves bandwidth.
We used an Intel Xeon @ 2GHz with 16 cores and 64GB RAM
We also want to provide the raw data so you can apply your own statistics. (Times in seconds)