Enhanced Vulnerability Detection with AI
As an AI-first company, Snyk is deeply committed to integrating artificial intelligence into every facet of our engineering processes. This commitment drives us to continually seek out innovative ways to use AI to strengthen our security offerings. Recently, we successfully released the first iteration of two projects that exemplify our dedication to using cutting-edge technology to enhance vulnerability detection. These projects were strategically designed to minimize manual effort, significantly improve accuracy, and empower our security experts to concentrate on higher-impact, strategic tasks.
The problem: Manual analysis and information overload
In the realm of cybersecurity, staying ahead of vulnerabilities is paramount. Snyk prides itself on its extensive security data, a cornerstone of our ability to provide timely and accurate information about potential threats. Since the company's inception, a dedicated team of security professionals has been tirelessly working to ensure our vulnerability data remains up-to-date. This involves using various tools and automations to track emerging vulnerabilities, including zero-day exploits, and ensuring we are the first to inform our users about them.
However, this vital task is far from simple. Our analysts spend a considerable portion of their day, often several hours, sifting through a deluge of information. They scrutinize various events, such as comments on GitHub repositories, forum discussions, and security advisories, looking for any potential signs of newly discovered vulnerabilities. This process is akin to finding a needle in a haystack. The sheer volume of information is overwhelming, and the vast majority of these events — over 95% — turn out to be false positives.
Example scenario
Imagine an analyst encounters a GitHub comment that reads, "This code seems to have an issue with input validation." At first glance, this might appear to indicate a potential vulnerability. However, after extensive manual investigation of the conversation and the code change, the analyst might discover that the "issue" is a minor coding style preference and not a security flaw. This time-consuming process is repeated hundreds of times daily, leading to significant wasted effort and potential analyst burnout.
This manual, repetitive work not only consumes valuable time but also introduces the risk of human error. Given the vast amount of data, crucial information could be overlooked, leading to delayed vulnerability identification and potential security breaches.
AI to classify potential vulnerabilities: Our solution
Recognizing this challenge, our Machine Learning team saw an opportunity to leverage the advancements in Large Language Models (LLMs) to augment a significant portion of this analysis. The goal was to develop an AI model capable of analyzing these events and accurately flagging those that truly represented potential vulnerabilities.
The initial iteration of this project involved creating a solution that utilized a simple one-shot prompting technique. This solution served as the core of our classification system. The key building blocks of the solution were the following:
Refined one-shot prompting: We ran multiple iterations of experiments to ensure that the prompt and the context we injected were sufficient. Using just the raw event payload (in our case, a comment), we got relatively good results, which saved us the extra time and resource investment that fine-tuning would have required.
Prioritizing high recall: Our primary objective was high recall, meaning we didn't want to miss any actual vulnerabilities. Initial experiments yielded a recall rate of 95%, but by changing the prompt to let the model answer "maybe" when it is unsure, we pushed recall close to 100% while still keeping a reasonably good rate of precision.
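The two building blocks above can be sketched roughly as follows. The prompt wording, the yes/no/maybe label set, and the parsing helpers are illustrative assumptions, not Snyk's actual implementation; the model call itself is left out, since any LLM client could slot in. Note how an unparseable answer defaults to "maybe", which is the recall-preserving choice described above.

```python
# Sketch of a one-shot "lead classifier" prompt flow (hypothetical; the real
# prompt, labels, and model integration are not published).

PROMPT_TEMPLATE = """You are a security analyst triaging public events.

Example:
Event: "Heap overflow in parse_header() allows remote code execution."
Answer: yes

Does the following event describe a potential security vulnerability?
Answer with exactly one word: yes, no, or maybe (if you are unsure).
Event: "{event_text}"
Answer:"""

VALID_LABELS = {"yes", "no", "maybe"}


def build_prompt(event_text: str) -> str:
    """Inject the raw event payload (e.g. a GitHub comment) into the prompt."""
    return PROMPT_TEMPLATE.format(event_text=event_text)


def parse_label(raw_response: str) -> str:
    """Normalize the model output. Anything malformed defaults to 'maybe'
    so uncertain answers are routed to a human analyst instead of dropped."""
    label = raw_response.strip().lower().rstrip(".")
    return label if label in VALID_LABELS else "maybe"


def needs_analyst(label: str) -> bool:
    """Only a confident 'no' is filtered out; 'yes' and 'maybe' go to review."""
    return label != "no"
```

The design choice worth noting is the asymmetry: a false "no" costs a missed vulnerability, while a false "yes" or "maybe" only costs a few minutes of analyst time, so every ambiguous path collapses toward review.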
The results
While keeping recall at an impressive 100%, meaning none of the many vulnerabilities we discover each month are missed, the model now confidently filters out dozens of non-lead events every week.
For us, this means a 10% reduction in the events analysts have to triage. Our customers continue to benefit from the same security guarantees without any compromises, and the reclaimed time creates the potential for more high-quality findings from our security analysts.
Fix commit classifier: Addressing release analysis
The second project was designed to enhance the analysis of new package releases to identify whether they contain fixes for known vulnerabilities. Previously, this meant analysts manually reviewing the commits in every new release, a time-consuming and error-prone task.
The challenge of release analysis
Every software package undergoes updates and releases. Each release might contain numerous commits, each representing a change to the code. Identifying which commits address specific vulnerabilities requires a deep understanding of the code and the nature of the vulnerability. Manually reviewing these commits for every release was a daunting task.
For example, let’s say our analyst gets notified of a new version 1.9.1 of the React library, which has 492 (!) commits since the last 1.9.0 version. To determine whether this release fixes a known vulnerability, the analyst would have to go through every commit in the release, compare it against the details of each known vulnerability, and assess whether the changes effectively address any of them. Doing this exhaustively by hand is impossible, so analysts would typically timebox the work and focus on the highest-severity issues or those explicitly called out as fixed in the release notes. Even then, the process could take hours, especially for packages like React, where each version has many commits.
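The first practical step, gathering every commit between two release tags, can be done with GitHub's public REST "compare" endpoint. A minimal sketch, assuming the package lives on GitHub (the `facebook/react` path and the `v1.9.0`/`v1.9.1` tags below just mirror the article's example and are not real tags):

```python
# Sketch: list the commits between two release tags via GitHub's compare API.
# The endpoint shape is GitHub's documented REST API; the repo/tag values
# used in practice would come from the package's metadata.
import json
import urllib.request

API_BASE = "https://api.github.com"


def compare_url(owner: str, repo: str, base: str, head: str) -> str:
    """Build the GitHub REST compare URL for two refs (e.g. release tags)."""
    return f"{API_BASE}/repos/{owner}/{repo}/compare/{base}...{head}"


def commits_between(owner: str, repo: str, base: str, head: str) -> list[dict]:
    """Fetch the SHA and message of each commit between base and head.

    Performs a network call; real usage would add auth headers and handle
    the API's pagination and per-response commit limits.
    """
    with urllib.request.urlopen(compare_url(owner, repo, base, head)) as resp:
        payload = json.load(resp)
    return [
        {"sha": c["sha"], "message": c["commit"]["message"]}
        for c in payload.get("commits", [])
    ]
```

For the article's example, the request would target `compare_url("facebook", "react", "v1.9.0", "v1.9.1")`.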
AI-powered fix commit classification: Our approach
To address this challenge, we aimed to enrich release alerts with AI-generated recommendations on whether a commit fixed a specific vulnerability. This involved a multi-stage process:
Retrieving associated commits: First, we retrieved all the commits associated with a particular package version.
AI-driven fix identification: We then used a large language model to determine if any of those commits fixed a given vulnerability.
Enriching release alerts: If the AI identified a fix, we enriched the release alerts with this information, providing analysts with immediate insights.
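The three stages above can be tied together in a short sketch. The `classify_fix` step stands in for the LLM judgment; here it is stubbed with a naive keyword heuristic so the example runs standalone, whereas the real system would prompt a model with the commit and the vulnerability details. All names below are hypothetical.

```python
# End-to-end sketch of the fix-commit pipeline: classify each commit in a
# release against a vulnerability, then enrich the release alert.


def classify_fix(commit_message: str, vuln_summary: str) -> bool:
    """Stand-in for the LLM call: does this commit fix the vulnerability?

    A real implementation would send the commit (and ideally its diff) plus
    the vulnerability description to a model; this keyword overlap heuristic
    only exists to make the sketch executable.
    """
    keywords = {w.lower().strip(",.") for w in vuln_summary.split()}
    msg = commit_message.lower()
    mentions_fix = "fix" in msg or "patch" in msg
    return mentions_fix and any(k in msg for k in keywords if len(k) > 4)


def enrich_alert(alert: dict, commits: list[dict], vuln_summary: str) -> dict:
    """Steps 1-3: walk the release's commits, flag suspected fixes, and
    attach the recommendation to the release alert for the analyst."""
    suspected = [c for c in commits if classify_fix(c["message"], vuln_summary)]
    alert["suspected_fix_commits"] = [c["sha"] for c in suspected]
    alert["ai_recommendation"] = "likely fixed" if suspected else "no fix found"
    return alert
```

For instance, a release containing a commit "Fix prototype pollution in merge helper" would, when checked against a "Prototype pollution in deep merge" advisory, come back annotated as "likely fixed", so the analyst starts from a shortlist instead of the full commit log.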
The results
The biggest win was the discovery of previously unflagged vulnerability fixes. While developing the solution and assessing its accuracy metrics, we found more than 500 fixes that are now being added to our security database.
In addition, at the current rate of around 30 automated discoveries a month from this change, we can reasonably assume that a good portion are findings that would otherwise have been missed.
Overall impact
With both of these enhancements now live, Snyk users are already benefiting from faster time-to-discovery and an even richer security portfolio. This ensures that we provide the highest level of security at industry-leading speeds.
Last but not least, by accelerating key parts of the analysis process and providing AI-driven insights, we have enhanced the efficiency and accuracy of our security efforts. The integration of AI into our vulnerability management processes is not just a technological advancement. It's a fundamental shift in how we approach security, allowing us to stay ahead of emerging threats and protect our users more effectively.