5 security best practices for adopting generative AI code assistants like GitHub Copilot
March 5, 2024
Not that long ago, AI was generally seen as a futuristic idea, something out of a sci-fi film. Movies like Her and Ex Machina even warned us that AI could be a Pandora's box that, once opened, could have unexpected outcomes. How things have changed since then, thanks in large part to ChatGPT's accessibility and adoption! A May 2023 Gartner poll of top executives showed that 89% of organizations were investigating or implementing generative AI, while a separate Gartner report from the same period predicted that more than 80% of code in product development would be AI-generated by 2025.
Developers have benefited greatly from generative AI, using AI coding assistants like GitHub Copilot, Amazon CodeWhisperer, and OpenAI's ChatGPT to supercharge their productivity. However, as powerful and life-changing as these AI tools are, they are still prone to errors and hallucinations, and they should only be used to augment developers, not replace them. As such, you should put careful human validation of AI-generated code, security tools and guardrails, and other safeguards for generative AI coding tools in place, so you can innovate fearlessly. Below, we'll take a look at how you can safely adopt AI code completion tools (like Copilot) by applying these 5 best practices.
Practice 1: Always have a human in the loop
Not having (or having insufficient) human checks and validation is a classic error when it comes to adopting generative AI code tools. As mentioned above, these tools are just meant to assist developers and are not infallible, so developers should practice the same safe habits they had before the implementation of AI coding tools and continue to carefully check their code, AI-generated or otherwise.
Think of AI as an inexperienced developer that just happens to be able to read thousands of Stack Overflow threads at once. You'd never push a new developer's code without review, so don't let AI's speed trick you into thinking it's smarter than it is.
Teams should be educated on the risks of AI-generated code, along with the benefits, and regular code reviews should form part of internal software development practices: validate, test, and correct AI suggestions right in the IDE. These habits should be enshrined in business policies and procedures, and regular training should be conducted to ensure that teams understand the importance of such practices and carry out the required code reviews appropriately.
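To make "validate, test, and correct" concrete, here's a tiny, hypothetical Python sketch (the function and test names are invented for illustration): an AI-suggested helper that looks plausible but compares version strings lexicographically, and a human-written test that catches the flaw before it ships.

```python
# Hypothetical example: an assistant suggests this version check. It reads
# fine at a glance, but it compares strings rather than numeric components.
def is_newer(candidate: str, current: str) -> bool:
    return candidate > current  # buggy: lexicographic comparison


# A human-written test exposes the flaw: "1.10.0" is newer than "1.9.0",
# but string comparison disagrees, so this test fails against the suggestion.
def test_is_newer_handles_double_digit_components():
    assert is_newer("1.10.0", "1.9.0")


# Corrected version after review: compare numeric tuples, not raw strings.
def is_newer_fixed(candidate: str, current: str) -> bool:
    return tuple(int(p) for p in candidate.split(".")) > tuple(
        int(p) for p in current.split(".")
    )
```

The specific bug doesn't matter; the point is that a reviewer plus a quick test, run right in the IDE, turns a plausible-looking suggestion into verified code.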
Practice 2: Scan AI code from the IDE with a separate, impartial security tool
These two things go hand in hand. First, to fully realize the boost that AI gives developers, you can't slow them down with a traditional security review. Developers already overwhelmingly outnumber security practitioners, which has pushed teams to adopt shift-left security practices. Now that AI is boosting developer output, the volume of vulnerable code being created has also increased, so shifting left isn't a choice; it's a requirement. The easiest way to do that is to put security scanning right into the IDE, so code is scanned the second it's written and vulnerabilities are proactively captured, rather than reactively hunted down after they proliferate across the development pipeline.
Second, that scanning should be done by a security tool that isn't the same tool writing the code. This is for the same reason that you have separate development and security teams: if you let the team writing the code also secure the code, a lot of vulnerabilities are going to slip through. Security is complex and is a different discipline from development.
The AI powering a code-writing tool has been trained on functional code to write functional code. A purpose-built security tool that does one thing, secure code, is instead trained on security-focused data so it can reliably detect and fix security issues. On top of that, security tools also need to understand the full context of your application, not just the snippet being scanned, so security fixes don't create bugs elsewhere. This is why Snyk Code uses rules-based symbolic AI to scan the fix candidates provided by our LLM and then only offers users fix options that won't create additional issues.
With AI, you'll need two different tools (one to write, one to secure) the same way you have two different teams (development and security). And both of those tools should understand the full context of your application so you don't end up with code snippets that only make sense in a vacuum.
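As a rough sketch of what "scan it the second it's written" can look like in practice, here's a minimal pre-commit hook in Python that shells out to the Snyk CLI. This assumes the CLI is installed and authenticated and that Snyk Code scanning is enabled for your account; treat it as an illustration rather than an official integration (most teams will simply use the IDE plugin or CI integration their security vendor provides).

```python
"""Pre-commit hook sketch: block the commit if the static analysis scan
reports issues. Assumes the Snyk CLI is installed and authenticated."""
import subprocess
import sys


def main() -> int:
    # Run Snyk Code (SAST) against the working tree. The CLI exits with a
    # non-zero status when it finds issues, so we propagate that to git.
    result = subprocess.run(["snyk", "code", "test"])
    if result.returncode != 0:
        print("Snyk Code found issues: review and fix them before committing.")
    return result.returncode


if __name__ == "__main__":
    sys.exit(main())
```

Whatever tooling you choose, the principle is the same: the scan comes from a separate, security-focused tool, and it runs before vulnerable code spreads downstream.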
Practice 3: Validate third-party code
On average, 70% of the code in an application is open source. That means 70% of your application was written and secured by someone outside your company, someone who can't be held liable if a vulnerability ends up being the breach point for bad actors looking for customer data.
When developers use third-party dependencies, they should always scan them with a software composition analysis (SCA) tool to determine if they're secure. SCA tools will find vulnerable packages, report what the vulnerability is (type, severity, etc.), and suggest remediation paths.
AI-written code also pulls in third-party dependencies. Remember: AI is just another developer, one that happens to write code incredibly fast. That code's dependencies all need to be scanned too, especially when you consider that an LLM-based AI tool will always lag a bit behind the latest findings about, and releases of, third-party packages. AI needs SCA. With this in mind, we recommend that you always verify AI-recommended open source libraries and use tools like Snyk Open Source to test them.
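Here's an illustrative sketch of what an SCA gate could look like. It assumes the Snyk CLI is installed and authenticated and that its JSON report exposes a `vulnerabilities` list with `severity`, `packageName`, and `title` fields; verify the output shape for your CLI version before relying on anything like this.

```python
"""SCA gate sketch: fail the build when dependency scanning reports
high or critical vulnerabilities. The JSON field names are assumptions."""
import json
import subprocess
import sys

BLOCKING_SEVERITIES = {"high", "critical"}


def main() -> int:
    # `snyk test` scans the project's open source dependencies (SCA).
    result = subprocess.run(["snyk", "test", "--json"], capture_output=True, text=True)
    try:
        report = json.loads(result.stdout)
    except json.JSONDecodeError:
        print("Could not parse the scanner output; failing closed.")
        return 1

    blocking = [
        v for v in report.get("vulnerabilities", [])
        if v.get("severity") in BLOCKING_SEVERITIES
    ]
    for vuln in blocking:
        print(f"{vuln.get('severity', '?').upper()}: {vuln.get('packageName')} - {vuln.get('title')}")

    return 1 if blocking else 0


if __name__ == "__main__":
    sys.exit(main())
```

The hosted integrations do this for you out of the box; the sketch just shows the shape of the check: scan every dependency, whether a human or an AI suggested it, and block on the severities you care about.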
Practice 4: Automate testing across teams and projects
If it's not automated, there's a good chance it won't happen. Automation is such a foundational best practice that you can see it almost everywhere, from Unix admins writing cron jobs, to QA teams implementing automated tests, to DevOps teams developing sprawling infrastructure held together with Python scripts. Automation doesn't just make life easier; it makes forgetting to do something impossible.
Be sure to implement security tools that secure your applications automatically from CI/CD and in the workflows that teams already use.
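As one hedged illustration of "automate it so nobody can forget," here's a small Python script that walks a hypothetical services/ directory of projects and runs both a dependency scan and a code scan in each. The layout and commands are assumptions; in practice, you would wire the equivalent steps into the CI/CD system your teams already use.

```python
"""Sketch: run dependency (SCA) and code (SAST) scans across every project
in a hypothetical monorepo layout where each service has its own directory."""
import subprocess
from pathlib import Path

# Assumption: every first-level directory under services/ is a project.
projects = [p for p in Path("services").iterdir() if p.is_dir()]

failures = []
for project in projects:
    for command in (["snyk", "test"], ["snyk", "code", "test"]):
        # Non-zero exit codes indicate issues (or scan errors); record them.
        if subprocess.run(command, cwd=project).returncode != 0:
            failures.append((project.name, " ".join(command)))

for name, command in failures:
    print(f"{name}: `{command}` reported issues")

raise SystemExit(1 if failures else 0)
```

Scheduled or per-commit, the exact trigger matters less than the fact that no one has to remember to run it.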
Practice 5: Protect your IP
When you implement policies around AI tool usage, it is extremely important that you don't allow the tools to learn from your proprietary code. In 2023, we saw Samsung ban the use of ChatGPT after proprietary data was leaked into the tool, where it could be swept up into usage-based training. The last thing you want is your competitive advantage being served up as suggested code to developers working for another company in your space.
This one is a bit tougher to enforce via technology, so it's extremely important to have your AI usage policies well-documented and your teams well-trained (with permitted usage, mandatory practices, and potential consequences clearly spelled out). In addition, assume that whatever data you put into an LLM will be used in its training. Only give the LLM the minimum information (none of it confidential) it needs to do its job, and consider implementing input and output checks to sanitize what users send to your LLMs and what the LLMs send back.
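Input checks can start very simply. The sketch below is a hypothetical, deliberately incomplete example of redacting likely secrets from a prompt before it ever leaves your environment; the patterns are illustrative assumptions, not a substitute for a real secrets scanner or data loss prevention control.

```python
"""Prompt sanitization sketch: strip likely secrets before sending text to an
external LLM. The regex patterns below are illustrative, not exhaustive."""
import re

# Assumed credential shapes; tune these for your own stack.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key headers
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[:=]\s*\S+"),  # key=value secrets
]


def sanitize_prompt(prompt: str) -> tuple[str, int]:
    """Replace anything that looks like a secret with a placeholder."""
    redactions = 0
    for pattern in SECRET_PATTERNS:
        prompt, count = pattern.subn("[REDACTED]", prompt)
        redactions += count
    return prompt, redactions


if __name__ == "__main__":
    clean, redacted = sanitize_prompt("Refactor this: api_key = 'sk-123abc'")
    print(f"Redacted {redacted} candidate secret(s): {clean}")
```

Pair checks like this with output review (for example, flagging generated code that embeds credentials or proprietary identifiers) and, above all, with clear policy and training.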
Use AI coding assistants safely
Make no mistake: AI coding assistants are the future. Your teams will move faster than ever, generating more code than ever, and it's up to you to make sure that the applications they ship are secure. You'll need the right policies, the right training, and, most importantly, the right security tooling to keep teams moving fast. This is where Snyk can help.
Backed by industry-leading intelligence and expert-in-the-loop hybrid AI, Snyk's developer security platform scans code as it's written (by human or AI), providing one-click fix recommendations in-line within the IDE, so developers don't have to slow down and security teams don't get bogged down.
Start trusting your AI-generated code with Snyk. Book an expert demo today to see how Snyk can be the security copilot that Copilot needs.
Start securing AI-generated code
Create your free Snyk account to start securing AI-generated code in minutes. Or book an expert demo to see how Snyk can fit your developer security use cases.