Skip to main content
Episode 171

Season 10, Episode 171

Vulnerabilities In Enterprise AI Workflows With Nicolas Dupont

Hosts
Headshot of Danny Allan

Danny Allan

Listen on Apple PodcastsListen on Spotify PodcastsWatch on Youtube

Episode Summary

As AI systems become increasingly integrated into enterprise workflows, a new security frontier is emerging. In this episode of The Secure Developer, host Danny Allan speaks with Nicolas Dupont about the often-overlooked vulnerabilities hiding in vector databases and how they can be exploited to expose sensitive data.

Show Notes

As organizations shift their focus from training massive models to deploying them for inference and ROI, they are increasingly centralizing proprietary data into vector databases to power RAG (Retrieval-Augmented Generation) and agentic workflows. However, these vector stores are frequently deployed with insufficient security measures, often relying on the dangerous misconception that vector embeddings are unintelligible one-way hashes.

Nicolas Dupont explains that vector embeddings are simply dense representations of semantic meaning that can be inverted back to their original text or media formats relatively trivially. Because vector databases traditionally require plain text access to perform similarity searches efficiently, they often lack encryption-in-use, making them susceptible to data exfiltration and prompt injection attacks via context loading. This is particularly concerning when autonomous agents are over-provisioned with write access, potentially allowing malicious actors to poison the knowledge base or manipulate system prompts.

The discussion highlights the need for a "secure by inception" approach, advocating for granular encryption that protects data even during processing without incurring massive performance penalties. Beyond security, this architectural rigor is essential for meeting privacy regulations like GDPR and HIPAA in regulated industries. The episode concludes with a look at the future of AI security, emphasizing that while AI can accelerate defense, attackers are simultaneously leveraging the same tools to create more sophisticated threats.

Links

Nicolas Dupont: Vector embeddings are just math, right? They are representations of context, of semantic meaning, right? And so it makes sense both theoretically and in practice that these dense representations can be inverted back to the text in the original modality or whatever that is. The problem there is that there's a common misconception that they are like one-way hashes or just you don't really think about it because they're not humanly – they're not intelligible to humans.

Danny Allan: It's numbers.

Nicolas Dupont: Numbers. Exactly. How bad could it be? But they need to be treated with the same level of sensitivity as the original data they represent. And, furthermore, they are being centralised into these places. And so you're compounding the risk without addressing the foundational problems.”

[INTRODUCTION]

[0:00:44] Guy Podjarny: You are listening to The Secure Developer, where we speak to industry leaders and experts about the past, present, and future of DevSecOps and AI security. We aim to help you bring developers and security together to build secure applications while moving fast and having fun.

This podcast is brought to you by Snyk. Snyk's developer security platform helps developers build secure applications without slowing down. Snyk makes it easy to find and fix vulnerabilities in code, open source dependencies, containers, and infrastructure as code, all while providing actionable security insights and administration capabilities. To learn more, visit snyk.io/tsd.

[INTERVIEW]

[0:01:24] Danny Allan: Hello, and welcome back to another episode of The Secure Developer. I'm Danny Allan, CTO at Snyk. And I'm super excited to be with you today because I have a very special guest that I have known, I'm going to say almost a decade now, going back through several companies, and that is Nicolas Dupont from Cyborg. Nicolas, maybe you can just introduce yourself to the audience, what you do, and what Cyborg is all about.

[0:01:47] Nicolas Dupont: Yeah, hey Danny. Really good to see you again. Thanks for having me. I'm Nicolas Dupont, the CEO and founder of Cyborg. You may know me as Nicolas, Nick, Nico, depending on where and when you met me. And I've been working in the technology industry both as a software engineer and an entrepreneur for about a decade. The first four years working on lossless data compression, and then the ensuing five, six years working in privacy-enhancing technologies and applied cryptography, namely encrypted vector search, which is what we're currently working on at Cyborg to be able to address some of the lesser-known security vulnerabilities with agentic AI applications.

[0:02:31] Danny Allan: I love the fact, Nicolas, that you've been in security for so long. I always tell people that I started in security doing homomorphic encryption and kind of things around that. And I drifted off into the cloud and back up in other areas. And here I am back full bore in security. But I love that you've been security all along. And in fact, I think when I first met you, you were doing sharded encryption across clouds of data storage. Isn't that right?

[0:02:58] Nicolas Dupont: That's right. That's right. Distributed encrypted search. Similar technology, what we're working on today. But we were still finding our footing as to where the real value was in the market. And I think we're in a better place now, I hope, than when we were working on the distributed stuff.

[0:03:14] Danny Allan: Well, the interesting thing about security is there's always application of security to everything we're doing. In fact, I think about here at Snyk, where I work, we're always looking at security for AI is the big thing right now. How do we take AI, actually and use it within our platform for better security? And so AI for security in that case. But you tend to work on security for AI at Cyborg. Rather than using AI within your platform, you are, “How do we secure the implementation of AI?” Am I understanding that correctly?

[0:03:51] Nicolas Dupont: Yeah. Yeah, that's exactly right. And maybe it would be helpful for me to zoom out, lay the landscape a little bit, as to what we're trying to solve, to set the conversation. Over the past, call it, three years, the centre of gravity in the world of AI was largely around training. Who had the GPU clusters? Who was able to train the biggest models, frontier capabilities, et cetera?

Today, we're kind of at an inflection point where all the stakeholders involved, namely customers and investors, are asking to see ROI from their massive AI investments, both in hardware and in software. And the way that you get ROI from AI is through inference, not by building the model, but by letting it run and do what it was built to do at scale.

And in the context of the enterprise, in enterprise AI, in order to drive value, a model needs to have access to proprietary data to be able to glean insights from that proprietary data in order to take action off of that data. And that creates a whole host of vulnerabilities in that enterprise AI context, namely you're centralising data that was traditionally siloed across different sections of the organisation, from finance, and HR, CRM, and marketing, engineering, so on and so forth, into a centralised vector database, or a vector store, or a set of those that makes the attack surface smaller but makes the blast radius of a breach there significantly larger.

You add in the context of agents to the flow that need access to that data and need to be able to make autonomous decisions off of that data. But typically, you're over-provisioned in the access that they have to the data. That creates another vulnerability. This vector store typically also is used or can be used not only as a knowledge base but also as the memory of the agentic AI system, and that opens up the threat vector of prompt injection.

The list goes on. And I think I'm hinting at what we're focusing on here, which is the vector store, the common denominator across all these things. Vector databases centralise all this data. They store these dense semantic representations of the original data as vector embeddings, which can be inverted back into the original modality of text, images, audio, etc., relatively trivially. And they are comically underequipped to be able to deal with the security risk that they pose today. That's what we solve.

[0:06:23] Danny Allan: And so in an LLM world or an AI world, rather than a structured database, you have a vector database. And what I'm hearing is that people assume that it can't be reverse-engineered to extract the data. But the reality is it can be reverse-engineered to extract the original data from the vector database. Is that –

[0:06:42] Nicolas Dupont: Yeah, that's right. I mean, it's really – vector embeddings are just math, right? They are representations of context, of semantic meaning, right? And so it makes sense both theoretically and in practice that these dense representations can be inverted back to the text in the original modality or whatever that is. The problem there is that there's a common misconception that they are like one-way hashes or just you don't really think about it because they're not humanly – they're not intelligible to humans.

0:07:12] Danny Allan: It's numbers.

[0:07:13] Nicolas Dupont: Numbers. Exactly. How bad could it be? But they need to be treated with the same level of sensitivity as the original data they represent. And furthermore, they are being centralised into these places. And so you're compounding the risk without addressing the foundational problem. And so you can address maybe the prompt injection risk by having guardrails of whatever input is coming in, including the context added in a RAG scenario, but then you're still leaving all of the gaps, the other gaps unaddressed by leaving these vector embeddings in plain text and easily accessible and leaking everywhere.

[0:07:46] Danny Allan: Let's start with that, then, the breach of a traditional structured database. I tend to think of SQL injection, someone inserts a command, extracts the data, and dumps it out. And so obviously, we encrypt our structured databases. How does a breach take place in a vector database? How do you get into the data within that database? And what is the specific attack model that that malicious individual or a malicious actor would take?

[0:08:12] Nicolas Dupont: To answer that question, I'll caveat it first by saying that we're still in the infancy of production rollout of AI in enterprise. We haven't seen any of these attacks in the wild, but we've seen them in controlled environments as exploit demonstrations. The vectors through which you can attack a vector database, no pun intended, are pretty similar to the way that you would attack a standard database. You've got the injection attacks, you've got any type of infrastructure compromise, leaked SSH keys, having open ports, etc., unsecured infrastructure. And you also have the fact that most vector databases have only one way of securing access to the data for both injecting data into it and extracting from it, which is an API key.

Now, API keys are notoriously leaked all the time, whether you're a startup and you commit something to Git. Or even Microsoft has this happening on a weekly basis. And so it's not a good factor for the security of this centralised knowledge base. And then furthermore, once you get in, compared to a standard relational database that has the past 50 years of learn experience in terms of securing access with various levels of encryption at rest and times in use, vector databases rarely have any type of encryption at all. Those that do have encryption at rest and in transit, but not in use. And that's both as a factor of the infancy of the technology and also an architectural decision. Because vector databases, compared to relational databases and key value stores, NoSQL databases, need the vectors to be in plain text in order to do the similarity search, right?

When you're doing a query on a vector database, you're not doing a where clause with SQL and scanning through or just looking at an index. You are providing a query vector embedding, and you're trying to find all of the items that are the closest to that for top K. And in order to do that, you need to literally compute the Euclidean distance or the cosine angle between that query vector embedding and all the candidate embeddings. There are ways to accelerate it through approximate nearest neighbour search in algorithms. And all the vectors do this. But the common denominator here is that they're all doing this in plain text.

If you want fast, low-latency access, which is typically what you care about in a RAG scenario, in agentic AI, etc., high-throughput, low-latency applications with no patching, you run into this problem where, if you have to read from disk from encrypted, and each time you encrypt it, you get significant performance hits. And so none of them are doing that. And they're kind of throwing away security and that granular data control for performance.

[0:11:02] Danny Allan: And so is your TLDR on this that they should be encrypted, knowing that there's a performance hit when you're trying to query and find the nearest neighbours on the vector embeddings?

[0:11:13] Nicolas Dupont: No. I mean, fundamentally, if you're trying to pick between functionality and security, unfortunately, as you and I both know, functionality almost always wins out. And that's not what we want to hear, but that's just the truth. It's like the laws of physics. What we believe is that there needs to be innovation in this space that can address the performance hits from encryption to be able to mitigate those to a point where it's an acceptable trade-off. And so that's exactly what we've been working on, on being able to do that similarity search while keeping the vectors encrypted.

So, not adding the overhead of decrypting everything in order to be able to do the similarity search, but actually do the similarity search over ciphertext using standard cryptography, such that the overhead is significantly mitigated instead of being several orders of magnitude slower. You're talking about a 10% to 15% performance hit.

[0:12:04] Danny Allan: Yeah. In many ways, this brings me back to homomorphic encryption, because the thesis there always was I'll deal with the data while it's still encrypted, and be able to do the analysis that I need to do. This is doing the same thing, but for vector databases, which, of course, are behind AI.

Talk to me a little bit, then about prompt injection on this. You have these vector databases. They're storing all the numerical values of the text embeddings and all the content that you're querying on. How does prompt injection work against a vector database? And what should we be thinking about there for security?

[0:12:39] Nicolas Dupont: Great question. I mean, it's not something that architects typically think about when building out their AI applications as the vector database being a threat vector for prompt injection, but it really is for two main reasons. I mean, really, for one reason and multiple vectors that can inject that. That is the vector database is there in order to be able to supply context to the LLM to be able to answer a question or to be able to take an action in an agentic AI workflow.

And so that context is fed typically in the original modality or in some sort of pre-processed text, image, whatever the modality of the LLM, to be able to take that action. And so that's a vector to be able to do a prompt injection attack. That can bypass the user prompt. And so if you've got guardrails at the user prompt but not at the context phase, if the context can carry a malicious payload, then the prompt injection can happen through that.

And so without proper access controls around your vector database, you not only have to worry about exfiltration of the data and getting a nice cross-section of the organization's centralized knowledge base, but also injection of malicious data that then can poison further downstream applications.

[0:13:58] Danny Allan: That is impacting at some level the integrity of the model itself. The exact future queries.

[0:14:05] Nicolas Dupont: Exactly. It's like adding a system prompt that is subverting the intent of the user at every query.

[0:14:12] Danny Allan: And is this all theoretical at this point for developers? Is this something that we've hit in the real world, I'll say, in practical areas that attackers have gone in and tried to do this exact thing against vector databases?

[0:14:26] Nicolas Dupont: The prompt injection one, we haven't seen. The leaked API keys, we've seen everywhere. And so it's not perfect – unfortunately. The prompt injection attack is one that we've seen in academia. It's one that we've done demos of on stage at conferences, live demos. And it's one that, because it's relatively trivial to do using one of these entry points, it's only a matter of time until we see it, especially as enterprises start to ramp up and out these enterprise AI applications into production in consumer, customer-facing environments.

[0:15:04] Danny Allan: And what is the right compensating control? When I think of injection on a relational database, I think if I go back 20 years, it was prepared statements or stored procedures or whatever. What are the compensating controls against that type of attack on a vector database?

[0:15:22] Nicolas Dupont: Well, it's really control of who and what has access to – has the ability to add data to the database, right? And so it gets very hairy when you get into agentic workflows where the agents are overprovisioned with writes. And especially if you're using the vector database not only as a knowledge base that may be relatively static in the organisation, but also as a memory store for further chats or for further workflows that need to be run.

And so in that case, you now have the complexity of controlling everything that's added to the vector database not only during index building time and the pre-processing of the knowledge base, but at inference time at all times, right? That requires, in my opinion, two things. Firstly, limiting the authorisation scopes of who and what has access to be able to add data to the database. And then secondly, ensuring that there is a robust foundational layer of security at the vector store basis, such that if an API key is leaked, if an agent is overprovisioned, you still have a cryptographic backstop that essentially provides you with mathematical guarantees that data could not be added to it without proper access, without having access to the encryption keys, or data cannot be read from it such that.

And in this rapidly evolving world, I think that, again, trying to address each one of the gaps with point solutions as opposed to addressing kind of the security basis of the vector store is playing a game of whack-a-mole. And we're going to keep seeing new vulnerabilities pop up as this area changes so rapidly, and it's going to be impossible to keep up with it.

[0:17:06] Danny Allan: One of the things that we've been struggling with is the security of humans in an AI world versus security of agents. Do you think that compensating controls are different? In other words, you have humans going in and accessing the vector database and doing the queries on the LLMs and that. You also have agents, as you mentioned. Do you think it's the exact same controls between them, and it's just a matter of type? Or, no, there are actually distinct controls that you'd want to implement?

[0:17:37] Nicolas Dupont: I think the latter. I mean, it's every layer of the stack, right? When you're looking at the vector store, fundamentally, what you should be looking for is encryption and use. Is granular encryption to where each vector is encrypted, all of its associated metadata, IDs, and contents stored within the vector store.

[0:17:54] Danny Allan: Uniquely for each vector? Unique cryptographic key for each vector?

[0:18:00] Nicolas Dupont: Either that, or on a per-index, per-namespace. That really depends on the application and what the threat vector and the threat model specifically is. And if you want to also have cryptographic role-based access control, such that each vector has its own entity and access control list that then can be reclassified without needing to reencrypt the vectors themselves.

But having that basis then allows you to build up the stack to use those primitives to ensure that the security is robust all the way upstream. And so I think once you get to the agentic workflows and have that application layer where you're building and tying in authentication and authorisation workflows for those agents, then I think it does diverge. And I think some of your previous episodes on this podcast will do a much better job than I will talking about that distinction. But I think it really depends on the application whether the agent is treated as a service role that has access to data, whether there's a human in the loop to be able to provide authorisation based on certain break-the-glass scenarios. That largely depends. But I think fundamentally, from the vector store and the data plane, it's all the same. What matters is granular cryptography that allows you to be able to enable that flexibility while maintaining the common security throughout the applications.

[0:19:27] Danny Allan: It's definitely interesting. I see a lot of applications that are built such that there's just-in-time encryption keys for individuals and humans. And they're doing generally the right things. But when it comes to agents, the agents are wide open. They can do everything from delete files to drop tables in a database, do whatever they want because –

[0:19:45] Nicolas Dupont: We saw it happen in production, right, with Replit.

[0:19:48] Danny Allan: Yes. Yeah. There was something just recently. A CEO was playing around, and he lost a database, right?

[0:19:52] Nicolas Dupont: Yeah. Yeah. It was something like a couple thousand of CISO entries in their production database. It's crazy stuff.

[0:20:01] Danny Allan: Yeah. And it's because agents are assumed to have more, I don't know, controls. Or they're trusted more for some strange reason, which in my mind they shouldn't be. It's just another mechanism service account that is no more trusted than an individual. In fact, maybe less trusted, especially when they become autonomous.

[0:20:18] Nicolas Dupont: Yeah. Absolutely. And we're seeing some concerning trends. I think it might have been Synk that had a developer survey recently that showed that developers believe that AI-generated code was more secure than human code, right? I mean, just with any anecdotal evidence, I can attest that that's not the case at all. And you will tell Claude Code, "Hey, are you sure this is secure?" "Oh, you're absolutely right. This is not." And without that initial pass, how can you trust that giving it access to everything within your organisation with this overprovisioned role, but nothing will go wrong? And I think it's a ticking time bomb.

[0:20:58] Danny Allan: Well, it's very clear we need security for AI and specifically security for vector databases behind it. This is all a security conversation, though. And one of the things that I think we often miss or forget about is privacy and delegating access. We have an entire legislation around this over in the EU GDPR about delegating access to data. Is there implications in vector databases and LLMs that also matter when it comes to privacy, and the right to access, and the right to be forgotten? Are there ways to address that in these vector databases?

[0:21:34] Nicolas Dupont: Yeah, 100%. I'm glad you bring that up. Because I often like to say, you can have security without privacy. We see it all the time. You can't have privacy without security, right? And so within that context, a cryptography is one of privacy's best friends here. Because A, it allows you to have robust controls around what can be accessed when and how through these mathematical guarantees of the confidentiality of the data in its encrypted state.

And additionally, it makes the job of compliance with these regulatory frameworks significantly easier by reducing the burden by ensuring that your data is encrypted at all times and that you don't have to rebuild, re-engineer the wheel each time that you're building an application. And the same thing is true in enterprise AI scenarios. Whether it'd be with GDPR or in applications in traditional regulated industries like healthcare under HIPAA, financial services under FINRA and PCI DSS, federal applications under FedRAMP, you have, you know, robust security and privacy requirements that need to be met already, onerous to be able to meet today. And you're adding in a centralised vector store that contains all of the data, and where you're doing access control just based on metadata filtering, where a developer changing a single equal sign within a MongoDB query expression can completely change their access control, is not robust, right? And in terms of the scrutiny for auditing, in terms of the risk exposure, it's pretty large.

Now, these regulated sectors are lagging in terms of the AI adoption curve, especially when it comes to production applications that use touch-safe to protect the health information. But the risk is coming there. And so our message to companies operating in this space is that start from security. It will be significantly easier to reach your compliance status by starting from a technically secure architecture than it is to try to check the legal or check boxes as you're building it, and not really focus on the technical architecture as a whole.

[0:23:51] Danny Allan: Which of the vector databases do you see? And maybe you can't answer this, and maybe I shouldn't be asking the question. But which of the vector databases do you see as being the most progressive in terms of offering security controls and privacy controls within their platforms?

[0:24:08] Nicolas Dupont: It's a difficult question to answer because I don't have visibility obviously on their road maps and with their work on the enterprises, but they're all moving towards enterprise, right? They're all going to face these challenges. I would say the ones that are the furthest along the curve are actually the open source databases primarily because they provide that level of control for seasoned DevOps and infrastructure folks to be able to come in and deploy in a more secure manner than a fully hosted application, a fully hosted vector database, whether it be Pinecone, Qdrant, Cloud, whoever insert here, where you're only security is an API key, which as you and I both know is not secure. And you're trusting that their infrastructure upstream is secure as well.

But even then, you're still facing the same thing, right? You take Chroma, which is one of the most popular open source vector databases out there, right? It uses SQLite as its backend. You can connect to the backend without a problem. If you have access to the same infrastructure, you're on the same VM, and you can access the backend without any credentials. You can extract the embeddings. You can invert them. I did it live on stage a couple of months ago. It's pretty trivial. And so the problem is, even there, if you know how to secure your infrastructure, you still have this application layer hole in your security posture that, without any type of encryption in use, you have no stopgap to prevent a catastrophic breach.

[0:25:40] Danny Allan: And this is your whole point that if someone gets in via an RDP vulnerability or SSH tunnels into the environment, they get access to the environment, you need that protection of the vector database because it can be reverse-engineered. You can extract the data that went into it, and you lose all security, you lose all privacy essentially in that model.

[0:25:57] Nicolas Dupont: Exactly. There is no more perimeter. Your data is everywhere today. And even wherever your vector store is running, whatever that is, you really should seek out the most granular form of security. Start there, which is granular cryptography. And then layer on everything on top of that. But if you start from that point, it is significantly easier to guarantee a secure infrastructure than it is to try to go the opposite way and address the vulnerabilities as they come when you have a big architecture hole in your design.

[0:26:30] Danny Allan: Nicolas Dupont: So you just mentioned granular encryption. I guess if you bring this back to the developers, a developer is writing an application, and it's backed by an LLM, and it has vector data storage and vector databases behind it, what are your pragmatic suggestions for developers to ensure they have the security and controls in place that they need to protect the customers and users of the platform?

[0:26:52] Nicolas Dupont: Yeah. I mean, great question. I think it's look at the tools that you're planning to use, right? A vector database, a graph database, whatever your store is, whether you're doing RAG, graph RAG, whatever your flavour is. At the end of the day, the threat model is all largely the same, right? You're injecting context into a prompt, into a system prompt, whatever that is, in order to be able to augment the LLM's knowledge or context at that point in time to be able to produce the desired output or desired action, right?

And so that central point of the vector store of the GraphDB that you're using, look at their security docs, look at their posture. See whether they've invested time and really thought about this problem. Look for terms like encryption in use, anything like that. I would advise most engineers not to try to roll your own crypto, as they say, to implement it yourself. It's a difficult thing to do securely. And I would just say pick the tool that's going to provide that security from the ground up. Because at the end of the day, we are moving towards commoditisation on a lot of these applications, which is good. At the end of the day, one day an LLM is going to be an LLM. A vector database is going to be a vector database.

And so, the tool that you use matters less as long as it breaches the table stakes requirements that you have. But security right now is something that very few address. So look for those terms. And make that decision early on as opposed to later when you're actually facing the risk, because it's much harder to secure it after the fact.

[0:28:30] Danny Allan: Do you think it's going to take a major breach and a compromise to wake the industry up to this? Or do you think that people will become aware of this without that major breach taking place? The reason I ask is one of the things that worries me is that we don't seem to learn about things until someone drives a car off a cliff. Do you think that's the case here? Or do you think that this is similar enough to relational databases that people will understand this and begin implementing even before we have that reverse engineering of a vector store?

[0:29:02] Nicolas Dupont: My optimistic hope is the latter. My realistic fear is the fore. It's that you know the fact that people know how to secure relational databases can actually cut against this by thinking, "Oh, well, my standard – my Postgres installation is secure. I've got a firewall around it. I'm good to go. I haven't breached on this." Right? And so that standard cyber hygiene is good to have, but it's table stakes, and it does not necessarily apply one-to-one in these enterprise AI workflows. I think that can actually hurt the cause.

And unfortunately, we're in a weird position where we don't want these breaches to happen, but it's going to be good for business at the end of the day. So we want people to wake up to the risk and address it because it's much easier to do so now than it is in the future when you're still pre-production with your enterprise AI applications. But we've seen this with basically every attack vector until it happens in the wild, and there are dollars assigned to fix it. It's usually pretty low in the list of CISOs that are swamped with many other risks to address.

[0:30:14] Danny Allan: Yeah. And what I really hope is not the case is that where organisations wait until there's regulations. Because to your point earlier, they often lag by a year or more sometimes, because legislators just – they're not aware of how all of this works. And so shining the light on it early and thinking about it early, secured inception is what we call it, is an important part of this.

[0:30:36] Nicolas Dupont: Right? It's the only way to build securely. I mean, we still haven't regulated email today. So, I'm not hopeful that we'll have enterprise AI figured out soon.

[0:30:45] Danny Allan: We haven't figured out spam. And it's been 30 years I've been getting spam in my inbox, Nicolas. I don't know. This world is crazy. What makes you most optimistic or excited about the future? You're obviously neck deep in all these security challenges and problems. Is there something that you are truly excited about that you think will change the trajectory of the industry?

[0:31:04] Nicolas Dupont: Yeah. I mean, I'm hugely optimistic about the AI for security. That is more in your neck of the woods. I think it will minimise the – it will shorten the cat and mouse game that is largely played by attackers and defenders by red and blue teams. And I'm really excited to see that. And making security more accessible to non-security developers by having more of that data out there, by enabling these agents, by having these scanning capabilities that allow you to figure out, "Hey, this is vulnerable to prompt injection," without knowing necessarily how to identify prompt injection. I think those make me really optimistic for the future of both the security industry and the technology industry, and every industry that it serves as a result.

[0:31:56] Danny Allan: That is true. And there's no one more excited than myself about that because we use AI for security for discovery of vulnerabilities, for reachability of vulnerabilities, for breakability of updates that we generate. And actually, for generative AI. Generating fixes for these things. It makes it way more accessible. However, and this is my but, the attackers are starting to use AI now as well. I was at an event earlier this week put on by Google that has huge cyber research. And they're seeing very sophisticated AI not only for security but for malicious attackers as well. And so while we're building security in, we should be very aware of the fact that attackers are doing the same thing on the attack side, which makes way more complex breaches, and attacks, and threat vectors that we're facing.

[0:32:48] Nicolas Dupont: That's a very good point. I mean, at least the commonality across all this is that you and I will likely still have a job next year.

[0:32:56] Danny Allan: Security is not going to go away. That is what I always say. Well, this has been very fascinating, Nicolas. Thank you for the insights on this. I have to confess this is an area that I haven't looked into, but I know it will be interesting for all of our developers. If people want to reach you, what's the best way to reach you? LinkedIn, Twitter? Where do they find you?

[0:33:12] Nicolas Dupont: Yeah, I'm on LinkedIn. That's definitely the best way. Nicolas Dupont Cyborg. We're at cyborg.co. as well. And yeah, if you have any questions around vector embedding vulnerabilities, look at the OWASP's Top 10 LLM risks. We're number eight on there. And come talk to us.

[0:33:30] Danny Allan: Yeah, and it's well worthwhile. I saw a demo of Nicolas reverse-engineering some of these things and actually causing a breach. If you're interested, you can look online and see some of those videos. But Nicolas, thank you for joining us today on The Secure Developer. And to our audience, I just say thanks for joining. And we'll see you next time on the next episode of The Secure Developer.

[0:33:50] Nicolas Dupont: Thank you, Danny.

[0:33:54] Guy Podjarny: Thanks for tuning in to the Secure Developer, brought to you by Snyk. We hope this episode gave you new insights and strategies to help you champion security in your organisation. If you like these conversations, please leave us a review on iTunes, Spotify, or wherever you get your podcasts, and share the episode with fellow security leaders who might benefit from our discussions. We'd love to hear your recommendations for future guests, topics, or any feedback you might have to help us get better. Please contact us by connecting with us on LinkedIn under our Snyk account or by emailing us at thesecuredev@snyk.io. That's it for now. I hope you join us for the next one.

Up next

You're all caught up with the latest episodes!