Episode 169

Season 10, Episode 169

Retrieval-Augmented Generation With Bob Remeika From Ragie

Hosts
Headshot of Danny Allan

Danny Allan

Guests
Headshot of Bob Remeika

Bob Remeika


Episode Summary

Bob Remeika, CEO and Co-Founder of Ragie, joins host Danny Allan to demystify Retrieval-Augmented Generation (RAG) and its role in building secure, powerful AI applications. They explore the nuances of RAG, differentiating it from fine-tuning, and discuss how it handles diverse data types while mitigating performance challenges. The conversation also covers the rise of AI agents, security best practices like data segmentation, and the exciting future of AI in amplifying developer productivity.

Show Notes

In this episode of The Secure Developer, host Danny Allan is joined by Bob Remeika, co-founder and CEO of Ragie, a company focused on providing a RAG-as-a-Service platform for developers. The conversation dives deep into Retrieval-Augmented Generation (RAG) and its practical applications in the AI world.

Bob explains RAG as a method for providing context to large language models (LLMs) that they have not been trained on. This is particularly useful for things like a company's internal data, such as a parental leave policy, that would be unknown to a public model. The discussion differentiates RAG from fine-tuning an LLM, highlighting that RAG doesn't require a training step, making it a simple way to start building an AI application. The conversation also covers the challenges of working with RAG, including the variety of data formats (like text, audio, and video) that need to be processed and the potential for performance slowdowns with large datasets.

The episode also explores the most common use cases for RAG-based systems, such as building internal chatbots and creating AI-powered applications for users. Bob addresses critical security concerns, including how to manage authorization and prevent unauthorized access to data using techniques like data segmentation and metadata tagging. The discussion then moves to the concept of "agents," which Bob defines as multi-step, action-oriented AI systems. Bob and Danny discuss how a multi-step approach with agents can help mitigate hallucinations by building in verification steps. Finally, they touch on the future of AI, with Bob expressing excitement about the "super leverage" that AI provides to amplify developer productivity, allowing them to get 10x more done with a smaller team. Bob and Danny both agree that AI isn't going to replace developers, but rather make them more valuable by enabling them to be more productive.


Danny Allan: “First of all, I always wonder, what is an agent by definition of that person? And number two, what are the use cases? And I guess maybe those two questions for you. How do you define an agent first of all? And what are the agents that you're beginning to see in practice today?”

Bob Remeika: “People disagree, right? On agentic, the word agentic, right? And I think a lot of people tend to think about agents as being fully autonomous. And that's great, but it doesn't mean that you're not building an agent if you're doing something that's maybe like multi-shot. I guess I would probably draw the line there, right? Where it's like, "Okay. Well, if you're taking the output from one step and feeding it into another, then you're starting to build an agent." Whether or not it's fully autonomous, I don't know. It's semantics.”

[0:00:50] Guy Podjarny: You are listening to The Secure Developer, where we speak to industry leaders and experts about the past, present, and future of DevSecOps and AI security. We aim to help you bring developers and security together to build secure applications while moving fast and having fun.

This podcast is brought to you by Snyk. Snyk's developer security platform helps developers build secure applications without slowing down. Snyk makes it easy to find and fix vulnerabilities in code, open source dependencies, containers, and infrastructure as code, all while providing actionable security insights and administration capabilities. To learn more, visit snyk.io/tsd.

[INTERVIEW]

[0:01:30] Danny Allan: Hello, and welcome to another episode of The Secure Developer. I am your host, Danny Allan. And I am super excited to be joined today by the co-founder and CEO of a company called Ragie. His name is Bob Remeika. And I'm going to let him introduce himself. And we're going to dive into all things related to AI, as we have been doing, and retrieval-augmented generation. But Bob, welcome to the show. How are you?

[0:01:54] Bob Remeika: Good. Thanks, Danny. Thanks for having me.

[0:01:57] Danny Allan: Maybe you can just share with the audience a little bit about your background and how you got to be at Ragie, and kind of the journey leading up to where we are today.

[0:02:08] Bob Remeika: Yeah, I mean, it's kind of a long story. I'll try to keep it very short. I think with Ragie – I had built several AI applications. My co-founder, Mohammed, was doing a lot of consulting. What we were trying to do was ultimately build an application, and we found ourselves working on the plumbing for RAG pipelines for about 90% of what we were trying to build. It just kind of made sense that somebody would provide some infrastructure for that, make it very easy to use, very developer-friendly, and that's kind of how Ragie was born. There's a long winding road to get there, but that's the very short CliffsNotes version.

[0:02:49] Danny Allan: Well, I think people would have to be living under a rock to not know what AI is and retrieval-augmented generation. But maybe just to set the framework, the scaffolding for the conversation, what is RAG and retrieval-augmented generation?

[0:03:03] Bob Remeika: Retrieval augmented generation is a way of providing context for LLMs that they haven't been trained on. For example, if you were trying to ask ChatGPT about your company's internal parental leave policy, it doesn't know anything about that. But you can use something like RAG where you retrieve data from your systems to augment your prompt and ultimately generate a contextual answer that makes more sense for the question that you asked.
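The retrieve-augment-generate loop Bob describes can be sketched in a few lines. This is an illustrative toy, not the Ragie API: `retrieve` here is a crude word-overlap ranker standing in for a real retriever, and in practice the finished prompt would then be sent to an LLM for the generation step.

```python
def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the query.
    # A real system would use embeddings and/or a keyword index.
    q_words = set(query.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda kv: len(q_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def build_prompt(query: str, documents: dict[str, str]) -> str:
    # Augment the user's question with retrieved context before generation.
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = {
    "leave": "Our parental leave policy grants 16 weeks of paid leave.",
    "travel": "Travel must be booked through the approved agency.",
}
prompt = build_prompt("What is the parental leave policy?", docs)
# The prompt now carries the internal policy text a public model never saw.
```

The augmented prompt, rather than the raw question, is what gets sent to the model; that grounding is the whole trick.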

[0:03:34] Danny Allan: And how is that different from fine-tuning an LLM? Because sometimes you hear about fine-tuning an LLM, sometimes you hear about RAG. What are the differences, the pros and cons of each approach?

[0:03:45] Bob Remeika: Well, RAG doesn't require a training step, which is ultimately what you're trying to do with fine-tuning. It's a really simple way to get started with building an AI application. There are places where fine-tuning might help your application. But for the most part, you can get started with RAG. And if you need to, you can always go to fine-tuning.

[0:04:05] Danny Allan: Okay, that makes sense. And what about format? Obviously, in RAG, you're giving it additional context. Is it limited to the same context of the model itself? Can you give it audio? Can you give it video? Can you give it text? What is the data feeding into a RAG system look like?

[0:04:24] Bob Remeika: Yeah. There's a whole thing here with context engineering, and taking different formats, and trying to provide them in such a way that LLMs get the best possible information to help generate responses. What we do with Ragie is – one of the layers of Ragie is, obviously, you can send us any kind of data. Just think of any file, right? We've seen all kinds of things. We've seen stuff from old Canon digital cameras come through. And the data can come through in a variety of file formats: PDF, DOCX, XLS, all the different kinds of file formats that you have to read. But then, also, it can come through in different modes, right?

Often, you're working with documents. But if you need to do something with audio, then now you have a whole different set of challenges when it comes to what sort of audio file can I decode? How should I index this in the best possible way, so that when I go to retrieve it, I'm getting relevant chunks? And so that's really hard, along with video, which obviously video is a whole can of worms. There's the audio component to it. It's multi-channel. And so these are all things that we do a really good job with at Ragie. When you send us pretty much anything, we can decode it and get it into a format for you to retrieve. But there's a lot of work that goes into that.

[0:05:48] Danny Allan: Does it slow down the transformers when you put in the tokens to generate responses if you have a large RAG database, which we can get into, where the data is stored? But does it slow things down using RAG versus fine-tuning?

[0:06:06] Bob Remeika: With RAG, you can – long context windows are a whole new beast, which I'm not sure if you've gone deep on this, but I kind of have. With something like Llama 4 Scout, for example, it has a 10-million-token context window. If you do some very rough math, it equates to around 40 megabytes of data that you can send over into the context window. If you try to send 40 megabytes of context into an LLM, you're going to have a tough time even getting a response back right now. You need to run on very specialized hardware. You need to have networked GPUs.

I actually tried to do this, and I was very disappointed. I couldn't really get anything to generate. But, yeah, you can definitely see more slowdowns. There's real accuracy, latency, and cost trade-offs when it comes to long context. Fine-tuning is a little bit different because it's more on the inference side. You've already done the work. But, yeah, both approaches have their pluses and minuses.
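Bob's rough math checks out under the common rule of thumb of roughly four characters (about four bytes of plain ASCII text) per token:

```python
# Back-of-envelope check of the 10-million-token context window figure.
TOKENS = 10_000_000
BYTES_PER_TOKEN = 4  # rough rule of thumb: ~4 characters per token
megabytes = TOKENS * BYTES_PER_TOKEN / 1_000_000
print(megabytes)  # 40.0
```

Actual tokens-per-byte varies by tokenizer and by language, so treat 40 MB as an order-of-magnitude estimate, not an exact figure.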

[0:07:13] Danny Allan: Long context windows or large amounts of data to feed in, it sounds like it's actually more performant with RAG versus trying to put it into an existing model, unless you've pre-fine-tuned the model.

[0:07:27] Bob Remeika: Yeah. Like I said, Danny, if you try to stuff 40 megabytes of context into a prompt, you're going to have a bad time. But most people aren't trying to do that. RAG is a really good tool for you to sort of slim down your context window. I think with long context windows, all it does is really just changes a little bit about how we think about RAG and the amount of data that we can put into it. I mean, it wasn't that long ago that we were dealing with context windows that were like 8,000 tokens. Now we're talking about millions. So, yeah.

[0:07:58] Danny Allan: And do you find organizations – go back two years, and people weren't using AI. And I think now, almost every organization, certainly, I'm sure you are at Ragie, we are here at Snyk, using AI internally. Are most of those organizations training or extending their context using a RAG-based model or just taking a standard model and doing fine-tuning? Do you have a sense of where the industry lies right now in terms of adding context to their local internal LLMs?

[0:08:33] Bob Remeika: I think it really varies. I've seen with larger companies, they're kind of like trying to find the use case oftentimes. I think, like talking to banks, for example. They're still getting their feet wet. And it might be a little bit of bias because we are Ragie. We are RAG as a service. When companies seek us out, generally, they think, "Oh, maybe I need some sort of RAG solution." They've probably heard about RAG in the past. So, we see a lot more RAG, obviously. There are cases where I think fine-tuning could help people in a domain-specific vertical. In which case, fine-tuning will help. But you can use fine-tuning in combination with RAG. You're not limited necessarily to one or the other technique. But we see a lot of RAG, obviously.

[0:09:25] Danny Allan: Yeah. And when you're using RAG or when a company is using a hosted RAG service like your own, are they sending you their data, or is it staying within their bubble on-premises? How does it actually get pulled into the system to get used?

[0:09:39] Bob Remeika: Yeah. We are a very, very simple platform. And this is one of the things that I think is actually a value prop for Ragie. You send us your data. We're fully managed. We will stay up to date with the latest and greatest when it comes to parsing, chunking, indexing, and retrieval. We're sort of like that internal AI team that your company may or may not have, where we're kind of doing that research constantly and trying to find out the best ways to get you back your generations. That's how people typically use Ragie. We also offer a VPC solution. If you need to have your data separate, with your own dedicated VPC, that's something that we can also accommodate.

[0:10:27] Danny Allan: What do you see as the most common use cases for companies today? Is it they're creating internal chatbots for productivity? Everyone is saying, "Hey, I'm using AI. And we're building agents." And we'll get into agents in a moment here. But what is the most common use cases that you're seeing across your customer base or across the industry at the moment?

[0:10:49] Bob Remeika: We surface two very common use cases. I'd say probably one of the biggest use cases, if somebody's trying to build an internal chatbot, they're usually trying to chat with their data. And I think maybe last year, it was probably more read-only, right? People were building applications that really just needed access to the data. I think we're seeing now more people wanting to act on that data as well. That's very interesting from a use case standpoint.

And then the other thing that we service, and I think it's one of the things that really sets us apart, is that people are trying to build applications for their users. If you need to build an application, or you want to add AI into your application, Ragie makes it really easy for you to do that using something called Ragie Connect, which you can think of as: I click a button and now I'm asking my user to connect their data. It automatically gets synced, and now it's usable by my application.

[0:11:53] Danny Allan: How do you control the authorization in that world? Let's say they connect that button, and it pulls in the data for the application, or you're doing it for an internal system. How do you prevent the one employee from getting access to the CFO's financial directory of information? How does that work in a Ragie-based world, or just any SaaS kind of based world?

[0:12:19] Bob Remeika: It's a very common question. We do this in a few different ways. One, we offer data segmentation. You can separate your HR data from your finance data using partitions, which draw a hard logical separation around your data. You can't access data across partitions. Another way is to use metadata. You can tag your data on the way into Ragie, and you can filter on the way out. And you can oftentimes build an RBAC-type system using just that primitive.
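The tag-on-ingest, filter-on-retrieve pattern Bob mentions can be sketched with an in-memory index. This is a hypothetical illustration, not Ragie's SDK; in a real system the metadata filter would be applied inside the vector or keyword query itself.

```python
index: list[dict] = []

def ingest(text: str, metadata: dict) -> None:
    # Tag data on the way in.
    index.append({"text": text, "metadata": metadata})

def query(user_roles: set[str]) -> list[str]:
    # Filter on the way out: only return chunks whose required role
    # appears in the caller's role set -- a minimal RBAC primitive.
    return [c["text"] for c in index if c["metadata"]["role"] in user_roles]

ingest("Q3 revenue was up 12%", {"role": "finance"})
ingest("New-hire onboarding checklist", {"role": "hr"})

finance_results = query({"finance"})  # sees only the finance chunk
hr_results = query({"hr"})            # sees only the HR chunk
```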

And then the final way is what we call source-level permissioning. And so the way this works is, let's say you've connected Google Drive, right? A user gets retrievals back. They get their chunks back from Google Drive. There's a post-filter check that basically says, "For each one of these chunks, does this user have access to this document?"

Where that's really nice is when you've set some permissions in Google Drive, or you've set something in Jira, and you want to honor the source-level permissioning. And so those are the ways that people typically make sure that authorization is handled.
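The source-level post-filter Bob describes amounts to a per-chunk permission check after retrieval. A sketch, with `drive_acl` standing in for a real Google Drive or Jira permission lookup:

```python
# Hypothetical ACL; a real check would call the source system's API.
drive_acl = {"doc-roadmap": {"alice", "bob"}, "doc-salaries": {"alice"}}

def has_access(user: str, doc_id: str) -> bool:
    return user in drive_acl.get(doc_id, set())

def post_filter(user: str, chunks: list[dict]) -> list[dict]:
    # For each retrieved chunk: does this user have access to its document?
    return [c for c in chunks if has_access(user, c["doc_id"])]

retrieved = [
    {"doc_id": "doc-roadmap", "text": "Q4 roadmap priorities"},
    {"doc_id": "doc-salaries", "text": "Compensation bands"},
]
bob_sees = post_filter("bob", retrieved)  # the salaries chunk is dropped
```

One design note: because the check runs after retrieval, a filtered-out chunk can shrink the result set below the requested top-k, so real systems often over-fetch and then filter.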

[0:13:42] Danny Allan: And what about abstraction? I can imagine a world where, I don't know, you want to expose this to your customers. And so you feed in all the customer support cases. But, of course, you don't want one customer accessing support cases from another one. You want to abstract the specific details. Are there ways to put the data in, but abstract it in such a way that it doesn't get connected with a specific account or specific user? Is that something that exists by default right now in the AI RAG-based world?

[0:14:10] Bob Remeika: I think a lot of times people handle that with metadata and filtering. We have customers that are doing this right now. Typically, they'll use a combination of partitions plus metadata, and filtering. But we also have people that want to be able to access data across their entire knowledge base. And so, in that case, they just use metadata and filtering.

[0:14:31] Danny Allan: Okay, makes sense. That's one use case, is people are pulling in the data and creating chatbots to make productivity. You said the second one is agents. I'm super interested in this because I hear so much, "We're leveraging agents all over the company." But, first of all, I always wonder, what is an agent by definition of that person? And number two, what are the use cases? And I guess maybe those two questions for you. How do you define an agent first of all? And what are the agents that you're beginning to see in practice today?

[0:14:59] Bob Remeika: People disagree, right? On agentic, the word agentic, right? And I think a lot of people tend to think about agents as being fully autonomous. And that's great, but it doesn't mean that you're not building an agent if you're doing something that's maybe like multi-shot. I guess I would probably draw the line there, right? Where it's like, "Okay. Well, if you're taking the output from one step and feeding it into another, then you're starting to build an agent." Whether or not it's fully autonomous, I don't know. It's semantics.

[0:15:34] Danny Allan: Yeah, I tend to think of agents – my own kind of version of it is agents typically take an action of some type as opposed to providing read-only context, as you said earlier. What are the types of agents that you're beginning to see? Just maybe some practical examples for our audience.

[0:15:52] Bob Remeika: Well, I guess Danny, my only push back on what you just said, actually, is that you can build agentic flows that are read-only. A good example is like what we're working on right now. We're basically building agentic retrieval. And what we're doing there is, under the hood, we're taking a complex query and we're basically breaking it down into multiple steps. And each step has some concept of memory built into it, right?

Let's say you're traversing a tree of questions, right? Your first question is very general. We've now broken this down into four different steps. Each one of them with their own retrieval step, right? And each one of those steps comes back with some sort of tidbit of information that helps answer the original question. We roll this all up, and we're able to answer that first pretty wide general question that you weren't able to answer with just a single-shot query. And so in that world, it's not really read-write. You're not necessarily taking action, but it is multi-step.
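The decompose-retrieve-roll-up flow Bob outlines can be sketched as a loop with a small memory. Everything here is a stand-in: in a real agent, `decompose` and the final roll-up would be LLM calls, and each sub-question would hit the retrieval pipeline rather than a dictionary.

```python
knowledge_base = {
    "What were Q1 sales?": "Q1 sales: 1,200 units",
    "What were Q2 sales?": "Q2 sales: 1,500 units",
}

def decompose(question: str) -> list[str]:
    # Stand-in for an LLM breaking a broad question into sub-questions.
    return ["What were Q1 sales?", "What were Q2 sales?"]

def answer(question: str) -> str:
    memory: list[str] = []  # each step remembers what earlier steps found
    for sub_q in decompose(question):
        memory.append(knowledge_base[sub_q])  # per-step retrieval
    # Roll the tidbits up into one answer to the original wide question.
    return " | ".join(memory)

result = answer("How did sales trend in the first half?")
```

The output of each step feeds the next via `memory`, which is exactly the "output from one step into another" line Bob draws for calling something an agent.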

[0:16:57] Danny Allan: Yeah, that makes total sense to me because they're kind of sub-agent or sub-tasks, if you will, to the parent one. Do you worry more or less about hallucinations in that world? Or will one of those agents be basically verifying the other three? Is there less concern, I guess, in that space around hallucinations, which I know sometimes comes up in the AI space?

[0:17:21] Bob Remeika: I think it's definitely a concern. I mean, what we're seeing so far in our initial testing is that many more answers come back correct. For example, one of the eval frameworks that we've been using to test how good our retrieval is, is called FinanceBench. And FinanceBench is actually a really hard data set to get correct answers out of. Think of it as a bunch of open public financials data from a ton of different companies, like 10-Ks, that sort of thing, right? And the questions are pretty crazy in some respects. Sometimes they require actually farming out to a traditional search to get more context so you can understand terminology and that sort of thing.

We're seeing on those results – and it's still kind of early, but we're seeing a massive improvement, a massive improvement in the kinds of questions that people are able to ask and get contextual answers to, right? But with every step, there is a potential for a hallucination. And so if it goes off the rails, it could go off the rails. We're doing a lot to sort of have these additional checks, like you mentioned, to make sure that it's not going off the rails. But, yeah, there are challenges for sure.

[0:18:45] Danny Allan: Yeah. I actually tend to think of those multi-process AI tasks as being more secure in some ways because you can have one as a bit check. In the old days, you'd have your bit checker on the computer to check the RAM to make sure a bit didn't flip. You can essentially use one of those tasks to verify the other tasks for building in security to the AI, to the agentics flows of the system. Anyway, we'll see. It's definitely an evolving and interesting space. I definitely hear of hallucination as less of a concern now than it used to be. I don't know whether you're seeing the same thing.

[0:19:26] Bob Remeika: Well, I think, generally, RAG systems help with hallucinations in general. We tend to have pretty good results just using our regular retrieval, partially because we're grounding those generations in actual facts. Assuming that we're getting back the correct chunks, which we also do. There's reranker techniques that we use as well to make sure that the chunks are actually relevant. Yeah, generally, the hallucinations are improving. I think models are getting better too. That helps a lot. So, yeah.

[0:19:58] Danny Allan: What are the best models right now that you're seeing from your customers? Or are you allowed to have favourites?

[0:20:04] Bob Remeika: Well, I don't know. Look, I'm going to get in trouble here with my team probably. I don't know. Actually, I don't think I'll get in trouble. There is one person that's a super Grok fan. I'll just tell you. Super Grok. I don't use Grok a lot personally, but I've been using Sonnet. I am probably a huge fan of Sonnet at this point. But we use OpenAI for a ton of stuff as well, and Gemini.

It kind of depends on what I'm trying to do. I think, typically, I'll use Gemini if I'm trying to write something, produce a blog or something like that. I love Sonnet for code-related tasks. OpenAI is still probably my go-to as my Google search replacement at the moment. I'm not necessarily on the Perplexity train right now. I've been using OpenAI for a lot of that. But I'm also a huge fan of OpenAI's faster, smaller models. I use those a ton for the kind of steps you were talking about before, where you have "is this a correct answer or not" type prompts. I've been using o4-mini for a lot of that stuff. It's just really fast and cheap.

[0:21:17] Danny Allan: Well, and that jibes with what I'm seeing. For my personal search queries on the internet, I tend to use OpenAI as well. But when you're talking code, and this is my own personal experience, I found Claude Opus or Sonnet to be the best. And then I know a lot of people internally swear by Gemini 2.5 as well. But, certainly, the models are evolving very, very quickly. And I think that will address a lot of the hallucinations. When it comes to security, are you seeing people building security into the agents, or not worrying about security and treating it as an afterthought? How do you see security in the context of Ragie, or people building things on top of Ragie?

[0:21:59] Bob Remeika: It's definitely not an afterthought, especially when it comes to larger customers, especially ones that are concerned about where your data lives and how it's handled. I think we see a lot of questions being raised when it comes to data on input. We're seeing people want to redact PII, for example, that sort of thing. There are other vectors, as you probably know very well, Danny. I get asked about these less, right? Prompt injection, that sort of thing. I think these are things that are starting to come up more as people learn what the different attack vectors are. But we're seeing it mostly, I'd say, when it comes to how we're handling data.

[0:22:44] Danny Allan: Yeah, prompt injection tends to be at the top of the list, they'll say. People are trying to break out of what the model has been restricted to do. And it's a hard problem, quite honestly. Because if you think of the traditional days of security, which was very deterministic, you'd go out of bounds on what it was supposed to accept, and it would cause an error or cause a problem. Of course, everything is accepted in an AI world; what matters is how it's accepted and what is done with it. Non-deterministic systems are far harder to measure how they react to things, especially in a sequential world.

How do you measure — so a customer comes to Ragie or comes to a RAG system and is building an AI native application. How do they measure return on investment of this new application? Obviously, they're investing in it. How do they determine whether it's actually giving them a financial return? Or maybe that's not what they're looking for? One of the issues that I've seen in our customer base is, "Hey, I want to adopt AI. I want to use it." But they don't have – well, the question is to you. What is the measurement stick?

[0:23:50] Bob Remeika: Well, I think it's very interesting that you asked that because, I think, there's sort of a psychology that comes with developer tools where a lot of times people are thinking in terms of usage-based pricing, etc., right? Am I getting my $100 worth here? That's very typical when it comes to developer tools, right? And you're typically working with engineers, and they're concerned about how much disk space they're going to take up.

I think oftentimes people don't even know yet what they're building toward. I mean, this is also something that we run into quite a bit, that people are building something brand-new for the first time. So they don't know what the ROI is going to be on this yet. But, really, when I talk about Ragie and what people are building with their application, I try to put it in terms of, "Well, how much person time are you saving?" Right? Is the application that you're building going to add so much value that the ultimate costs of running the generations, or maybe using a service like Ragie, is actually negligible?

I mean, I think people are starting to get that as they build more and they see more value and they're like, "Hey, I no longer have to do this task anymore. I just have an agent doing it." And then you really see the value. But until you've done that, it's kind of hard to get past that. And I think because people are building all these brand-new applications, they maybe haven't seen that yet. But they will.

[0:25:14] Danny Allan: Yeah, it's interesting. It's one of the times when I see the enterprises actually outpacing the smaller businesses, because they have a better handle sometimes on what the costs are of those people resources. Whereas the smaller companies just know that they're spending $10,000, $50,000 on this new system, and are unsure how to quantify what the return is on that.

On RAG-based systems – and I know I'm pivoting a little bit. But this is more for my own information than anything else. Where is the data stored? Does it go into a vector database? And how does the future of vector databases impact a RAG-based world?

[0:25:50] Bob Remeika: That's a great question, Danny. I'm glad you asked, because I don't think we're just talking about vector databases anymore. A vector database is one way to search. I often think it's not sufficient on its own. And so when I pitch Ragie, and go back to the beginning of the show where I mentioned indexing data, I use the word index intentionally, because we do a lot more than just vector database at this point, right?

I think table stakes is at least hybrid, which is what we do by default. We'll have our vector store, obviously, and then a BM25 keyword index as well. And that's important because if you ask something like, "What is a TS99 error code?" Right? With just a pure semantic search or vector search, you might get something back pertaining to all error codes, right? But that's not what you care about. You care about TS99. That's where the keyword matching helps.
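That TS99 example is easy to sketch. Below, `semantic_score` and `keyword_score` are toy stand-ins (word overlap in place of embedding similarity, raw term hits in place of BM25), but they show why the hybrid sum ranks the exact-match document first:

```python
docs = ["TS99 error on disk write", "General guide to error codes"]

def semantic_score(query: str, doc: str) -> float:
    # Stand-in for cosine similarity over embeddings: overlap ratio.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def keyword_score(query: str, doc: str) -> float:
    # Stand-in for a BM25 keyword index: count exact term hits.
    return sum(1.0 for t in query.lower().split() if t in doc.lower().split())

def hybrid_search(query: str) -> list[str]:
    # Table stakes per Bob: combine both signals rather than rely on one.
    scored = [(semantic_score(query, d) + keyword_score(query, d), d) for d in docs]
    return [d for _, d in sorted(scored, reverse=True)]

results = hybrid_search("TS99 error")  # the exact-match doc ranks first
```

Production systems typically fuse the two rankings with something like reciprocal rank fusion instead of a raw sum, but the principle is the same: the keyword side rescues rare exact terms like "TS99" that semantic search alone can blur into "all error codes."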

But you can go further on this, and we do go further on this. We do something called hierarchical search as well, where you can basically take – a very common problem in RAG applications is that you'll get a cluster of results in one document or from one document. Let's say if you ask for a top k of eight, you want eight chunks back, and seven of them match one document, and maybe one matches another document, but you know you have 10 other documents that have relevant context, you're going to lose that context. We do something called hierarchical search, which is really like a two-step process to make sure that we can spread out the results across documents and give you a more diverse answer. This is just like one of many techniques that you can employ to build these applications. And so vectors, yes, important, very important, but not the only thing that you need.
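One way to picture the two-step idea: cap how many chunks any single document can contribute to the top-k, so results spread across the knowledge base. This is a minimal sketch of that diversification, not Ragie's actual implementation, and the scores are given directly rather than computed:

```python
chunks = [
    {"doc": "A", "text": "a1", "score": 0.95},
    {"doc": "A", "text": "a2", "score": 0.94},
    {"doc": "A", "text": "a3", "score": 0.93},
    {"doc": "B", "text": "b1", "score": 0.90},
    {"doc": "C", "text": "c1", "score": 0.88},
]

def diverse_top_k(chunks: list[dict], k: int, per_doc: int = 1) -> list[dict]:
    # Walk chunks best-first, but let each document contribute at most
    # `per_doc` chunks, so one document can't crowd out the others.
    picked: list[dict] = []
    counts: dict[str, int] = {}
    for c in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if counts.get(c["doc"], 0) < per_doc:
            picked.append(c)
            counts[c["doc"]] = counts.get(c["doc"], 0) + 1
        if len(picked) == k:
            break
    return picked

results = diverse_top_k(chunks, k=3)  # one chunk each from A, B, C
```

Without the cap, a plain top-3 here would return three chunks from document A and lose the context sitting in B and C, which is exactly the clustering problem Bob describes.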

[0:27:45] Danny Allan: What are the limitations right now in the industry? Or what is holding us back from unlocking greater context, greater smarts, AGI if you want to go all the way to that? But what is the technology limitation for us at the moment? Is it the context windows themselves?

[0:28:04] Bob Remeika: I don't think so. I mean, I think, yeah, context window is definitely a thing. But I don't see customers dropping 10 million tokens into a prompt or even expecting to be able to do that. I feel like that's pretty sufficient at this point. Honestly, what I think the limitation is, is that when we started building RAG applications, we started with this vector search, right? And so that's kind of where everything began, right?

And then the promise of AI really hit, and it was like, "Oh, I can ask any question, and AI will tell me the answer." And that's just not how it works, right? But that's what people think, and that's what we would like to answer as well, right? I think the next step is really – and this is why we're building this. I mean, we're working on agentic retrieval right now to answer those questions. How many units did we sell last quarter, right? And provide not only an accurate answer, but also one that's deeply contextual, with data retrieved from your knowledge base as well.

[0:29:09] Danny Allan: What makes you most excited about the future? If you look out on where we are today, where do you think all of this is going in the long term?

[0:29:18] Bob Remeika: I got asked this recently, and my answer is super leverage. I love the idea of being able to take something that maybe I'm already good at, let's just say it's code, and this could be a lot of different things for different people. For salespeople, it could be prospecting or whatever, right? I like to be able to have these agents that amplify how much I can get done.

My favourite story is, just the other day, I was building two features at the same time in Cursor, just going, "Wow, this is incredible. This is such a massive time savings." I mean, what I was personally building probably would have taken me a week or two. I was able to get it done in a few hours. I'm really excited about that.

[0:30:06] Danny Allan: Do you think companies take that – everyone says AI is coming for our jobs, and it's going to reduce the number of developers. Well, I won't say quite yet what I believe.

[0:30:15] Bob Remeika: Sorry, I already jumped in.

[0:30:17] Danny Allan: No, no, no. But do you think that AI then is going to reduce the number of jobs needed, or is it just going to unlock potential that was not possible in the past?

[0:30:27] Bob Remeika: I mean, I'm not going to speak to all jobs. I just haven't done all jobs. Maybe I'll just stay in the developer lane for a minute here. No, it's not going to take your job. If anything, people that know how to leverage AI in their job are just going to be that much more valuable. And for companies, it's going to be a massive time saver as well as a value add, because you're going to be able to get, let's just say, 10x more done than you could have on your own with a smaller number of people. If you're getting 10x more done, why would you lay people off? You know what I'm saying, Danny?

[0:31:07] Danny Allan: Yes. Yeah. Yeah. For sure. It allows your company to grow that much faster and further than all the others. Everyone says that the moat in the AI world is speed. And I think that things – well, I know that things are moving far, far faster because we're so much more productive. But there's still technology differentiation underneath. It's not purely speed as the differentiator. It still is how do you use RAG in an effective way? That's what you're doing at Ragie. Or for us, how do you do security testing? That is still very much a fundamental differentiator. It's just that you're going so much faster because you're using AI.

[0:31:44] Bob Remeika: I'd add to that by saying that you can take somebody who maybe doesn't have 20 years of developer experience or whatever. And, all of a sudden, they vibe code, but they're just making a bunch of trash.

[0:31:59] Danny Allan: Yes.

[0:32:01] Bob Remeika: If you kind of already have a sense for what you're doing, now you can just take this tool and amplify your productivity.

[0:32:11] Danny Allan: Yeah, it's definitely true. I would consider myself very well-versed in Java, and C#, and Python, but I don't know Rust at all. But the other day I was vibe coding. I was trying something out. I wanted to build it in Rust. And so I was using AI for Rust. And I would never have been able to do that before, and it certainly helped accelerate that.

[0:32:32] Bob Remeika: I talked to a hardcore Rust person the other day, and they told me that vibe coding with Rust was not possible.

[0:32:41] Danny Allan: Ah. Well, it was generating code for me at least.

[0:32:45] Bob Remeika: Did you run it, though?

[0:32:46] Danny Allan: It built. It did what it was supposed to do. I don't know that it was secure, which probably should concern me as CTO of Snyk here. But I was just doing it for personal – I like to play around with the vibe coding, the coding assistants, in my free time.

[0:33:04] Bob Remeika: And it's such a great tool to help you learn. I mean, to your point, you didn't necessarily know Rust. But you can just kind of go in and ask questions. And if you have a sense for how things probably should work, it's really easy to learn that way, too.

[0:33:18] Danny Allan: Yeah. But just to end on one question. If the audience should know one thing about RAG that they don't know today, what is it about RAG that is the biggest misconception or something that they should understand in how to leverage it most effectively?

[0:33:35] Bob Remeika: Biggest misconception. I feel like this is changing over time, Danny. Maybe last year, I would have said that RAG doesn't do everything for you. And to an extent, that is true, right? If you just hook up your Snowflake instance and it has a bunch of tables in it, and then you want to use RAG on that, well, that's a totally different architecture than what a traditional RAG system looks like. I think that's starting to change now, though, because of things like agentic search. For example, combining structured retrieval along with unstructured retrieval, I think that's opening things up. Maybe those misconceptions from about a year ago, those gaps are closing a little bit. But that's probably where I'd go with that.

There's also this misconception that you don't need to do much. You just connect your Salesforce data. But in order to index that well, you actually have to do a lot of work. You have to make some decisions about how that data will be used during retrieval. And so maybe my best answer for this is that it's not as simple as you think, but our goal is to make it simple.
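One concrete example of the indexing decisions Bob is describing: tagging documents with metadata at ingestion time so that retrieval can be filtered later, which is also how earlier parts of the conversation framed authorization via data segmentation. This is a generic, hypothetical sketch (simple keyword scoring in place of embeddings, and not Ragie's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    metadata: dict = field(default_factory=dict)

class Index:
    """Minimal index: store documents with metadata, filter at query time."""

    def __init__(self):
        self.docs: list[Document] = []

    def ingest(self, text: str, **metadata):
        # The up-front work: deciding which tags retrieval will filter on.
        self.docs.append(Document(text, metadata))

    def retrieve(self, query: str, **filters) -> list[str]:
        q = set(query.lower().split())
        # Metadata filters are applied BEFORE ranking, so out-of-scope
        # documents never reach the LLM's context at all.
        candidates = [
            d for d in self.docs
            if all(d.metadata.get(k) == v for k, v in filters.items())
        ]
        ranked = sorted(
            candidates,
            key=lambda d: len(q & set(d.text.lower().split())),
            reverse=True,
        )
        return [d.text for d in ranked]

idx = Index()
idx.ingest("Enterprise renewal playbook", source="salesforce", team="sales")
idx.ingest("Parental leave policy", source="hr_wiki", team="hr")

# Filtering by metadata keeps the sales doc out of an HR query entirely.
print(idx.retrieve("leave policy", team="hr"))
# → ['Parental leave policy']
```

The design point is that the filter runs inside retrieval rather than after it: segmentation decisions made at ingestion become hard boundaries at query time, instead of relying on the model to ignore documents a user should not see.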

[0:34:52] Danny Allan: That's good to know. And I would echo your comment. I mean, back earlier in the conversation, we were talking about RBAC and authorization, those types of things. It's not as simple as just connecting data sources. Cleaning it, and sanitizing it, and structuring it actually does make a lot of difference to get the final product that you're trying to get. Now, maybe eventually AI will solve some of those issues. But right now, there is still work to be done. In the same way as putting things into a data warehouse: you need to clean things up and structure it in such a way that you can get data out of it effectively.

[0:35:22] Bob Remeika: Exactly.

[0:35:23] Danny Allan: Well, Bob, it's been great to have you on the show. Thank you for sharing your insights on RAG. I have to confess, I didn't know some of these things myself. And best of luck. If people want to reach out and learn more about Ragie, where should they go?

[0:35:35] Bob Remeika: Ragie.ai. Go to our website and sign up.

[0:35:41] Danny Allan: And to contact you, what's the best place? Are you on LinkedIn? What are your socials? How do they contact you?

[0:35:46] Bob Remeika: I'm on LinkedIn as Bob Remeika. Twitter as Bob_Remeika.

[0:35:53] Danny Allan: Awesome. Well, thank you, Bob, for joining our show. And thank you to all our listeners for joining us again for another episode of The Secure Developer. We'll see you next time.

[OUTRO]

[0:36:04] Guy Podjarny: Thanks for tuning in to The Secure Developer, brought to you by Snyk. We hope this episode gave you new insights and strategies to help you champion security in your organization. If you like these conversations, please leave us a review on iTunes, Spotify, or wherever you get your podcasts, and share the episode with fellow security leaders who might benefit from our discussions. We'd love to hear your recommendations for future guests, topics, or any feedback you might have to help us get better. Please contact us by connecting with us on LinkedIn under our Snyk account or by emailing us at thesecuredev@snyk.io. That's it for now. I hope you join us for the next one.
