At the rate at which AI is infiltrating operations around the globe, AI regulation and security is becoming an increasingly pressing topic. As external regulations are put in place, it’s important to ensure that your internal compliance measures are up to scratch and your systems are safe. Joining us today to discuss the security of ML systems and AI applications is Ian Swanson, the Co-Founder and CEO of Protect AI. In this episode, Ian breaks down the five pillars of ML SecOps: supply chain vulnerabilities, model provenance, GRC (governance, risk, and compliance), trusted AI, and adversarial machine learning. We learn the key differences between software development and machine learning development lifecycles, and thus the difference between DevSecOps and ML SecOps. Ian identifies the risks and threats posed to different AI classifications and explains how to level up your GRC practice and why it’s essential to do so! Given the unnatural rate of adoption of AI and the dynamic nature of machine learning, ML SecOps is essential, particularly with the new regulations and third-party auditing that is predicted to grow as an industry. Tune in as we investigate all things ML SecOps and protecting your AI!
Episode 134
Season 8, Episode 134
The Five Pillars Of MLSecOps With Ian Swanson
Ian Swanson
Ian Swanson: “If we take a look about what's happening within the White House, which is a blueprint for AI Bill of Rights, there's four key things that is being discussed right now. Number one, the identification of who trained the algorithm, and who the intended audiences. Number two, the disclosure of the data source. Three, an explanation of how it arrives at its responses. Four, transparent and strong ethical boundaries. We have to have the systems built to govern these because the penalties could be severe.”
[INTRODUCTION]
[00:00:33] ANNOUNCER: Hi. You’re listening to The Secure Developer. It’s part of the DevSecCon community, a platform for developers, operators and security people to share their views and practices on DevSecOps, dev and sec collaboration, cloud security and more. Check out devseccon.com to join the community and find other great resources.
[EPISODE]
[00:00:55] Guy Podjarny: Hello, everyone, welcome back to The Secure Developer. Thanks for tuning back in. Today, we're going to embark on a bit of an AI security journey, which I think will be fun and interesting. To help us kick that off here, we have Ian Swanson, who is the Co-Founder and CEO of Protect AI, a company that deals with AI security, and specifically runs ML SecOps, which is a great information hub for defining what is AI security, or at least what is ML SecOps, and we're going to dive a lot into that. Ian, thanks for coming on to the show.
[00:01:26] Ian Swanson: Thanks, Guy. It's awesome to be here. It's a great podcast, really excited to talk about the security of ML systems and AI applications.
[00:01:33] Guy Podjarny: I guess, just kick us off in context a little bit. Can you tell us a little bit about who you are? Maybe a bit of background? I guess, how did you get into AI security in the first place?
[00:01:43] Ian Swanson: I've been in the machine learning space for 15 years. It's been an area that I've been incredibly passionate in. I've had multiple companies in machine learning. I had an ML centric FinTech company called Symmetrix that I sold to American Express. Then, I started a company called DataScience.com, that back in 2014, was setting the groundwork of what is today known as an MLOps. That company was acquired by Oracle.
Today, I'm the CEO of Protect AI. Protect AI is all about security of ML systems and AI applications and we think the time is now. In terms of why I'm passionate about this space, well, prior to starting Protect AI, I was the worldwide leader of go to market for all of AWS, AI machinery. My team worked with tens of thousands of customers, and I've seen the rapid rise of adoption of AI and machine learning across all key functions within a company. As we think about the evolution of adoption of AI, yes, how do we get these models in production? How does it drive digital transformation? How do we make sure that we de-risk them from ethics, trust, and bias? But what about protecting it commensurate to its value?
If the CEO of JPMorgan Chase, Jamie Dimon and his shareholder letters talking about the rise of adoption of AI with hundreds of applications within that bank, then we better make sure that we protect it commensurate to its value. I think that moment is here today. That's why we started Protect AI and it's another reason why I'm just so passionate about this space is I believe in the potential of AI, but also understand the pitfalls within ML systems and AI applications.
[00:03:21] Guy Podjarny: Yes, absolutely. I totally relate to the urgency around this, especially with almost like unnatural rate of adoption due to both usefulness on one hand, and maybe competitiveness on the other thrown in some hype. A lot of these systems are going in so much faster than past notable technologies, even clouds and containers and other capabilities that relative to their prior trends were adopted quickly. Let's dive in. We’re passionate about it. Before we dig into ML SecOps, you chose to call it ML SecOps, not AI. Do you want to share a quick view on how do you separate AI and ML?
[00:04:00] Ian Swanson: It's kind of funny. A lot of people use it interchangeably, where ML, machine learning is a subset of AI, and you can think of deep learning as a subset also, within the AI category. The reason why we focused on machine learning, at least from a messaging perspective, is a little bit of the core persona that we're working with that is at the centre of operating ML systems. What I mean by that is the role of an ML engineer. So, think about parallels within software development. But on this side, the person who owns the CI/CD, the software development stack, if you will, for ML systems is an ML engineer. We think it's the ML engineers, they own the systems, they also own the responsibility of, yes, getting models into production, but they need to think about how do we protect those models? How do we secure those models? How do we make sure we're working with the data scientists, practitioners, the line of business leaders that we are de risking those models?
We really wanted to pay homage to that persona and also to this space and this skill set within machine learning. Now, you use machine learning to build AI applications, right? So, that's definitely down the stream. But we think it's really critical for us to focus on the machine learning development lifecycle.
[00:05:14] Guy Podjarny: That's useful, I think, both from a practicality perspective and general definitions, because I indeed, oftentimes think about AI is the value proposition. Eventually, produce me some artificial intelligence, please, with machine learning being maybe the primary means of achieving that. With that, let's dig in into this of ML SecOps. That's a new term. I've been part of the DevSecOps. I have my love/hate relationship with the term. Help us a little bit with some definition. What is ML SecOps and then we will start breaking it down.
[00:05:44] Ian Swanson: ML SecOps stands for machine learning security operations, and it's the integration of security practices and considerations into the ML development and deployment process. Now, why is it different than DevSecOps? It goes back to what I was talking about previously, that the ML development lifecycle is different than software development. Software is built on requirements provided during the first phase of SDLC . But machine learning, the model is really built based on specific data set. Software systems likely won't fail once they're deployed, as long as requirements are not changed.
That's not the case with machine learning. The underlying characteristics of the data might change and your models may not be giving the right result, and it's dynamic. Machine learning is again, not just about the code on that side. But it's this intersection of data, the machine learning model building an artefact and the pipeline, that yes, turns into code and a model that gets deployed. But it's dynamic. It's constantly learning. The data is changing. So why, again, a new category is machine learning development lifecycle, it's just different than standard software development lifecycle. That really goes into why is the need for ML SecOps.
[00:07:01] Guy Podjarny: Before we go there a little bit. It sounds like you’re emphasizing not just the different tools of like, look, I need to be able to inspect this or inspect that. But you're describing a different nature of those types of systems, not just almost the predictability or it's just a different form of agility, instead of it might change with every request that flows through the system, or the data come in. You find that to be more important or more substantial than the specific tools or phases that might be introduced into the development lifecycle.
[00:07:31] Ian Swanson: There are some different tools that is used in the ML development lifecycle. There's also different users here. We have data scientists, ML practitioners that might not be well versed in typical or best engineering practices.
If we think about what are the differences, there's four that I can highlight. Management, GRC, development, and audit of building and why this is different than software development. We think about management. Your traditional code concerns might be on change states, whereas in the machine learning world, there are dynamic conditions that can equate to new or different threats.
On governance, risk, and compliance, we have rules, we have terms. In software development, where a lot of GRC for machine learning is on use and impact. That's another kind of risk profile. The development, you might be working in proxy environments as you're building software, but a lot of machine learning practitioners are working on live systems. So, think about that. We're using a tool, for example, let's say a Jupyter Notebook in a live system that's connected to super sensitive data, pip installing open source packages, exploring that data, building these experiments on that side, all before you’re checking in or committing any code, and it might be an experiment that again, is live.
[00:08:47] Guy Podjarny: A constant deployment in production and/or developing in production coming along customer complain about something, you come in, you change their production system. I think everybody has some battle scars from that type of surrounding.
[00:08:57] Ian Swanson: Yes, absolutely. This built into the rise of ML ops as a category. We really saw ML ops rise in the form of enterprise software solutions and adoption between the years 2014 and 2018. Now, when you go and talk to any of the Fortune 500, Fortune 1,000 companies, they understand that term. They have tools, they have solutions there. But they also have many different flavours of ML ops tools. It's not necessarily the wild west, but it is still somewhat shadow IT. It's a black hole where the ML development lifecycle is not well known or thought about as it relates to security and risk. That's where we think that there's an opportunity for further education.
Hence, going back to the question, which is ML SecOps, and what is ML SecOps. We think that that's a new area that people need to be informed about, understand the differences, the risks, but also the opportunities, the more harden in their systems and build better AI applications.
[00:09:55] Guy Podjarny: What was the fourth again?
[00:09:57] Ian Swanson: The fourth is audit. So, as we audit software, it's oftentimes down to like version controls. If you think about machine learning, it's a lot around provenance changes. It's understanding the different elements that are within that machine learning development lifecycle, and understanding provenance change, perhaps on a training dataset on that case, and that is super critical in terms of what is the end artefact, the end artefact being a model that goes into production or an AI application.
On the audit side, there's also just less transparency. There's less systems in place for these audits. A lot of it is held within the headspace, if you will, the ML practitioner that's building these systems. There's a lack of attestation, there's a lack of provenance, and lack of controls here. That's an area that is different as we think about security within the machine learning development lifecycle.
[00:10:49] Guy Podjarny: That last one is maybe one that we're a little bit more familiar with, when it comes to new technology. That fourth one is a bit of a typical type of problem that exists when a new technology gets introduced, which is, practices are not awesome, and they need to be evolved. While the first three are a fascinating view that come back to your definition of ML SecOps which is, it's a system in flux. It's a system that is constantly changing, because it learns from the data and it is guided by data, versus guided by specific instructions, which makes it a lot harder to assess, a lot harder to govern, a lot harder to version, or understand how differences happened in a milestone fashion. You need to adapt your security practices to something more continuous.
It's interesting, because reflecting on DevSecOps and its narrative, to the degree that you're describing now, it's the next level of speed before we had our annual releases or semi-annual releases. From there, we went to continuous to Agile maybe for internal releases, and then to DevOps, and continuous deployments, and security struggles to keep up because of the pace of change. You're almost saying, in ML, you shouldn't even consider these audit rates because you have to think about something that is naturally continuous and fluid. Otherwise, you're probably never going to be at pace.
[00:12:15] Ian Swanson: You're right, that it's always changing, always evolving, super dynamic on that side. Audit is still going to be a massive category within the space, especially with new regulations that are coming out, and perhaps we could talk about that a little bit more later. But some of the last just to frame this picture, again, on the ML development lifecycle, it starts with the data. You explore, you validate the data, you wrangle the data, you create new datasets, training, test, validation datasets. It then turns into the model engineering, the feature engineering, the hyper parameter to how do we evaluate the model? How do we package the model? We serialise it maybe in Pickle, Onyx, other formats. We then serve that model, deploy that model. We have code for integration testing. We monitor we log and, oh, by the way, we start the cycle all over again.
That's what that last point of like, it's dynamic. It's this living, almost artefact on that side, as it takes an inference and inputs. It's creating new data. So, from an audit perspective, we have to understand those changes and we need to be able to, yes, keep a record of that. We have to have true attestation. We have to have true provenance. But we also have to make sure that we live in this dynamic environment. It's really interesting space to be in. As we talked to these ML engineers, it's what they're working with every single day. Hence, why we're focusing on ML and ML SecOps is yes, and artefact, is an AI application. But it's really this stack of the ML development lifecycle that we need to harden.
[00:13:47] Guy Podjarny: Great points, and we're going to dig in. They keep holding you from breaking up ML SecOps for us on it. We're going to get to that in a moment. Maybe one other bit of context, and maybe we'll use that as we go through it, which is that I'm sure everybody listening, they fall into different categories of use of AI. Fine, there’s the ones that are not doing that right now, and maybe they're listening just because they want to learn. But I think, I guess in my mind, I've always thought of, there's the model builders, people that are actually straight up, either they're building the model, or the training the model. They're actually properly generating the model, so one class of organisation. Then you're going to have another class of organisations or applications that are fine tuning. Especially, in the world of GPT, or more general purpose, that seems to be a growing possibility. There's probably a larger portion of organisations that might be doing that, and then ones that are building a new GPT, or a new training alum. Then, there's ones that are just using it.
I'm just introducing some chat interface. Maybe I gave it some access to some of my data. But I'm just embedding it into my applications in my data, but I'm not overly like, maybe it's lightweight fine tuning or anything. Is that a legitimate way to classify uses of AI? Does it change substantially the type of threats that one might need to be worried about?
[00:15:00] Ian Swanson: Yes. To look at that, you have companies that are builders, you have companies that are adopting, you have companies that do both on that side. So, when I was the worldwide leader of AI and machine learning at AWS, we had AI services that were just out of the box, if you will, algorithm services for personalisation or fraud detection. Then, we worked with companies that would adopt our ML ops tools like SageMaker and be building their models.
But within the problem, or let's say, surface area of both those options, it's really a supply chain question. Are you at the beginning of the supply chain, or towards the end of the supply chain? If you're a builder, you're adopting open source technologies, prior research, you're building models through every single stage of the development lifecycle, and you have to understand that complete supply chain. If you're an adopter of AI, and you're working with vendors on that, you need to say, “Okay, how is this being built? How can I trust it? What is the data that it is trained on?” If it's my data that's being trained on, then that's part of my supply chain that I need to think about.
I think you're absolutely right that there are companies that are building a lot of their own IP, but usually on the backs of prior art. Then, there are companies that are adopters and buyers of solutions like ChatGPT, and threats are across both. But I will say, throughout my career in this space, a lot of people have said how many companies are actually building their own models? How many data scientists are there and ML practitioners? How big are these ML teams? I've been at organisations where we have seen that we've had tens of thousands of customers, large customers that have teams of not just dozens of data scientists, ML practitioners, but hundreds, and in some cases, well over a thousand. There's been a lot of ML building and AI adoption, prior to just what we've seen since January 1st.
[00:16:54] Guy Podjarny: Yeah, no doubt. This is just the recent craze. I guess, in my sense, people have switched to believing in AI, thanks to ChatGPT. AI has been around, has been growing, has been significant in our lives for a while and has been picking up momentum. But I think what ChatGPT has done is it's captured people's imagination and for good and bad, definitely a lot of bad on the security lens of it, as viewed a certain pace that today we have to accept this reality and see what we do about it.
I think those are very good points and say, don't discount the ML actual engineering, actual consumptions. If you're more on the consumer side, that you still need to have an ability to assess the thing that you are consuming. Also, I guess my sense from conversations is, everybody's feeding at least some amount of data into it. You're probably not just putting a straight up, ChatGPT pathway in your application. You're probably trying to modify its behaviour in some capacity.
I guess in that context, let's indeed dig in. How do you break apart the ML SecOps world?
[00:17:59] Ian Swanson: First, let's understand here the importance of machine learning, and then, we'll dive into the five key pillars of ML SecOps. An example I like to give is that if you're a financial services company institution, machine learning is used every single time money is moved around that financial services company. Think about that. It sits at the heart of a company. It's one of its most critical functions. Now, we want to break down what we are calling the five pillars of ML SecOps, and it is so important.
Number one, supply chain vulnerabilities. We need to understand a supply chain. What is being used in building these ML models, these AI applications. Number two, the model provenance. Understanding the true bill of materials, if we will, for what we're using in terms of the ingredients, the recipe of the model. Number three, governance, risk, and compliance. Number four, trusted AI. Fairness, ethics, bias of models. And number five, perhaps a little bit more future state, but adversarial machine learning.
[00:19:02] Guy Podjarny: I think those are useful titles, and some sections feel a bit more natural to me. But maybe let's break them down, given the criticality. By the way, maybe just as an analogy, you talk about companies that already revolve around AI. I also find that in many companies, even if they don't revolve around AI, there's a temptation before the chat interfaces, even in that specific world. There's a temptation to create slightly old powerful proxies, or rather enable the chat system to be able to access a lot of a variety of API's.
Even if the company doesn't end up intentionally revolving around AI quite yet, or maybe they don't know it, they end up exposing some critical information, critical actions, kind of under that mantle, or that opportunity of creating a brand-new user engagement model through that chat. Either way, I think we're fairly aligned that it is critical. Let's dive in into each of these five. It feels like they almost go also from a good software anchoring that people would find it easier to understand, the things that are a bit more domain specific. Start with supply chain. How does that manifest in the world of ML?
[00:20:13] Ian Swanson: As we think about the supply chain itself, again, it's not just code. It's not just open source packages that are used within the models. We have to truly understand the supply chain, as it relates to the data, how we build these models, and yes, the code. If we think about this in analysing and creating, if you will, this bill of materials and tying it to Executive Order 14028, that bill of materials in the machine learning world is different than your typical SBOMs.
As we both know, there are multiple organisations that help set standards for SBOM, CycloneDX, SPDX, they've actually come out and said, there needs to be a new version of a BOM. What they're saying is an AI BOM. We call it a machine learning bill of materials, they're calling it an AI BOM.
Now, why is understanding supply chain so critical within this space? Well, it's one of the most important or I say, more risky, if you will, threat surfaces within ML development. Most machine learning and AI is built off the backs of prior art, not necessarily original art. Data scientists, ML practitioners are commonly using open source packages, frameworks. They are diving into foundational models as the basis of what they're building, and also utilizing a lot of academic research papers that are out there. So, they're leveraging a lot of prior art to build their own IP that powers functions within their business. That's part of the supply chain that we need to understand along with the datasets that are being used and how those datasets are being used in every single model and changing over time.
[00:22:00] Guy Podjarny: Yes, interesting. So, it has the regular software components. I guess, IP, to an extent has always been also something that you should understand which IP comes into your system. But it was a lot less algorithmic. It was rare. It was really – typically, the conversation has been a lot more about licence. Am I allowed to use this? Which I guess, also, is true here. But you're saying, I guess for these models you're consuming or such, does that slip a little bit into that model provenance side? Or is it just to find weaknesses in these models so you should keep that bill of materials? Say, well, I've used whatever LLaMA or I've used stable diffusion, and some versioning elements of them to know that there was a weakness there that I have to upgrade, just like you would a software component.
[00:22:44] Ian Swanson: Let me give a couple examples on the supply chain, and then build that bridge into model provenance. Within the supply chain, as I said, people are using prior art, typically. Open source libraries, packages, frameworks, foundational models on that. But they're also sometimes doing it and tools, as an example, that are a little bit more, as I said, Shadow IT, or not in the purview, if you will of typical security organisations.
I'll give an example, a Jupyter. Notebook. Notebooks in general are used probably 90% of ML projects. They're used at the beginning of the ML development lifecycle, and experimenting and exploring data, testing out some simple models on that side. But there are also many times in live systems with access to super sensitive data. As they install these pip install open source packages, they're then at the mercy, if you will, of the IP licences, as you call it. Ours is permissive licences or they're not. But also what vulnerabilities are within these libraries that are being used. Then, beyond even vulnerabilities, and we'll talk about this a little bit later, from a trusted AI, shall I even use this?
We found some super interesting vulnerabilities within the machine learning development lifecycle and common open source is being used. One is MLflow. MLflow is a super popular open source software that is used in many, many development pipelines for machine learning. MLflow is also inclusive of and model registry.
Just recently, there was a CVE that NIST scored 10 out of 10. Now, why did they score it 10 out of 10? It was complete LFI, RFI from a standpoint of I can get a system access, if I want to, one, steal IP. So, steal the models. Two, I can even do code injections. So, I think, going back to that example when I say a bank is leveraging machine learning in every single time money is being moved around that organisation, if I can go in there in a super simple exploit that lives within the supply chain and get access to code injections, alter and change that model, that is a very high-level risk for these systems. That's just one example of the supply chain. Again, the supply chain is built on, oftentimes not original art, but prior art in the ML environment. That's why it's so, so critical to understand and to be able to assess the risk within it.
[00:25:14] Guy Podjarny: Yes, so understand it. That does sound very much, on many ways, or in many ways, it sounds similar to what might compromise your build system or might compromise your applications because you're using it. I guess, you're saying it is augmented or amplified even more when you consider this developing in production type element. Because if you did, maybe there's a shorter time window between the time that someone might have downloaded or used an untrusted component, and the time in which an attacker could exploit which may be in software, you have, I guess, more opportunities to find it downstream, which puts more emphasis on things like early gates and detections, versus saying, “Fine, maybe someone can download it, as long as I can find it when it's being built. Maybe I'm okay taking the risk on development environments and things like that.”
That's not correct. Fundamental importance, don't forget about it as it is in software, you have to have to have to address this. You have to be mindful of the models that are being consumed because they in turn, they have certain weaknesses, probably, and I guess, we’ll dig more into that. But also remember that where you're developing might amplify that risk even more. You better be honest.
[00:26:25] Ian Swanson: Where you're developing? What you're using? What is that IP? What are those packages? What are these foundational models? The bottom line is I need to understand these things.
[00:26:33] Guy Podjarny: What are the primary tools that one might use for a supply chain? Is it the same old supply chain security, right? Is it Snyk and its elk? What's another gap there that's missing, that those tools will not still satisfy?
[00:26:47] Ian Swanson: I think the main gap is visibility into these systems. Really, what visibility is anchored to is a bill of materials. That's the biggest gap, is we need to build the bridge into these systems, into these pipelines, into this process, so that we can do what we do best in software development. But it needs to first start with, as we are talking about the supply chain, I need to have visibility into it.
Oftentimes, when a machine learning model was built or ready for deployment, and it goes to AppSec, or it goes to some team for penetration testing, it's really just the end model itself. The scoring function, maybe the pickle file, like what's going into production? But there's a whole series of steps in the ML development lifecycle prior to that artefact, that we need to have that bill of materials that visibility into so that we can apply tests. Those tests and those rigor might be as you stated, Synk, and others in the world, it's just building the bridge into this area that is not seen, and giving that visibility across these systems so that we can apply best practices within security.
[00:27:55] Guy Podjarny: Understood. I think, for completion here, and I know we're talking substance, and I appreciate this, but I think Protect AI is one of the tools that might help you a little bit on this domain, right?
[00:28:04] Ian Swanson: Yes, we focus on bill of materials. I think that's the foundational item in terms of where you can build, for example, visibility, auditability, and security on top of.
[00:28:14] Guy Podjarny: Thanks for that. We'll leave it to the listeners to go off and learn more at their interest.
Supply chain, super important, fully understood, and it's just a fun fact, I guess, is when LLaMA was open-sourced by Facebook, the weights were not open-sourced, but they were leaked subsequently, which allowed, of course, this other model to be used, and that was a big deal. In a similar vein, maybe like an example that helps the job for some folks is the weights might be what eventually gets into the deployment models and all of that. At that time, those are not that important. But whatever manipulation, whatever attack or whatever compromised third-party components might have happened, that would have been earlier. It's the weights generation, that is the sensitive part here and where a supply chain attack might have been dramatically impactful, maybe even SolarWinds level. I hope I'm getting it correctly.
[00:29:04] Ian Swanson: Yes, yes. That's where a good pivot to the model provenance. The SolarWinds hack of 2020. That served as a stark reminder of the importance of ensuring transparency, accountability, trustworthiness of software supply chains. As we get into model provenance, the second pillar of ML SecOps, it's really about understanding, yes, bill of materials, but attention of these models. It's understanding who built them, how they built them. So again, going back in the supply chain, and connecting the dots of all the systems.
I've seen large enterprises, I won't state the names that there's typically what's called a model card or a document that somebody approves within the line of business as they are putting a model into production. These documents can be a few pages, to dozens and dozens of pages. But it's an artifact that lives in that single moment of time. As I said, the difference within the machine learning development lifecycle is it's incredibly dynamic, and it changes.
As we think about model provenance, it's yes, understand the bill of materials and the supply chain, but we need to do that in a dynamic way that is constantly keeping attestation, versions, snapshots of the model, so that we can replay these. We can audit these models. We can understand the risks, and we can fix them quickly. Model provenance is a super, super important pillar within ML SecOps.
[00:30:30] Guy Podjarny: Can you give us maybe an example of something done right and something done wrong. What problem might occur if you didn't properly cover model provenance, and maybe to counter that, what practice, I guess, would have helped you avoid that problem.
[00:30:47] Ian Swanson: From a healthcare use case perspective, again, given the name of the company. But we have seen examples where there's even been rogue internal actors working with again, super sensitive healthcare data, that have been able to, as they have access to that, as an ML practitioner, siphon that data off as they build these particular models. But because there is no provenance, there is no attestation, there is no understanding, if you will, from a code level perspective of who created these models, what they're doing with the data, how they're creating these models, and it's only being looked at, at the end result, which is the model itself, is the organisation wasn't able to capture this quickly.
They weren't able to play it back, once they understood that they had an internal rogue actor to see, okay, what did they do? What did they build? What were the ingredients that they had access to? It's not always external threats. In this case, this is an internal threat. It goes back to, again, some of the differences of the ML practice live systems, access to super sensitive data, building incredibly powerful models that are used internally, and we need to build to understand who and how they're building these models, and being informed of how we stop particular risks like this. So, that's a GDPR example, but also tied into this new space of model provenance and understanding all this connective tissue here.
[00:32:14] Guy Podjarny: What's an example of a practice that would catch these types of issues, or that’s probably like too broad to get into detail? But maybe in the example, you said, what practice would have helped prevent that?
[00:32:24] Ian Swanson: I think the easiest thing is that we have to have attestation through every single step of a model build, all the way through from the data we're using, to the model that's being deployed, and understanding all the in between parts of how that model has been tuned, and the software that's been used in creating that model. Only then can we actually audit these systems. If we have the ability to audit these systems, we can create rules, and we can create policies on top of it. Policies that can check for irregular behaviour in this case. As we can check for that irregular behaviour because we have true provenance and understanding of the system, we can stop it, or we can quickly react to it.
[00:33:03] Guy Podjarny: It sounds also akin a little bit to the world of observability on it. It's know if your system is diverging from it. On one hand, just trying to translate this into DevOps terminology, on one hand, you're saying it's very hard to achieve immutability that we might aspire to in software, because the system is naturally a bit more fluid. But you want to still aspire to that a little bit, make sure that when changes do happen, it’s properly attested, so that there's a log of that having happened, and then alongside that, something along the lines of observability, to know that you've digressed and that it's not working the way it used to work, or that it was planned, which is very hard, because the models are very complex. So, defining what normal is, is quite complex. But is that about right, in terms of analogies to operating running ops systems?
[00:33:55] Ian Swanson: Yes, absolutely. I would probably just anchor on this point of, as we think about model provenance, we link it to audit. Can we audit these systems? Do we understand these systems? If supply chain is visibility, we need to build to have auditability. And to have auditability, you have to have model provenance. I'm right now stating that the vast majority of enterprises do not have visibility and auditability. That's a key area of risk as we think about threats on this space.
[00:34:28] Guy Podjarny: Well said. That's probably like a pretty good tee up into your third pillar there of GRC. Tell us a bit more about how that is specific ML SecOps.
[00:34:37] Ian Swanson: Governance, risk, and compliance, the third pillar of ML SecOps. As we think about the differences, again, of software development and ML, GRC for machine learning is not just about rules and terms. It's about use and impact. When you look at new policies, new regulations that are coming within this space, whether it's the EU AI Act, or even the most recent discussions at the White House, people are raising their hands and saying, “We have to be able to have policy to understand the risks within the systems.” If you have the first two pillars, where I can have visibility, I can have auditability of the supply chain, I have provenance. Now, I need to build to go in there. I need to be able to govern these, I need to build to understand the risk of these systems, and I need to take a look at it from a compliance lens, especially tied to new regulations that are coming out.
[00:35:32] Guy Podjarny: So, the assessment of risk, I guess, is a new view on what risk is, right? Risk in the deployed system is mostly around the ease of exploit and the implications of whether someone was repelled for some data, or would maybe be able to enact actions. I'm grossly oversimplifying over here. Does it boil down to the same fundamentals in risk when it comes to AI? Are there other facets of GRC that we need to be mindful of?
[00:36:04] Ian Swanson: I think what's interesting about this space is, and it's kind of anchored by the point of new policies that are being created is this use and impact. We have to understand the end user and the decisions that are being made by these AI systems. If we take a look about what's happening within the White House, which is a blueprint for AI Bill of Rights, there's four key things that has been discussed right now.
Number one, the identification of who trained the algorithm, and who the intended audience is. Number two, the disclosure of the data source. Three, an explanation of how it arrives at its responses. And four, transparent and strong ethical boundaries. That's a little bit different than when we talk about your typical software. It's more anchored on use an impact and the understanding of how it reaches the decisions that it does, to make sure that it's not bias that it's ethical.
So, we have to have the systems to build to measure these to build a govern these, de-risk these, because the penalties could be severe. If you think about the penalties within GDPR, we just recently saw what happened with Meta and fines on that. But there's going to be similar penalties. If you look at the EU AI act that's currently in draft that's looking for final submissions in the next couple of months to build, to put it into practice. Some of those things have been talked about is a percentage of your global revenue is the penalty. So, GRC is rising to the top, at least within the AI space as a critical area that companies, as they look at their investments in AI, they're having to build practices to be able to govern these systems and make sure that they are compliant.
[00:37:43] Guy Podjarny: Understood. I guess, from a practitioner perspective, if I'm a CISO and I indeed have my developers, or there's AI being added, whether I liked it or not into my applications, what would be some key steps that I would need to do to get a hold of the GRC practice, to level up my GRC practice to cover those AI uses?
[00:38:04] Ian Swanson: I think there's two key steps, external and internal. External meaning, understand the new regulations that are coming, have clear definitions, create policies, create trainings based on that. But there's a lot still in flux. These are new. I mean, the AI acts in draft and development over the last couple of years. But what we're seeing right now in the AI Bill of Rights, these are new movements, but they're moving fast. They're moving fast on the tailwinds, if you will, of AI adoption.
So, let's understand what's coming from an external perspective on policy. Internal need to get my house in order. I need to get my house in order from understanding of this can no longer be shadow IT. This can no longer be a model that we just look at from an AppSec perspective. We have to truly understand again, the supply chain, who built it, have a system of record, a bill of materials so that we can connect the dots into the policies that are being created. That housecleaning stuff is just now happening within organizations, and it's going to be a massive effort because it's not just on new models. But it's also models that maybe they deployed a decade ago. They're going to have to go back and say, “How was that built? Who built it? How was it making decisions?” And capture all that logic. So, this is a big housecleaning exercise, but is incredibly important, and the only way that you're going to connect the dots to these external influences that are coming from these policies, these regulations.
[00:39:31] Guy Podjarny: Interesting. It sounds very similar to the security posture management, the whatever Star Security posture management type world, but it has another flair added to it. For starters, you have to know what you've got. Where in your organization is AI being used? Who is running it, and then subsequently you need to go and maybe unlike a cloud security posture management, or even like data security posture management, it's not about just what's there. It's about what arrived. Because what's there is not sufficient. You can't just understand its journey from it.
It's true for software as well. But maybe because of the behavioral aspects of AI, which I guess is an appropriate term. You actually need to dig deeper. But probably, I'd imagine that for most organizations, that first thing of just know what the hell is going on, and where is it, it's probably plenty of work. If you haven't done any of it, that's probably a pretty valuable first step.
[00:40:27] Ian Swanson: Yes. I would add too that there are a few resources out there that can provide this, if you will, map or exercise to connect the dots. Gartner has been pushing forward the AI TRiSM that focuses on model governance, trustworthy, fairness, reliability, robustness, the efficacy, and the data protection. There's a framework that even Gartner has created that is helping their customers and industry as a whole frame the problem and understand how to connect the dots.
Another one was an open-source initiative that I really like called AVID. AVID is understanding vulnerabilities within the system, but it has a tax on them that is incredibly valuable. The AVID taxonomy, as we look at it, focuses on security, ethics, and performance. There's boxes within this taxonomy that now organizations can take back to their teams. Their security teams, their ML engineering teams, and figure out how do we check the boxes across this taxonomy?
The third one is MITRE. We all know MITRE ATT&CK. Well, the Mitre team actually said, there needs to be a new version of attack for AI systems. And MITRE has created then something called ATLAS. And Mitre ATLAS is 100%, focused on the ML development lifecycle, on AI applications, and it's taken a look at it in a framework that we, as people within this space, we understand, because it's layered on a similar framework of attack, where we look at reconnaissance, execution, collection, defense, evasion. So, organizations can use that, again, as a blueprint to say, “Okay, how do I get my own house in order, so that I'm ready from a governance perspective for the regulatory for the policy that is coming.”
[00:42:15] Guy Podjarny: Super useful models on it. Just for those seeking the spelling, AVID is A-V-I-D, avidml.org is the website for it. I’m assuming I've got to the right one. Great models and super useful because AI shapes everything it touches, so many aspects of it. I have to think about it adversarially from multiple different paths. Thanks for sharing those.
Let's continue. We're going much deeper into AI land, gone from three topics that are fairly well familiar to most security practitioners. But I think the last two that are getting a little bit more hairy and a bit more AI-ish on it. Let's move forward. What's the next one on the list?
[00:42:53] Ian Swanson: Yes, the next one on the list is trusted AI, and this is all about ethics, bias, trust, it's understanding, are we making the right decisions? Are these models acting in the appropriate behavior that we think that they should be? Are they not being discriminatory as an example?
Now, this can be a very difficult thing to work with. One of my previous companies, DataScience.com, we had an open source project called Skater and there's been a bunch of open source in the space of how do we do explainability of a model. If we can explain how a model makes decisions, then we can also understand if that model is being fair, or if that model is being biased. I think this is an incredibly important pillar of ML SecOps, but it's also one that is a little bit ambiguous and it's tough to figure it out, because it's about the model performance at an individual, sometimes level, to make sure that it's acting appropriately.
[00:43:48] Guy Podjarny: Do you perceive that and talking to organizations, do you see it being placed under the security organization? Is that where its landing terms of is the sec the importance of it? I think everybody would agree is very high. Is it falling under security?
[00:44:04] Ian Swanson: I think we're actually seeing a lot more involvement in legal teams. The legal within organizations, trying to make sure that they are helping from a brand perspective, that we're not doing anything that we shouldn't be doing. Eventually, as these compliance and regulatory items come out, that we're working with our compliance team has to adhere to it.
But as we get into this trusted AI, it's really about, can we be transparent? Are we able to provide clear explanations for the decisions? That's what we're seeing in some of the White House policy that's being drafted as well. I think there are systems in place where the ML teams, the ML leadership teams, they own it. They work with their GRC and their legal teams to make sure that they have checks in place as part of their CI/CD process, that they are explaining how the model is coming to the decisions that it's making.
[00:44:56] Guy Podjarny: It's interesting to note that companies like TikTok, like Meta, companies in which both the AI and the trust in AI has been a topic of conversation for a good while. Many of them did eventually merge the organization. They did put them onto the same mantle. So, it does seem like industry dynamic has moved it into a domain that has enough overlap, and maybe some of it is indeed to thinking a little bit adversarially, thinking a little bit more about the unintended consequences, which is the domain of security.
[00:45:27] Ian Swanson: I agree with that.
[00:45:28] Guy Podjarny: This is such a massive domain, and probably a topic for many podcasts on its own. But if you were to try and tell someone, where do I even start when it comes to figuring out if your AI can be trusted, if you're doing okay over here, do you have a starting point for people to go to?
[00:45:46] Ian Swanson: Yes, the starting point is fairly simple but yet hard to achieve and that is explainability of a model. When you're building your own model, it's a little bit more attainable. You again, understand the true supply chain. You're using your own training data on it. Where it comes a little bit tricky, is when you work with vendors. Vendors that are supplying you the end AI application that you might just be integrating within your own software stack. On that side, I think the most important thing is asking the right questions. It's asking the right questions of how it was trained, how was it built, and then doing some, effectively, unit tests on the model.
We see other solutions out there that are checking models for a category or something called robustness on that. We're giving it a lot of fuzzy data or we’re giving it outliers within this space, and we're just testing the model, similar to how we would test any other software. We're going through a QA process to say, “Is it making the decisions we think it should? Do we understand how this model is built, the data that it's being used? And can we explain these decisions if we need to, to our end users?” It's building in that framework, building in these practices and these processes internal is the right place to start.
[00:46:57] Guy Podjarny: Yes. You have to start by almost learning the questions on it. I do hope that certain tools come along and try to make it easy. I don't think this will ever truly go away. But at least to understand pitfalls, I think there is a category of AI safety tools. I know of one company that world called lakera.ai, they just had this prop injection tool called Gandalf, kind of make the rounds. Drew a lot of attention. They tried to they say – I do hope there's going to be like a category of companies that try to help you just know a bunch of sharp edges, that you can test your model against, at least.
[00:47:29] Ian Swanson: Yes, this category is actually a little bit more well-defined than the security category of ML. What I mean by that is, over the last five, eight years, there's been a lot of open source that's been shipped and developed. H2O as a company has been a leader as well, in terms of producing open source in the realm of explainability. We see companies like Fiddler that have commercial offerings to test the bias, fairness, explainability of models, and a whole slew of monitoring companies in this space too that have that as an anchoring point of their offering. This is definitely a space that you can not only find open source, but commercial offerings that can help a company get started and monitor these things over time.
[00:48:11] Guy Podjarny: Excellent. Under the mantle of explainability or safety, I guess, is probably the keywords to look for.
[00:48:17] Ian Swanson: Yes, that's absolutely correct.
[00:48:19] Guy Podjarny: We're on to the last end, probably lists name on the list, which is adversarial ML. Tell us a bit more about this fifth pillar of ML SecOps.
[00:48:28] Ian Swanson: Yes. Adversarial machine learning is super interesting. There's been thousands of research papers written just in the past few years. One of the most prominent researchers of adversarial machine learning is Nicholas Carlini. He's been documenting research papers on his blog and the growth of those. Why is this space interesting? What is this?
Adversarial machine learning is effectively the practice, if you will, of can I manipulate a model? Can I fool a model? Can I exfiltrate information from a model or perhaps the model itself, and can I poison a model? As I think about these attacks, these attacks are at the point of inference. If a model is publicly exposed, at this point, it's a peer threat vector, a threat surface that attackers would have access to, to deploy these techniques.
[00:49:19] Guy Podjarny: I guess the biggest challenge, there are a lot of challenges with this world. But one of the biggest challenges would be knowing that it even happened, right? Because a lot of times, you don't fully understand, maybe back to some of the expandability of the system, why a decision was made. But it's quite hard to anticipate that if data was poisoned, not just anticipate, but know that that has occurred. Is that a domain where there are nascent tools? Or does that come back to the provenance?
[00:49:46] Ian Swanson: It is a domain, but I think it first starts with prevention, then detection. Let's talk about the prevention side. I'm going to give an example as it relates to extracting IP, extracting data, or in this case, extracting the model itself. How that particular adversarial machine learning attack will work in theory, is if I have public access to the model, I can hit that model at the point of inference many times, thousands of times, hundreds of thousands of times, throw random features at it, to try to recreate the model to a lookalike model.
Well, I'm talking about prevention. How do I prevent an attack like that? Pretty simple. I could put some throttling on the API in that case. So, adversarial machine learning, I think, is a real category within the security of machine learning systems and AI applications. However, it's not always practical. It's not always generalizable, or scalable. Simple defenses that we put within common practices, like AI endpoints could stop many of these attacks. From a place-to-start perspective, it's making sure that we are doing the basics, if you will, and that we're not skipping any steps on the ML model, not only building, but also in this case, the deployment.
[00:51:09] Guy Podjarny: Great practices there. Would you put like, we managed to talk about AI security for an hour, and I'll say prompt injection when I mentioned Gandalf. Would you put prompt injection into this category?
[00:51:22] Ian Swanson: I absolutely would, and that's where we've actually taken adversarial machine learning and leap forward in terms of its practicality by almost a decade, if you will. So, while there have been thousands and thousands of research paper over the last few years, those research papers are mostly concentrated on white room access, complete access to the model, understanding a model to prove a point that it could be tricked, fooled, stolen, et cetera.
But in the case of large language models, and generative AI, we now have an AI application that's exposed to the user. That starts to really get into the pitfalls of generative AI, and those pitfalls of generative AI can have common links within adversarial machine learning, especially around prompt injection attacks. As we look at prompt injection attacks, it's, “Hey, can I create malicious content? Can I bypass filters?” There's data privacy issues within it. Absolutely, I think, as we go through large language models use cases like ChatGPT and others, we're going to see more of the rise of adversarial risks here, tied to this category of adversarial machine learning at the point of inference.
[00:52:35] Guy Podjarny: Right. We're kind of towards the tail end of the episode, so I won't go too much in-depth. But for those who don't know it, no prompt injection is generally indeed, through now – at least for LLMs and for natural language is, it's the fact that a lot of times the defenses are actually written as these steps of sub-language instructions themselves, or that layered on top, so you can almost like socially engineer the AI defenses or the LLM defenses, so that it would take a different stance, and therefore disclose information and share secret information in an encoded fashion, overcome whatever privacy or ethics or other constraints that have been put on it, and it's a fast world. It might be worthy of an episode of its own, as we talk because it is growing and blurry thanks to the growing popularity of LLMs, in general.
Adversarial ML is really the new domain. It's less theoretical for things that are exposed to users and for many AI systems. We've talked about medical and all that. It takes quite an effort for someone to come in and manipulate the data. It's enough scale. But when you're talking about public facing applications, when you're talking about browsing data, or getting pulled into or LLM interactions, when you're talking about anything in which there's more readily available, I guess we've seen that in all manipulations, for instance of Google's autocomplete, and things like that, where people would manipulate the data to try and make the model produce a certain output. I guess that would be the red team domain. This is probably where a lot of Red Teams are living at the moment.
[00:54:08] Ian Swanson: Yes. We're seeing a rise of Red Team Tools within the space. Red Team Tools for adversarial machine learning, so there's a lot of great open-source packages to learn more there. For example, Microsoft has one called Counterfeit. IBM has one called Adversarial Robustness Toolbox. In short, it's called ART. That's on the adversarial side. Then, even on the supply chain side, I brought up an example of the MLflow LFI, RFI exploit. There's open source tools to be able to scan for that. See if you're susceptible to exploit. One of them is called Snaike. It's spelled S-N-A-I-K-E, a little play there, if you will, of AI. There's a rise, if you will, of AI ML Red Team, all within large enterprises, and we're seeing more and more of those functions come to fruition, if you will, and being built just even in the last six months.
[00:55:00] Guy Podjarny: This is one of the longer episodes already, and I think we're barely scratching the surface of this brave new world, which clearly would be on a lot of our minds. Maybe as we conclude here, I'd like to ask for a couple of future predictions. One a bit more concrete than one, maybe casting your eyes further out.
The first one is just maybe a little bit on the state of audit and where you see that happening. You've referred to audits a variety of times, what are your expectations in terms of what will happen with auditing use of AI? How quickly should people be prepared about it? What do you think the auditors would focus on early on? What's your sense there? Then, we'll go to a more future-looking question.
[00:55:38] Ian Swanson: As we're seeing these policies come to fruition in the EU here, in the States, it's going to be a forcing function around audit. What we're seeing on the practice of audit is some of the big auditing firms, consulting firms are starting their early days of building a practice to audit AI decisions and systems, just like they're auditing financial data and financial outcomes. So, this is going to be a forcing function for these pillars, if you will, of ML SecOps, to build to understand from a place of how these models were built, how and why they're making the decisions that they're making, that true attestation and understanding of the supply chain in the form of a bill of materials. All those things are going to be super critical to build to add here and comply with the regulations that are coming in, and the third-party auditing that is going to start to grow as an industry. We're already seeing a lot of small shops and the big ones are coming in and it's 100% going to be a thing, as it relates to regulated industries. That's something to watch out for, just in the next, I would call it, 12 to even 18 months, we're going to be seeing announcements in that space.
[00:56:48] Guy Podjarny: That sounds very likely given the level of interest in the space. I guess that comes back, for you to be prepared for that, comes back to your recommendations around GRC, and getting your act together and knowing what's where. I think that would put you in a much better position to answer audits and pass them. I guess one last question before we let you move on here, where do you think all of this is headed? If you roll forward the next three to five years of our dealing with AI security? Is this a mess that we're going to be in for a decade where we're going to be chasing our tail and not really succeeding? It’s this an insurmountable challenge? Do you think it's more of a year of scrambling, and then maybe we're in a slightly more stable or as stable as we are within a new tech-type solution? How do you see things play out over the next few years?
[00:57:35] Ian Swanson: It's going to be a year of learning and understanding. Here's what I mean by that. Learning about what's different in these ML systems, in these AI applications. Understanding and how do we connect the dots to systems and processes and practices that we've already adopted within a large enterprise. There's a massive solution ecosystem around security, and I'm not saying that we absolutely have to create new things in that. But we need to connect the dots in terms of what might be standards within security that we see.
OWASP standards, as we look at access control, secure design, configuration, there are ML equivalents. We need to understand the differences, we need to build those bridges, and we need to just put the best practices in place and policies that we understand in typical software development, but perhaps are not applied or slightly unique or nuanced in the machine learning development lifecycle.
Now, I will make a prediction here and this might be unfortunate, I do think there's going to be a seminal moment within AI here shortly. There will be a Log4j, SolarWinds moment in this space, and the risk is high. If again, machine learning and AI applications sit at the heart of a company and is used throughout every single aspect of the organization, that is a massive opportunity and a massive threat vector that we need to take serious, and we need to protect it commensurate to its value.
[00:59:06] Guy Podjarny: Unfortunately, I agree there's going to be some watershed moment that lights a few more light bulbs, for the ones that haven't done it. Hopefully, the ones listening over here will be prepared and it will be well organized.
Ian, thanks again for coming on to the show and helping all of us organize what the ML SecOps domain is all about. I think everybody's got a bit of homework to do. So, thanks for sharing the views here.
[00:59:28] Ian Swanson: Thank you very much.
[00:59:29] Guy Podjarny: Thanks everybody, for tuning in. I hope you found this interesting and that you'll join us for the next one.
[OUTRO]
[00:59:38] ANNOUNCER: Thanks for listening to The Secure Developer. That's all we have time for today. To find additional episodes and full transcriptions, visit thesecuredeveloper.com. If you'd like to be a guest on the show, or get involved in the community, find us on Twitter at @DevSecCon. Don't forget to leave us a review on iTunes if you enjoyed today's episode.
Bye for now.
[END]