AI Glossary

Comprehensive list of AI terminology for developers and security teams

著者

0 分で読めます

As artificial intelligence (AI) becomes deeply embedded across the software development lifecycle (SDLC), understanding the AI terminology and its concepts is mandatory for anyone working in cybersecurity, technology, or software development.

This glossary breaks down the essential terms and technologies at the intersection of AI and cybersecurity, from foundational concepts and development tools to emerging threats and risk mitigation practices. Whether you're a security leader, developer, or simply curious about the impact of AI, you can use this AI glossary as your quick-reference guide to navigating the complexities and nuances of AI.

Foundational AI concepts

As AI continues to evolve, so does the variety of tools built on it, from specialized applications to multifunctional platforms. Despite their differences, most AI tools rely on a shared foundation of core methodologies, including machine learning (ML), natural language processing (NLP), neural networks, and computer vision. When combined, these techniques allow AI systems to solve increasingly complex tasks with advanced capabilities.

Artificial intelligence (AI) - Computational systems that tackle complex, uncertain problems using statistical modeling, pattern recognition, and adaptive learning. Unlike traditional software, AI doesn’t follow fixed algorithms; it learns and adapts, enabling it to perform tasks that typically require human intelligence, like learning, problem-solving, and decision-making.

Agentic AI - AI systems made up of individual AI agents that can operate autonomously, making multi-step decisions to achieve goals with minimal human supervision. Agentic AI operates proactively using a dynamic process of perception, reasoning, action, and learning, continuously improving through feedback loops. In many security use cases, humans typically serve in a supervisory “human-in-the-loop” role to oversee actions taken by AI agents.

AI agent - A system, typically built on a large language model (LLM), that autonomously performs user-defined tasks by designing its own workflows, using external tools to gather and analyse data, make decisions, take action, and iteratively improve through continuous feedback loops.

AI inference vs. AI training vs. RAG - AI training builds the foundation of AI models by teaching them patterns from large datasets. AI inference is the process of generating outputs based on the user prompts used during the AI training. Retrieval-Augmented Generation (RAG) increases the accuracy of inferred outputs with relevant external data.

AI model - A mathematical model of algorithms and parameters, based on specific AI methodologies like machine learning, that analyzes data, recognizes patterns, and then uses this information to perform a specific task. Examples of AI models include GPT-4, Gemini, and Palm.

Chatbot - AI-powered software that simulates conversations with users. It uses natural language processing to 'understand' text or speech inputs and generative AI to respond to the user input prompts with relevant outputs.

Data mining - The process of extracting patterns, insights, trends, and other useful information from large datasets with statistics, machine learning, and database system techniques. Data mining is also known as knowledge discovery in data (KDD).

Deep learning - A subset of machine learning that uses multi-layered neural networks to process unstructured data through hierarchical representations. It automatically extracts features at different levels of abstraction to progressively refine outputs with increasing accuracy.

ETL (extract, transform, load) - A fundamental component of data pipelines, ETL is a data integration process used to pull data from diverse sources, clean and transform the data into a usable format, and then make it available through a data repository. This helps ensure AI models have access to high-quality data.

Fine-tuning - The process of adapting a pre-trained model to perform better on specific tasks or domains by providing specialized data and adjusting parameters. Fine-tuning requires significantly less time and computational resources than retraining or training from scratch.

Generative AI - AI that creates new content, such as images, audio, and text, based on learned patterns from training data. Rather than analyzing existing data, Generative AI creates new outputs that fit the original data context.

“Human-in-the-loop” - This term refers to human involvement in machine learning processes and AI workflows. Human experts provide insights, expertise, and judgment to refine and validate AI outputs, enhancing the accuracy, responsibility, and adaptability of AI systems.

Hybrid AI - An approach that combines multiple AI approaches to overcome the limitations of individual techniques. For example, Snyk Code uses hybrid AI by generating code fix suggestions with generative AI and then checking the security level of these suggestions with symbolic AI.

Large Language Model (LLM) - A subset of machine learning AI that is a type of artificial neural network pre-trained on massive, text-based datasets, allowing it to perform different NLP tasks, such as recognizing, translating, predicting, or generating human-like text or other content, and to ‘understand’ language to some extent.

Machine learning (ML) - A field of AI that uses statistical modeling and pattern recognition to learn from data without explicit programming. Machine learning powers the content-generation capabilities of generative AI and the problem-solving and decision-making abilities of agentic AI and AI agents.

Natural language processing (NLP) - A subset of AI that combines rule-based human language modeling with statistical analysis, machine learning, and deep learning models to allow computers to “understand” and create human-like language.

Neural network - A computational model that mimics the functionality of the human brain. Neural networks can learn patterns and relationships in data.

Non-deterministic behavior - Refers to systems, such as LLMs, that can respond unpredictably to identical initial conditions and inputs. The lack of “normal” or predictable responses and outputs makes it challenging to apply security controls based on application behavior, forcing AppSec teams to rely on other attributes as well as tools that can adapt to LLM's unique characteristics.

Prompt engineering - Text inputs or queries for guiding generative AI models towards a desired outcome. Prompts prime the model and give it context that is related to the desired response.

Retrieval-Augmented Generation (RAG) vs Cache-Augmented Generation (CAG) - RAG and CAG are two different approaches to integrating external knowledge into LLMs. RAG is a method that improves outputs from LLMs and generative AI applications with relevant external data outside of their training databases. CAG is a method that enhances LLMs by preloading relevant data into their context or memory so the AI model can answer quickly from its cache.

Training data - A set of examples used to train machine learning algorithms, helping them to learn patterns and make predictions or decisions.

AI trust, governance, and ethics

Trust, good governance, and ethical conduct are crucial for responsible AI development. Ensuring that AI systems operate transparently, fairly, and in alignment with human values serves not only to protect users and stakeholders but to build the public confidence necessary for widespread AI adoption.

AI ethics - Ethical principles for developing and using AI systems while minimizing potential harms. AI ethics aims to minimize issues of algorithmic bias and fairness, privacy, accountability, transparency, explainability, environmental impacts, and more.

AI trust - Refers to the confidence to adopt and integrate AI systems into development workflows. It stems from a clear understanding of AI risks, transparent governance, operational visibility, and appropriate security controls.

Bias and fairness - Bias in AI occurs when models reflect and reinforce unfair patterns found in training data. Fairness involves identifying and mitigating these biases to ensure AI-driven outcomes are equitable and non-discriminatory.

Explainability - The ability to explain how and why an AI model makes certain predictions or decisions. The explainability of an AI model is essential for building trust and transparency.

GRC (Governance, Risk, and Compliance) - Any organizational framework and policies that guide the management of IT operations in accordance with business goals, industry standards, and regulatory guidelines. Effective GRC ensures that AI development remains secure, compliant, and strategically aligned.

Development terms and use cases

AI has many uses throughout the software development lifecycle (SDLC) and IT in general, from writing code to automating repetitive tasks. When implemented responsibly, AI can significantly speed up development while optimizing output quality, creating a virtuous cycle of continuous improvement.

AI-assisted development - The use of generative AI in software development for writing, testing, reviewing, and documenting code. AI-assisted development can automate and accelerate many tasks, but human oversight is critical for maintaining code quality and security.

AI Bill of Materials (AIBOM) - A comprehensive inventory of all components of an AI system, including data sources, models, software dependencies, and licenses, that serves as a central source of truth for transparency and security assessments.

AI code generation - The use of AI-powered tools to automatically write, complete, or improve code based on natural language prompts or partial code inputs. AI code generators can help increase developer productivity and democratize coding, but AI-generated code must be reviewed carefully for security breaches and inaccuracies.

Command-line Interface (CLI) - A text-based interface where users interact with operating systems and software by typing commands into a terminal, with a shell program that interprets these commands for execution.

Infrastructure as Code (IaC) - The process of provisioning and managing resources through code rather than manual configuration. It enables automation and version control similar to software development.

Integrated Development Environment (IDE) - A unified program that combines multiple software development tools into a single interface to streamline development processes.

Internal Developer Portal (IDP) – A centralized web-based platform that provides self-serve access to tools, resources, and documentation for software development.

LangChain - An open source framework that simplifies building applications and AI agents on LLMs by providing standardized interfaces to connect models with external data and APIs without significant code changes.

Large Language Model Operations (LLMOps) - A set of tools, principles, and best practices for managing the development lifecycle for LLM-powered applications, from data preparation and model fine-tuning to monitoring and governance.

Model Context Prompt (MCP) - An open standard introduced by Anthropic in late 2024 that standardizes how AI models connect with external data sources and tools, enabling greater interoperability between LLMs and external systems.

Machine Learning Operations (MLOps) - The practice of applying DevOps principles to streamline the machine learning lifecycle (including for GenAI) from deployment to maintenance. It bridges the gap between ML development and production operations, ensuring that models are robust, scalable, and aligned with business goals.

Vibe coding - A new approach to software development where developers create code through AI-powered platforms simply by describing desired outcomes in natural language, without requiring manual coding or prior programming knowledge.

AI application vulnerabilities and threats

Just like any software application, AI applications and models are vulnerable to a wide range of cyberattacks. However, unlike traditional software, AI vulnerabilities can arise from issues with models and learning processes, in addition to data and misconfigurations. These vulnerabilities can be exploited to trick AI models into violating their ethical guardrails, leading to significant impacts on organizations, software supply chains, and developer ecosystems.

AI jailbreak - Manipulating an LLM to bypass its ethical guardrails, either through prompt engineering (classic jailbreak) or by feeding harmful content through external data sources (indirect prompt injection).

Data exfiltration - The unauthorized transfer of data from an organization’s secure environment to an external location controlled by a malicious actor. Also known as data theft or data extrusion.

Data poisoning - A cyberattack strategy that undermines AI model performance by deliberately corrupting datasets through injecting malicious data, mislabeling, or other means.

Hallucinations - The generation of coherent but nonsensical or inaccurate information that is created from patterns or objects that the LLM perceives in the model’s training data. It could be thought of as the AI "imagining" information, creating content with no factual basis or grounding in the learning it received. Hallucinations are a security concern, making the AI model more unpredictable and unreliable.

Model theft - AI models, especially those built from scratch or fine-tuned based on company data, become intellectual property and doorways into proprietary data. Model theft is the stealing of models through techniques such as extracting weights and parameters (used to adjust how the model learns) or query overloading (used to extract patterns and outputs) to gain insights and potentially reverse-engineer the model.

Package hallucination - Hallucinations that refer to a library, module, or other type of package that doesn’t exist. In a package hallucination attack, malware bearing a hallucinated filename is uploaded to a public repository and then downloaded by unsuspecting developers.

Prompt injection - A cyberattack against LLMs that disguises malicious inputs as legitimate prompts with the intention of influencing or directing the AI system to misbehave, leak sensitive data, or deliver misinformation.

AI in cybersecurity

AI represents a double-edged sword in cybersecurity, simultaneously strengthening defensive postures while arming adversaries with more sophisticated and robust attack capabilities. The AI arms race between security teams and threat actors calls for constant innovation and vigilance to uphold cyber defenses for AI models and applications against continuously evolving cyberattacks.

AI-powered attacks - Refers to cyberattacks that take advantage of AI capabilities such as data gathering, automation, hyper-personalized targeting, and content generation to amplify social engineering techniques and ransomware attacks, among other tactics.

AI red teaming - Cybersecurity attack simulations led by an adversarial “red team” of ethical hackers to test AI systems for vulnerabilities and weaknesses that may produce misleading, biased, or harmful outputs.

AI Security Posture Management (AI SPM) - An emerging practice that provides visibility into all AI systems to secure training data and pipelines, monitor and triage risks, and ensure security integration throughout the SDLC.

Dynamic Application Security Testing (DAST) - A security testing method that identifies vulnerabilities in running applications by simulating real-world external attacks without access to source code.

Data Security Posture Management (DSPM) - The practice of maintaining a comprehensive, unified view of all data assets in use across an organization to assess and mitigate risks.

Static Application Security Testing (SAST) - A security testing method that examines an application’s source code at rest to identify vulnerabilities early in the SDLC.

Software Composition Analysis (SCA) - A security assessment methodology that inspects applications for open source and third-party components, comparing them against known vulnerability databases and creating a software bill of materials (SBOM).

AI によって自動で生成されたコードの保護を始める

無料の Snyk アカウントを作成して、今すぐ AI によって自動で生成されたコードの保護を始めましょう。また、専門家によるデモを予約して、Snyk が開発者セキュリティのユースケースにどのように適用されるのかをご覧ください。

無料で始める資料請求

In summary

As with any technology, AI can be helpful or harmful — depending on how the technology is used. To get the most out of AI, learn how each tool ingests data, processes it, and creates outputs. By understanding what each type of AI can and cannot do, organizations can make the best and most secure decisions for their tech stacks and move forward with confidence.

開発者セキュリティプラットフォーム

試してみませんか？