AI Glossary
The definitive dictionary for AI, Machine Learning, and Governance terminology. From Flash Attention to RAG — look up any term.
A
A/B Testing
A controlled experiment comparing two versions (A and B) of a system, feature, or model to determine which performs better. Users are randomly assigned to each version and outcomes are measured.
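One way to compare the measured outcomes is a two-proportion z-test on conversion counts. The sketch below is illustrative only: the choice of test, the function name, and the sample numbers are assumptions, not part of the definition.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)            # rate under "no difference"
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))          # two-sided normal tail
    return z, p_value

# Hypothetical results: 100/1000 conversions for A, 150/1000 for B.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the observed difference is unlikely to come from random assignment alone.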
Accountability
The principle that there must be clear responsibility and liability for AI system decisions and their outcomes. Someone must be answerable when AI causes harm.
Accuracy
The percentage of correct predictions out of all predictions made by a model. While intuitive, accuracy can be misleading for imbalanced datasets.
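A minimal sketch of why accuracy can mislead on imbalanced data. The labels and the always-negative "model" are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical imbalanced dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A trivial "model" that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.95, yet no positive case is ever detected
```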
Activation Function
A mathematical function applied to the output of each neuron in a neural network that introduces non-linearity. Without activation functions, a neural network would just be a series of linear transformations.
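The collapse into a single linear transformation can be seen in a tiny one-neuron-per-layer sketch. The weights below are arbitrary values chosen for illustration, and ReLU stands in for any non-linear activation.

```python
def relu(x):
    """Rectified linear unit: passes positives, zeroes out negatives."""
    return max(0.0, x)

def two_layer(x, activation=None):
    """Two stacked layers with fixed, made-up weights."""
    w1, b1, w2, b2 = 2.0, -1.0, 3.0, 0.5
    h = w1 * x + b1
    if activation is not None:
        h = activation(h)
    return w2 * h + b2

def collapsed(x):
    """The single linear map equal to two_layer without an activation:
    3 * (2x - 1) + 0.5 = 6x - 2.5."""
    return 6.0 * x - 2.5

for x in (-1.0, 0.0, 2.0):
    print(x, two_layer(x), collapsed(x), two_layer(x, relu))
```

Without the activation the two layers match `collapsed` everywhere; with ReLU the output differs for inputs where the hidden value is negative.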
Active Learning
A training strategy where the model identifies the most informative unlabeled examples and requests human labels only for those. This minimizes labeling effort by focusing on the examples that matter most.
Adam Optimizer
An adaptive optimization algorithm that combines momentum and adaptive learning rates for each parameter. Adam maintains running averages of both gradients and squared gradients.
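The two running averages can be sketched for a single scalar parameter as follows. The hyperparameter defaults are commonly used values, and the toy objective f(x) = x^2 is an assumption for illustration.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running average of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running average of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective: f(x) = x^2, whose gradient is 2x.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
print(f"x after 2000 steps: {x:.6f}")
```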
Adversarial Attack
An input deliberately crafted to fool an AI model into making incorrect predictions. Adversarial examples often look normal to humans but cause models to fail spectacularly.
Adversarial Training
A defense technique where adversarial examples are included in the training data to make the model more robust against attacks. The model learns to handle both normal and adversarial inputs.
Agent Memory
Systems that give AI agents persistent storage for facts, preferences, and conversation history across sessions. Memory enables agents to build cumulative knowledge over time.
Agentic AI
AI systems designed to operate with high autonomy — planning, executing, and adapting without constant human oversight. Agentic AI emphasizes independent action-taking to accomplish user goals.
Agentic Memory Systems
Architectures for managing different types of memory in AI agents — working memory for current tasks, episodic memory for past interactions, and semantic memory for accumulated knowledge.
Agentic RAG
An advanced RAG pattern where an AI agent dynamically decides what to retrieve, how to refine queries, and when to search again based on the quality of initial results.
Agentic Workflow
A multi-step process where an AI agent autonomously plans, executes, evaluates, and iterates on tasks, making decisions at each step rather than following a fixed pipeline.
AI Agent
An AI system that can autonomously plan, reason, and take actions to accomplish goals. Unlike simple chatbots, agents can use tools, make decisions, execute multi-step workflows, and adapt their approach based on results.
AI Alignment Tax
The performance cost of making AI models safer and more aligned with human values. Safety training sometimes reduces raw capability on certain tasks.
AI Chip
A semiconductor designed specifically for artificial intelligence workloads, optimized for the mathematical operations (matrix multiplication, convolution) that neural networks require.
AI Coding Assistant
An AI tool that helps developers write, debug, review, and refactor code through natural language interaction and code completion. Modern coding assistants use LLMs fine-tuned on code.
AI Democratization
Making AI technology accessible to a broader range of people and organizations, regardless of technical expertise or resources. Includes open-source models, no-code tools, and affordable APIs.
AI Ethics
The study of moral principles and values that should guide the development and deployment of AI systems. It addresses questions of fairness, accountability, transparency, privacy, and the societal impact of AI.
AI Governance
The frameworks, policies, processes, and organizational structures that guide the responsible development, deployment, and monitoring of AI systems within organizations and across society.
AI Literacy
The ability to understand, evaluate, and effectively use AI systems. AI literacy includes knowing what AI can and cannot do, how it works at a conceptual level, and how to critically assess AI outputs.
AI Maturity Model
A framework that describes the stages of an organization's AI capability, from initial experimentation through scaled deployment to AI-driven transformation.
AI Memory
Systems that give AI models the ability to retain and recall information across conversations or sessions. Memory enables persistent context, user preferences, and accumulated knowledge.
AI Orchestration Layer
The middleware that coordinates AI model calls, tool execution, memory management, and error handling in complex AI applications. It manages the flow between components.
AI Product Management
The discipline of managing AI-powered products, which requires understanding both traditional product management and the unique characteristics of AI systems (uncertainty, data dependency, continuous learning).
AI Regulation
Government rules and legislation governing the development, deployment, and use of artificial intelligence. AI regulation is rapidly evolving worldwide.
AI Risk Management
The systematic process of identifying, assessing, mitigating, and monitoring risks associated with AI systems. NIST's AI Risk Management Framework provides a comprehensive approach.
AI Safety
The research field focused on ensuring AI systems operate reliably, predictably, and without causing unintended harm. It spans from technical robustness to long-term existential risk concerns.
AI Supply Chain
The end-to-end ecosystem of components needed to build and deploy AI, from chip manufacturing and cloud infrastructure through data, models, tools, and applications.
AI Transformation
The comprehensive organizational change process of integrating AI across business functions, processes, and strategy. It goes beyond individual AI projects to fundamentally rethink how work gets done.
Alignment
The challenge of ensuring AI systems behave in ways that match human values, intentions, and expectations. Alignment aims to make AI helpful, honest, and harmless.
Annotation
The process of adding labels, tags, or metadata to raw data to make it suitable for supervised machine learning. Annotation can involve labeling images, transcribing audio, or tagging text.
Anomaly Detection
Techniques for identifying data points, events, or observations that deviate significantly from expected patterns. Anomalies can indicate fraud, equipment failure, security breaches, or other important events.
Anthropic
An AI safety company founded by former OpenAI researchers, focused on building safe and beneficial AI. Anthropic developed Claude and pioneered Constitutional AI.
API
Application Programming Interface — a set of rules and protocols that allow different software applications to communicate with each other. In AI, APIs let developers integrate AI capabilities into their applications.
Approximate Nearest Neighbor
An algorithm that finds vectors approximately closest to a query vector, trading perfect accuracy for dramatic speed improvements. ANN makes vector search practical at scale.
Artificial General Intelligence
A hypothetical AI system with human-level cognitive abilities across all domains — able to reason, learn, plan, and understand any intellectual task that a human can. AGI does not yet exist.
Artificial Intelligence
The broad field of computer science focused on creating systems capable of performing tasks that typically require human intelligence. This includes learning, reasoning, problem-solving, perception, and language understanding.
Artificial Superintelligence
A theoretical AI system that vastly surpasses human intelligence across all domains including creativity, problem-solving, and social intelligence. ASI remains purely hypothetical.
ASIC
Application-Specific Integrated Circuit — a chip designed for a single specific purpose. In AI, ASICs like Google's TPUs are designed exclusively for neural network operations.
Attention Head
A single attention computation within multi-head attention. Each head independently computes attention scores, allowing different heads to specialize in different types of relationships.
Attention Map
A visualization showing which parts of the input an AI model focuses on when making predictions. Attention maps reveal the model's internal focus patterns.
Attention Mechanism
A component in neural networks that allows the model to focus on the most relevant parts of the input when producing each part of the output. It assigns different weights to different input elements based on their relevance.
Attention Score
The numerical value representing how much one token should focus on another token in the attention mechanism. Higher scores mean stronger relationships between tokens.
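As a rough sketch, scaled dot-product attention for a single query computes one score per key, turns the scores into weights with softmax, and mixes the values accordingly. The vectors below are toy values, and real implementations operate on batched matrices rather than Python lists.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Return (scores, weights, output) for one query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    output = [sum(w * val[i] for w, val in zip(weights, values))
              for i in range(len(values[0]))]
    return scores, weights, output

# Toy example: the query matches the first key more closely than the second.
scores, weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(scores, weights, output)
```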
Attention Sink
A phenomenon in transformers where the first few tokens in a sequence receive disproportionately high attention scores regardless of their content, acting as 'sinks' for excess attention.
Attention Window
The range of tokens that an attention mechanism can attend to in a single computation. Different attention patterns (local, global, sliding) use different window sizes.
Audit
A systematic examination of an AI system's data, algorithms, processes, and outcomes to verify compliance, fairness, accuracy, and adherence to stated principles.
Autoencoder
A neural network that learns to compress data into a lower-dimensional representation (encoding) and then reconstruct it back (decoding). It learns what features are most important for faithful reconstruction.
AutoML
Automated Machine Learning — tools and techniques that automate the end-to-end process of applying machine learning, including feature engineering, model selection, and hyperparameter tuning.
Autonomous Agent Framework
A software framework providing the infrastructure for building AI agents including planning, memory, tool integration, error handling, and multi-agent coordination.
Autonomous AI
AI systems capable of making decisions and taking actions independently without continuous human guidance. Autonomous AI can plan, execute, and adapt to changing circumstances on its own.
Autonomous Vehicle
A vehicle that can navigate and operate without human input using AI systems for perception (cameras, lidar), decision-making, and control. Self-driving technology uses computer vision, sensor fusion, and planning.
B
Backdoor Attack
A type of data poisoning where a model is trained to behave maliciously when a specific trigger pattern is present in the input, while behaving normally otherwise.
Backpropagation
The primary algorithm used to train neural networks. It computes how much each weight contributed to the error by propagating gradients backward from the output layer through the network; an optimizer then uses those gradients to adjust the weights and reduce future errors.
Batch Normalization
A technique that normalizes the inputs to each layer in a neural network to zero mean and unit variance across the current mini-batch, then applies a learnable scale and shift. This stabilizes and accelerates the training process.
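A minimal sketch of the normalization for one feature over one mini-batch. Here `gamma` and `beta` stand for the scale and shift that a real layer would learn; they are fixed for illustration, and inference-time running statistics are left out.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch of scalars to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    # eps guards against division by zero when the batch has no variance.
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
print(normalized)
```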
Batch Size
The number of training examples processed together before the model updates its parameters. Batch size affects training speed, memory usage, and how smoothly the model learns.
Bayesian Optimization
A sequential optimization strategy for finding the best hyperparameters by building a probabilistic model of the objective function and using it to select the most promising configurations to evaluate.
Beam Search
A search algorithm used in text generation that explores multiple possible output sequences simultaneously, keeping the top-scoring candidates at each step. It often finds higher-scoring outputs than greedy decoding, at the cost of more computation.
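The sketch below runs beam search over a hypothetical three-state "model" whose next-token probabilities are hard-coded in `NEXT`. The table is invented so that greedy decoding (beam width 1) commits to a first token that leads to a lower-probability sequence.

```python
import math

# Hypothetical toy model: next-token probabilities given the previous token.
NEXT = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
}

def beam_search(start, steps, beam_width):
    """Return the final beams as (tokens, log_probability), best first."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for tok, p in NEXT[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

greedy = beam_search("<s>", steps=2, beam_width=1)[0]
wider = beam_search("<s>", steps=2, beam_width=2)[0]
print(greedy[0], math.exp(greedy[1]))
print(wider[0], math.exp(wider[1]))
```

With width 1 the search follows "a" (probability 0.6) and ends with sequence probability 0.3; with width 2 it keeps "b" alive and finds the 0.36 sequence.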
Benchmark
A standardized test or dataset used to evaluate and compare the performance of AI models. Benchmarks provide consistent metrics that allow fair comparisons between different approaches.
Benchmark Contamination
When a model's training data inadvertently includes test data from benchmarks, leading to inflated performance scores that do not reflect true capability.
BERT
Bidirectional Encoder Representations from Transformers — a language model developed by Google that reads text in both directions simultaneously. BERT excels at understanding language rather than generating it.
Bi-Encoder
A model that independently encodes two texts into separate vectors, then compares them using a similarity metric like cosine similarity. Bi-encoders are fast because vectors can be pre-computed.
Bias in AI
Systematic errors in AI outputs that unfairly favor or disadvantage certain groups based on characteristics like race, gender, age, or socioeconomic status. Bias can originate from training data, model design, or deployment context.
Bias-Variance Tradeoff
The fundamental tension in ML between a model that is too simple (high bias, underfitting) and one that is too complex (high variance, overfitting). The goal is finding the sweet spot.
Black Box
A model or system whose internal workings are not visible or understandable to the user — you can see the inputs and outputs but not the reasoning in between. Most deep learning models are considered black boxes.
BM25
Best Matching 25 — a widely used ranking function for keyword-based information retrieval. BM25 scores documents based on query term frequency, document length, and corpus statistics.
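A compact sketch of BM25 scoring. It assumes whitespace tokenization, typical defaults for `k1` and `b`, and the IDF variant that adds 1 inside the logarithm to keep scores non-negative; production search engines differ in these details.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in docs against query; higher means more relevant."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = ["the cat sat on the mat", "dogs chase the cat", "quantum computing basics"]
scores = bm25_scores("cat mat", docs)
print(scores)
```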
Byte-Pair Encoding
A subword tokenization algorithm that starts with individual characters and iteratively merges the most frequent pairs to create a vocabulary of subword units. It balances vocabulary size with handling of rare words.
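The merge loop can be sketched on a tiny word list. This simplified version ignores end-of-word markers and byte-level handling that real tokenizers add, and the corpus is a made-up example.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn up to num_merges merges; return (merges, segmented vocabulary)."""
    # Each word starts as a tuple of single characters, with its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

corpus = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges, vocab = learn_bpe(corpus, num_merges=2)
print(merges)
print(dict(vocab))
```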
C
Capability Elicitation
Techniques for discovering and activating latent capabilities in AI models — abilities that exist but are not obvious from standard testing or usage.
Catastrophic Forgetting
The tendency of neural networks to abruptly lose previously learned information when trained on new data or tasks. New learning overwrites the weights that encoded the old knowledge.
Catastrophic Interference
When learning new information in a neural network severely disrupts previously learned knowledge. It is the underlying mechanism behind catastrophic forgetting.
Catastrophic Risk
The potential for AI systems to cause large-scale, irreversible harm to society. This includes risks from misuse (bioweapons), accidents (autonomous systems), and structural disruption (mass unemployment).
CatBoost
A gradient boosting library by Yandex that handles categorical features natively without requiring manual encoding. CatBoost also addresses prediction shift and target leakage.
Causal Inference
Statistical methods for determining cause-and-effect relationships from data, going beyond correlation to understand whether X actually causes Y.
Causal Language Model
A language model trained to predict the next token given only the preceding tokens (left-to-right). This is how GPT models are trained and is the basis for text generation.
Chain-of-Thought
A prompting technique where the model is encouraged to show its step-by-step reasoning process before arriving at a final answer. This improves accuracy on complex reasoning tasks.
Chatbot
An AI application designed to simulate conversation with human users through text or voice. Modern chatbots use LLMs to provide natural, contextually aware responses.
ChatGPT
OpenAI's consumer-facing AI chatbot powered by GPT models. ChatGPT brought LLMs to the mainstream when it launched in November 2022, reaching an estimated 100 million users within about two months.
Chinchilla Scaling
Research by DeepMind showing that many LLMs were significantly undertrained — for a given compute budget, training a smaller model on more data yields better performance.
Chunking
The process of breaking large documents into smaller pieces (chunks) before creating embeddings for use in RAG systems. Chunk size and strategy significantly impact retrieval quality.
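One simple strategy is fixed-size chunks with overlap, so text near a boundary appears in both neighbors. The sketch below counts whitespace-separated words rather than model tokens, and the default sizes are arbitrary assumptions.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = " ".join(str(i) for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
```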
CI/CD for ML
Continuous Integration and Continuous Deployment applied to machine learning — automating the testing, validation, and deployment of ML models whenever code or data changes.
Citizen Data Scientist
A business professional who creates ML models and analytics using no-code or low-code tools, without formal data science training. They bridge the gap between business and technical teams.
Classification
A type of supervised learning task where the model predicts which category or class an input belongs to. The output is a discrete label rather than a continuous value.
Claude
Anthropic's family of AI assistants known for their focus on safety, helpfulness, and honesty. Claude models are designed with Constitutional AI principles for safer, more reliable AI interactions.
CLIP
Contrastive Language-Image Pre-training — an OpenAI model trained to understand the relationship between images and text. CLIP can match images to text descriptions without being trained on specific image categories.
Closed Source AI
AI models where the architecture, weights, and training details are proprietary and not publicly available. Users access them only through APIs or products controlled by the developer.
Cloud Computing
On-demand access to computing resources (servers, storage, databases, AI services) over the internet. Cloud providers like AWS, Azure, and GCP offer scalable infrastructure without owning physical hardware.
Clustering
An unsupervised learning technique that groups similar data points together based on their characteristics, without predefined labels. The algorithm discovers natural groupings in the data.
Code Generation
The AI capability of producing functional source code from natural language descriptions, specifications, or partial code. Modern LLMs can generate code in dozens of programming languages.
Cognitive Architecture
A framework or blueprint for building AI systems that mimics aspects of human cognition, including perception, memory, reasoning, learning, and action.
Cold Start Problem
The challenge of making recommendations for new users (who have no history) or new items (which have no ratings). Cold start is a fundamental difficulty in recommendation systems.
Collaborative Filtering
A recommendation technique that predicts a user's interests based on the preferences of similar users. It assumes people who agreed in the past will agree again in the future.
Compliance
The process of ensuring AI systems meet regulatory requirements, industry standards, and organizational policies. AI compliance is becoming increasingly complex as regulations proliferate.
Compute
The computational resources (processing power, memory, time) required to train or run AI models. Compute is commonly quantified in FLOPs (floating-point operations) or GPU-hours and is a primary constraint and cost in AI development.
Compute-Optimal Training
Allocating a fixed compute budget optimally between model size and training data quantity, based on scaling law research like the Chinchilla findings.
Computer Vision
A field of AI that trains computers to interpret and understand visual information from the world — images, videos, and real-time camera feeds. It enables machines to 'see' and make decisions based on what they see.
Concept Bottleneck
A model architecture that forces predictions through a set of human-interpretable concepts. The model first predicts concepts, then uses those concepts to make the final prediction.
Concept Drift
A change in the underlying relationship between inputs and outputs over time. Unlike data drift, concept drift means the rules of the game have changed, not just the distribution of inputs.
Confidence Score
A numerical value (typically 0-1) indicating how certain a model is about its prediction. Higher scores indicate greater confidence in the output.
Confusion Matrix
A table that summarizes the performance of a classification model by showing true positives, true negatives, false positives, and false negatives. It reveals the types of errors a model makes.
Confusion Matrix Metrics
The set of performance metrics derived from the confusion matrix, such as accuracy, precision, recall, and F1 score. Each is computed from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
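A short sketch of how the four counts are tallied for a binary classifier and turned into precision, recall, and F1. The labels are made up, and returning 0.0 when a denominator is zero is a convention chosen here for simplicity.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary classifier."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from the counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

labels = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 1, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(labels, preds)
print(tp, tn, fp, fn, precision_recall_f1(tp, fp, fn))
```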
Constitutional AI
An alignment approach developed by Anthropic where AI models are guided by a set of principles (a 'constitution') that help them self-evaluate and improve their responses without relying solely on human feedback.
Constitutional AI Principles
The specific set of rules and values embedded in a Constitutional AI system that guide its self-evaluation and response generation. These principles define what 'good' behavior means.
Constrained Generation
Techniques that force LLM output to conform to specific formats, schemas, or grammars. This ensures outputs are always valid JSON, SQL, or match a defined structure.
Constraint Satisfaction
The problem of finding values for variables that satisfy a set of constraints. In AI, it is used in scheduling, planning, and configuration tasks.
Content Moderation
The process of monitoring and filtering user-generated or AI-generated content to ensure it meets platform guidelines and legal requirements. AI is increasingly used to automate content moderation.
Content-Based Filtering
A recommendation technique that suggests items similar to those a user has previously liked, based on the items' features and attributes rather than other users' behavior.
Context Distillation
A technique where the behavior of a model prompted with detailed instructions is distilled into a model that exhibits the same behavior without the instructions.
Context Management
Strategies for efficiently using an LLM's limited context window, including what information to include, how to compress it, and when to summarize or truncate.
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. It includes both the input prompt and the generated output. Larger context windows allow models to handle longer documents.
Contextual Bandits
An extension of multi-armed bandits where the agent observes context (features) before making a decision, enabling personalized choices based on the current situation.
Continual Learning
Training a model on new data or tasks over time without forgetting previously learned knowledge. Also called lifelong learning or incremental learning.
Continual Pre-Training
Extending a pre-trained model's training on new domain-specific data without starting from scratch. It adapts the model to a new domain while preserving general capabilities.
Continuous Batching
A serving technique where new requests are added to an in-progress batch as existing requests complete, maximizing GPU utilization rather than waiting for an entire batch to finish.
Contrastive Learning
A self-supervised technique where the model learns by comparing similar (positive) and dissimilar (negative) pairs of examples. It learns representations where similar items are close and different items are far apart.
Conversational AI
AI technology that enables natural, multi-turn conversations between humans and machines. It combines NLU, dialog management, and NLG to maintain coherent, contextual interactions.
Convolutional Neural Network
A type of neural network specifically designed for processing grid-like data such as images. CNNs use convolutional layers that apply filters to detect patterns like edges, textures, and shapes at different scales.
Cosine Similarity
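A metric that measures the similarity between two vectors by calculating the cosine of the angle between them. Values range from -1 (pointing in opposite directions) to 1 (pointing in the same direction), with 0 meaning the vectors are orthogonal, which is usually read as unrelated. Vector magnitude does not affect the score.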
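A minimal sketch of the computation, using plain Python lists as vectors. Raising an error for zero-length vectors is a choice made here, since the angle is undefined in that case.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))   # same direction, different length
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite
```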
Counterfactual Explanation
An explanation of an AI decision that describes what would need to change in the input for the model to produce a different output. It answers 'What if?' questions about predictions.
Cross-Encoder
A model that takes two texts as input simultaneously and outputs a relevance or similarity score. Unlike bi-encoders, cross-encoders consider the full interaction between both texts.
Cross-Entropy
A loss function commonly used in classification tasks that measures the difference between the predicted probability distribution and the actual distribution. Lower cross-entropy means better predictions.
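For a single example the loss is H(p, q) = -sum_i p_i * log(q_i), where p is the actual distribution and q the predicted one. The sketch below uses a one-hot p and made-up predictions; clipping with `eps` is a common safeguard against log(0), added here as an assumption.

```python
import math

def cross_entropy(true_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between the actual and predicted distributions."""
    return -sum(p * math.log(max(q, eps))
                for p, q in zip(true_probs, predicted_probs))

one_hot = [1.0, 0.0]                           # the true class is index 0
confident = cross_entropy(one_hot, [0.9, 0.1])
unsure = cross_entropy(one_hot, [0.6, 0.4])
print(confident, unsure)                       # the confident prediction scores lower
```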
Cross-Validation
A model evaluation technique that splits data into multiple folds, trains on some folds and tests on the held-out fold, repeating so every fold serves as the test set. It provides a robust estimate of model performance.
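The fold rotation can be sketched as an index generator. This version splits indices in order without shuffling or stratification, both of which are often added in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```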
Crowdsourcing
Using a large group of distributed workers (often through platforms like Amazon Mechanical Turk or Scale AI) to perform data annotation and labeling tasks.
CUDA
Compute Unified Device Architecture — NVIDIA's parallel computing platform that enables GPU programming for AI workloads. CUDA is the dominant software ecosystem for AI computation.
Curriculum Learning
A training strategy inspired by human education where the model is exposed to training examples in a meaningful order — starting with easier examples and gradually increasing difficulty.
D
DALL-E
A text-to-image AI model created by OpenAI that generates original images from text descriptions. DALL-E can create realistic images, art, and conceptual visualizations from natural language prompts.
Data Annotation Pipeline
An end-to-end workflow for producing labeled training data, from task design through annotator training, quality assurance, and delivery of labeled datasets.
Data Augmentation
Techniques for artificially expanding a training dataset by creating modified versions of existing data. This helps models generalize better, especially when training data is limited.
Data Drift
A change in the statistical properties of the input data over time compared to the data the model was trained on. When data drifts, model predictions become less reliable.
Data Engineering
The practice of designing, building, and maintaining the systems and infrastructure that collect, store, and prepare data for analysis and machine learning.
Data Governance
The overall management of data availability, usability, integrity, and security in an organization. It includes policies, standards, and practices for how data is collected, stored, and used.
Data Labeling
The process of assigning meaningful tags or annotations to raw data so it can be used for supervised learning. Labels tell the model what the correct answer should be for each training example.
Data Lake
A centralized repository that stores vast amounts of raw data in its native format until needed. Data lakes accept structured, semi-structured, and unstructured data at any scale.
Data Lineage
The tracking of data's origins, transformations, and movements throughout its lifecycle. Data lineage answers the question 'Where did this data come from and what happened to it?'
Data Mesh
A decentralized approach to data architecture where domain teams own and manage their own data as products, rather than centralizing all data in a single warehouse or lake.
Data Parallelism
A distributed training approach where the training data is split across multiple GPUs, each holding a complete copy of the model. Gradients are averaged across GPUs after each batch.
Data Pipeline
An automated workflow that extracts data from sources, transforms it through processing steps, and loads it into a destination for use. In ML, data pipelines ensure consistent data flow from raw sources to model training.
Data Poisoning
A security attack where malicious data is injected into a training dataset to corrupt the model's behavior. Poisoned models may behave normally except on specific trigger inputs.
Data Preprocessing
The process of cleaning, transforming, and organizing raw data into a format suitable for machine learning. This includes handling missing values, encoding categories, scaling features, and removing outliers.
Data Privacy
The right of individuals to control how their personal information is collected, used, stored, and shared. In AI, data privacy concerns arise from training data, user interactions, and model outputs.
Data Quality
The degree to which data is accurate, complete, consistent, timely, and fit for its intended use. Data quality directly impacts the reliability and performance of AI models.
Data Warehouse
A structured, organized repository of cleaned and processed data optimized for analysis and reporting. Unlike data lakes, data warehouses store data in defined schemas.
Decision Tree
A supervised learning algorithm that makes predictions by learning a series of if-then-else decision rules from the data. It creates a tree-like structure where each internal node tests a feature and each leaf provides a prediction.
Deep Fake
AI-generated media (especially video and audio) that convincingly depicts real people saying or doing things they never actually said or did. Created using deep learning techniques.
Deep Learning
A specialized subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to learn complex patterns in data. Deep learning excels at tasks like image recognition, speech processing, and natural language understanding.
Denoising
The process of removing noise from data to recover the underlying clean signal. In generative AI, denoising is the core mechanism of diffusion models.
Dense Retrieval
Information retrieval using learned vector embeddings to find semantically similar documents. Called 'dense' because most entries of each document vector are non-zero, in contrast to sparse keyword representations such as those behind BM25, where most entries are zero.
Deployment
The process of making a trained ML model available for use in production applications. Deployment involves packaging the model, setting up serving infrastructure, and establishing monitoring.
Deterministic Output
When an AI model produces the same output every time for the same input. Typically approached by setting temperature to 0 and using fixed random seeds, although hardware and batching effects can still introduce small variations.
Differential Privacy
A mathematical framework that provides provable privacy guarantees when analyzing or learning from data. It ensures that the output of any analysis is approximately the same whether or not any individual's data is included.
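One classic way to obtain such a guarantee for a counting query is the Laplace mechanism, sketched below. The function names are hypothetical, and a sensitivity of 1 assumes that adding or removing one person changes the count by at most 1.

```python
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Release a noisy count; smaller epsilon means more noise and more privacy."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
print(private_count(100, epsilon=1.0, rng=rng))
print(private_count(100, epsilon=0.1, rng=rng))  # typically much noisier
```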
Diffusion Model
A type of generative AI model that creates data by starting with random noise and gradually removing it, step by step, until a coherent output (like an image) emerges. This process is called denoising.
Digital Twin
A virtual replica of a physical system, process, or object that uses real-time data and AI to simulate, predict, and optimize the behavior of its physical counterpart.
Dimensionality Reduction
Techniques that reduce the number of features (dimensions) in a dataset while preserving the most important information. This makes data easier to visualize, speeds up training, and can improve model performance.
Distributed Training
Splitting model training across multiple GPUs or machines to handle larger models or datasets and reduce training time. Techniques include data parallelism and model parallelism.
Document Processing
AI-powered extraction and understanding of information from documents including PDFs, images, forms, and scanned papers. It combines OCR, NLP, and computer vision.
DPO
Direct Preference Optimization: a simpler alternative to RLHF that directly optimizes a language model from human preference data without needing a separate reward model. It is often more stable and easier to implement.
Dropout
A regularization technique where random neurons are temporarily disabled (dropped out) during each training step. This forces the network to not rely too heavily on any single neuron and builds redundancy.
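A minimal sketch of the idea on a list of activations. It uses the common "inverted dropout" variant, which scales surviving activations by 1/(1-p) during training so nothing needs rescaling at inference time; that scaling is an added detail, not part of the definition above.

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    """Zero each activation with probability p and rescale the survivors."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)   # dropout is disabled at inference time
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, rng=random.Random(0)))
print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, training=False))
```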
Dual Use
Technology or research that can be applied for both beneficial and harmful purposes. Most AI capabilities are inherently dual-use, creating governance challenges.
E
Early Stopping
A regularization technique where training is halted when the model's performance on validation data stops improving, even if training loss continues to decrease. It prevents overfitting by finding the optimal training duration.
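The stopping rule is usually implemented with a "patience" counter, sketched here over a precomputed list of validation losses. The loss values and the patience of 3 are illustrative assumptions; a real training loop would compute each loss after an epoch and restore the best checkpoint.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.9]
print(early_stopping_epoch(losses, patience=3))
```

Here the best validation loss occurs at epoch 2, and training halts at epoch 5 after three epochs without improvement.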
Edge Inference
Running AI models directly on local devices (phones, IoT sensors, cameras) rather than sending data to the cloud. This reduces latency, preserves privacy, and works without internet connectivity.
Elastic Weight Consolidation
A technique for continual learning that identifies which weights are important for previously learned tasks and penalizes changes to those weights during new learning.
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
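"Closer together" is typically measured with cosine similarity. The three-dimensional vectors below are made up for illustration; real embeddings usually have hundreds or thousands of dimensions and come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, not produced by any real model.
toy = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}
cat_dog = cosine_similarity(toy["cat"], toy["dog"])
cat_car = cosine_similarity(toy["cat"], toy["car"])
```

Here `cat_dog` is larger than `cat_car`, reflecting that the two animals sit closer together than the animal and the vehicle.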
Embedding Dimension
The number of numerical values in a vector embedding. Higher dimensions can capture more nuanced relationships but require more storage and computation.
Embedding Drift
Changes in embedding vector distributions over time as the underlying data, vocabulary, or user behavior shifts. Drift degrades retrieval quality in RAG and search systems.
Embedding Fine-Tuning
Adapting a pre-trained embedding model to a specific domain or task by further training it on domain-specific data, improving retrieval quality for specialized applications.
Embedding Model
A specialized model designed to convert text, images, or other data into vector embeddings. Embedding models are optimized for producing meaningful numerical representations rather than generating text.
Embedding Space
The high-dimensional geometric space in which embeddings exist. In this space, the distance and direction between points encode semantic relationships between the items they represent.
Embeddings as a Service
Cloud APIs that convert text or other data into vector embeddings without requiring users to host or manage embedding models themselves.
Emergent Behavior
Capabilities that appear in large AI models even though the models were not explicitly trained for them and smaller versions did not show them. Emergent abilities seem to appear suddenly at certain scale thresholds.
Encoder-Decoder
An architecture where the encoder converts the input into an intermediate representation (a single fixed-size vector in early designs, a sequence of vectors in transformers) and the decoder generates output from that representation. This structure is used in translation, summarization, and image captioning.
Ensemble Learning
A strategy that combines multiple models to produce better predictions than any single model alone. Ensemble methods leverage the diversity of different models to reduce errors.
Epoch
One complete pass through the entire training dataset during model training. Models typically require multiple epochs to learn effectively, with each pass refining the model's understanding.
Ethical AI
AI development practices that explicitly consider moral implications, societal impact, and human values throughout the design, development, and deployment lifecycle.
Ethical Hacking of AI
The practice of systematically testing AI systems for vulnerabilities, biases, and failure modes with the goal of improving safety and robustness before malicious actors find the same weaknesses.
ETL
Extract, Transform, Load — a data integration process that extracts data from source systems, transforms it into a usable format, and loads it into a destination system.
EU AI Act
The European Union's comprehensive regulatory framework for artificial intelligence, establishing rules based on risk levels. It categorizes AI systems from minimal to unacceptable risk with corresponding compliance requirements.
Evaluation
The systematic process of measuring an AI model's performance, safety, and reliability using various metrics, benchmarks, and testing methodologies.
Evaluation Framework
A structured system for measuring AI model performance across multiple dimensions including accuracy, safety, fairness, robustness, and user satisfaction.
Evaluation Harness
A standardized testing framework for running AI models through suites of benchmarks and evaluation tasks. It ensures consistent, reproducible evaluation across models.
Existential Risk
The risk that advanced AI systems could pose a threat to the long-term survival or flourishing of humanity. It is among the most serious concerns raised in the AI safety research community.
Expert System
An early AI system that mimics human expertise in a specific domain using a knowledge base of rules and facts. Expert systems were the dominant AI approach in the 1980s.
Explainability
The ability to understand and articulate how an AI model reaches its decisions or predictions. Explainable AI (XAI) makes the decision-making process transparent and comprehensible to humans.
Explainable AI
The subfield focused on making AI decision-making processes understandable to humans. XAI techniques provide insights into why a model made a specific prediction.
Exploding Gradient
A training problem where gradients become extremely large during backpropagation, causing weight updates to be so drastic that the model becomes unstable and training diverges.
Exploration vs Exploitation
The fundamental tradeoff in reinforcement learning between trying new actions (exploration) to discover potentially better strategies and using known good actions (exploitation) to maximize current reward.
F
F1 Score
The harmonic mean of precision and recall, providing a single metric that balances both. F1 scores range from 0 to 1, with 1 being perfect precision and recall.
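The formula is F1 = 2 · precision · recall / (precision + recall). A short sketch computing it from confusion-matrix counts; the counts used here are invented for the example.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical classifier: precision 0.8, recall 0.5.
score = f1_score(tp=8, fp=2, fn=8)
```

Because the harmonic mean is pulled toward the smaller value, this score (about 0.62) is lower than the arithmetic mean of 0.8 and 0.5.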
Fairness
The principle that AI systems should treat all individuals and groups equitably and not produce discriminatory outcomes. Multiple mathematical definitions of fairness exist, and they can sometimes conflict.
Feature Engineering
The process of selecting, transforming, and creating input variables (features) from raw data to improve model performance. It requires domain knowledge to identify what information is most useful for the model.
Feature Store
A centralized repository for storing, managing, and serving machine learning features. It ensures consistent feature computation between training and serving, and enables feature reuse across teams.
Federated Analytics
Techniques for computing analytics and insights across distributed datasets without moving or centralizing the raw data. Each participant computes locally and only shares aggregated results.
Federated Inference
Running AI model inference across multiple distributed devices or locations, rather than centralizing it in one place. Each device processes its own data locally.
Federated Learning
A decentralized training approach where a model is trained across multiple devices or organizations without sharing raw data. Each participant trains locally and only shares model updates.
Few-Shot Learning
A technique where a model learns to perform a task from only a few examples provided in the prompt. Instead of training on thousands of examples, the model generalizes from just 2-5 demonstrations.
Few-Shot Prompting
A prompt engineering technique where a small number of input-output examples are provided before the actual query, demonstrating the desired format and behavior to the model.
Fine-Tuning
The process of taking a pre-trained model and further training it on a smaller, domain-specific dataset to specialize its behavior for a particular task or domain. Fine-tuning adjusts the model's weights to improve performance on the target task.
Fine-Tuning vs RAG
The strategic decision between customizing a model's weights (fine-tuning) or providing external knowledge at inference time (RAG). Each approach has different strengths and use cases.
Flash Attention
An optimized implementation of the attention mechanism that reduces memory usage and increases speed by tiling the computation and avoiding materializing the full attention matrix in memory.
FLOPS
Floating Point Operations Per Second — a measure of computing speed that quantifies how many mathematical calculations a processor can perform each second. Used to measure AI hardware performance.
Foundation Model
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. Foundation models serve as the base upon which specialized applications are built.
Frontier Model
The most capable and advanced AI models available at any given time, typically characterized by the highest performance across multiple benchmarks. These models push the boundaries of AI capabilities.
Function Calling
A capability where an LLM can generate structured output to invoke specific functions or APIs. The model decides which function to call and what parameters to pass based on the user's request.
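On the application side, the structured output is parsed and dispatched to real code. The sketch below assumes the model returns a JSON object with `name` and `arguments` fields; that shape, the `get_weather` tool, and the `TOOLS` registry are all hypothetical, and actual providers define their own formats.

```python
import json

def get_weather(city):
    """Hypothetical tool that returns a canned answer."""
    return f"Sunny in {city}"

# Registry of functions the model is allowed to call.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output):
    """Parse the model's structured output and invoke the named function."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]  # raises KeyError for unknown tools
    return func(**call["arguments"])

# Stand-in for what a model might emit for "What's the weather in Paris?"
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
```

Looking the name up in an explicit registry, rather than evaluating model output directly, keeps the model limited to functions the developer has approved.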
G
GDPR
General Data Protection Regulation — the European Union's comprehensive data protection law that gives individuals control over their personal data and imposes strict obligations on organizations handling that data.
Gemini
Google DeepMind's family of multimodal AI models designed to understand and generate text, code, images, audio, and video. Gemini is Google's flagship AI model series.
Generalization
A model's ability to perform well on new, unseen data that was not part of its training set. Generalization is the ultimate goal of machine learning — learning patterns, not memorizing examples.
Generative Adversarial Network
A framework where two neural networks compete — a generator creates fake data and a discriminator tries to tell real from fake. This adversarial process drives both networks to improve, producing increasingly realistic outputs.
Generative AI
AI systems that can create new content — text, images, music, code, video — rather than just analyzing or classifying existing data. These models learn patterns from training data and generate novel outputs that resemble the original data.
GGUF
A file format for storing quantized language models designed for efficient CPU inference. GGUF is the standard format used by llama.cpp and is popular for local LLM deployment.
Google DeepMind
Google's AI research division, formed by merging Google Brain and DeepMind in 2023. Known for AlphaGo, AlphaFold, and the Gemini model family.
GPT
Generative Pre-trained Transformer — a family of large language models developed by OpenAI. GPT models are trained to predict the next token in a sequence and can generate coherent, contextually relevant text across many tasks.
GPU
Graphics Processing Unit — originally designed for rendering graphics, GPUs excel at the parallel mathematical operations needed for training and running AI models. They are the primary hardware for modern AI.
Gradient Accumulation
A technique that simulates larger batch sizes by accumulating gradients over multiple forward passes before performing a single weight update. This enables large effective batch sizes on limited hardware.
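The sketch below checks the core claim on a one-parameter model: averaging gradients over equally sized micro-batches gives the same gradient as one large batch. The model, data, and function names are assumptions chosen to keep the example small.

```python
def mse_gradient(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_gradient(w, batch, micro_batch_size):
    """Average the gradients of equally sized micro-batches.

    Assumes len(batch) is divisible by micro_batch_size.
    """
    total = 0.0
    n_micro = 0
    for i in range(0, len(batch), micro_batch_size):
        total += mse_gradient(w, batch[i:i + micro_batch_size])
        n_micro += 1
    return total / n_micro

# Hypothetical (x, y) pairs.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = mse_gradient(0.5, data)
accumulated = accumulated_gradient(0.5, data, micro_batch_size=2)
```

Only one micro-batch needs to fit in memory at a time, which is where the hardware saving comes from.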
Gradient Boosting
An ensemble technique that builds models sequentially, where each new model focuses on correcting the errors made by previous models. It combines many weak learners into a single strong learner.
Gradient Clipping
A technique that limits the magnitude of gradients during training to prevent exploding gradients. If a gradient (or the overall gradient norm) exceeds a set threshold, it is clipped or scaled down to that threshold.
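A minimal sketch of the norm-based variant, which rescales the whole gradient vector so its direction is preserved. The function name and threshold are illustrative.

```python
import math

def clip_by_norm(grads, max_norm):
    """Scale the gradient vector down if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)    # norm 5 is scaled to 1
untouched = clip_by_norm([0.3, 0.4], max_norm=1.0)  # norm 0.5 is left alone
```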
Gradient Descent
An optimization algorithm used to minimize the error (loss) of a model by iteratively adjusting parameters in the direction that reduces the loss most quickly. It is the primary method for training machine learning models.
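The update rule is w ← w − learning_rate · gradient. The sketch below applies it to a simple one-parameter loss, (w − 3)², chosen only for illustration.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

# Loss (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step; a rate above 1.0 would make this particular loss diverge.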
Graph Neural Network
A type of neural network designed to operate on graph-structured data (nodes and edges). GNNs learn representations of nodes, edges, or entire graphs by aggregating information from neighbors.
GraphRAG
A RAG approach that uses knowledge graphs instead of, or alongside, vector search for retrieval. It combines graph traversal with LLM generation to answer questions requiring multi-hop reasoning.
Greedy Decoding
A simple text generation strategy where the model always selects the most probable next token at each step. It is fast but can produce repetitive or suboptimal outputs.
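A small sketch of the selection loop. The bigram table stands in for a real model and is entirely made up; `<eos>` is an assumed end-of-sequence token.

```python
# Hypothetical next-token probabilities keyed by the previous token.
TOY_BIGRAMS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"<eos>": 0.9, "down": 0.1},
}

def toy_model(tokens):
    """Return next-token probabilities given the tokens so far."""
    return TOY_BIGRAMS[tokens[-1]]

def greedy_decode(next_token_probs, start, max_tokens=10, end_token="<eos>"):
    """Always append the single most probable next token."""
    tokens = [start]
    for _ in range(max_tokens):
        probs = next_token_probs(tokens)
        token = max(probs, key=probs.get)
        if token == end_token:
            break
        tokens.append(token)
    return tokens

sentence = greedy_decode(toy_model, "the")
```

Because the choice is deterministic, the same prompt always yields the same output, which is one reason sampling-based strategies are often preferred for open-ended text.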
Grounding
The practice of connecting AI model outputs to verifiable sources of information, ensuring responses are based on factual data rather than the model's potentially unreliable internal knowledge.
GRU
Gated Recurrent Unit — a simplified version of LSTM that uses fewer gates and parameters while achieving similar performance on many sequence tasks. It is often faster to train than an LSTM.
Guardrail Model
A separate, specialized AI model that monitors the inputs and outputs of a primary LLM to detect and block harmful, off-topic, or policy-violating content.
Guardrails
Safety mechanisms and constraints built into AI systems to prevent harmful, inappropriate, or off-topic outputs. Guardrails can operate at the prompt, model, or output level.
H
Hallucination
When an AI model generates information that sounds plausible and confident but is factually incorrect, fabricated, or not grounded in its training data or provided context. The model essentially 'makes things up'.
Hallucination Detection
Methods and systems for automatically identifying when an AI model has generated false or unsupported information. Detection can compare outputs against source documents or use consistency checks.
Hallucination Rate
The frequency at which an AI model generates incorrect or fabricated information. It is typically measured as a percentage of responses containing hallucinations.
Hardware Acceleration
Using specialized hardware (GPUs, TPUs, FPGAs, ASICs) to speed up AI computation compared to general-purpose CPUs. Accelerators are optimized for the specific math operations used in neural networks.
Homomorphic Encryption
A form of encryption that allows computation on encrypted data without decrypting it first. The results, when decrypted, match what would have been computed on the plaintext.
Hugging Face
The leading open-source platform for sharing and discovering AI models, datasets, and applications. Hugging Face hosts the Transformers library and a community hub with thousands of pre-trained models.
Human Evaluation
Using human judges to assess AI model quality on subjective dimensions like helpfulness, coherence, creativity, and safety that automated metrics cannot fully capture.
Human-in-the-Loop
A system design where humans are integrated into the AI workflow to provide oversight, make decisions, correct errors, or handle edge cases that the AI cannot reliably manage alone.
Hybrid Search
A search approach that combines keyword-based (lexical) search with semantic (vector) search to get the benefits of both — exact matching for specific terms and meaning-based matching for conceptual queries.
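The two result lists have to be merged somehow. Reciprocal rank fusion is one widely used option, sketched below; it is not the only way to combine lexical and vector results, and the document IDs and the constant `k=60` are assumptions for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each document scores 1 / (k + rank) per list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that appear in both lists (`d1`, `d3`) rise above those found by only one retriever.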
Hyperparameter
Settings that are configured before training begins and control how the model learns, as opposed to parameters which are learned during training. Examples include learning rate, batch size, and number of layers.
Hyperparameter Tuning
The process of systematically searching for the best combination of hyperparameters for a model. Since hyperparameters are set before training, finding optimal values requires experimentation.
I
Image Classification
A computer vision task that assigns a category label to an entire image. The model determines what the main subject of the image is from a predefined set of categories.
Image Segmentation
A computer vision task that assigns a label to every pixel in an image, dividing it into meaningful regions. It identifies not just what objects are present but their exact shapes and boundaries.
Impact Assessment
A systematic evaluation of the potential effects an AI system may have on individuals, groups, and society. Impact assessments consider both positive outcomes and potential harms.
In-Context Learning
An LLM's ability to learn new tasks from examples or instructions provided within the prompt, without any weight updates or fine-tuning. The model adapts its behavior based on the context given.
Incident Response for AI
Procedures for identifying, containing, and resolving failures or harmful behaviors in deployed AI systems. AI incident response adapts traditional IT incident management for AI-specific challenges.
Inference
The process of using a trained model to make predictions on new, previously unseen data. Inference is what happens when an AI model is deployed and actively serving results to users.
Inference Optimization
Techniques for making AI model inference faster, cheaper, and more efficient. This includes quantization, batching, caching, speculative decoding, and hardware optimization.
Information Extraction
The task of automatically extracting structured information (entities, relationships, events) from unstructured text documents.
Instruction Dataset
A curated collection of instruction-response pairs used to train or fine-tune models to follow human instructions. The quality and diversity of this dataset directly shapes model behavior.
Instruction Following
An LLM's ability to accurately understand and execute user instructions, including complex multi-step directives with specific constraints on format, tone, length, and content.
Instruction Hierarchy
A framework for prioritizing different levels of instructions when they conflict — system prompts typically override user prompts, which override context from retrieved documents.
Instruction Tuning
A fine-tuning approach where a model is trained on a dataset of instruction-response pairs, teaching it to follow human instructions accurately. This transforms a text-completion model into a helpful assistant.
Instructor Embedding
An embedding approach where you provide instructions that describe the task alongside the text, producing task-specific embeddings from a single model.
Interpretability
The degree to which a human can understand the internal mechanisms and reasoning process of a machine learning model. More interpretable models allow deeper inspection of how they work.
K
K-Means
A clustering algorithm that partitions data into K groups by iteratively assigning each data point to the nearest cluster center and then recalculating the centers. K must be specified in advance.
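The assign-then-recompute loop can be sketched in a few lines. The points and starting centers below are invented; practical implementations also choose initial centers carefully and stop when assignments no longer change.

```python
def k_means(points, centers, iterations=10):
    """Alternate between assigning points and recomputing centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign the point to the nearest center (squared distance).
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Move each center to the mean of its points; keep empty ones in place.
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

# Hypothetical 2-D data with two obvious groups, K = 2.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = k_means(points, centers=[(0, 0), (10, 10)])
```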
Knowledge Base
A structured or semi-structured collection of information used by AI systems to retrieve factual data. In the context of RAG, it typically refers to the document collection that the system can search.
Knowledge Cutoff
The date after which a language model has no training data. The model cannot reliably answer questions about events that occurred after its knowledge cutoff.
Knowledge Distillation
A model compression technique where a smaller 'student' model is trained to mimic the behavior of a larger 'teacher' model. The student learns not just correct answers but the teacher's nuanced probability distributions.
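The "nuanced probability distributions" are usually exposed by softening the teacher's outputs with a temperature. The sketch below shows that step and a simple cross-entropy loss against the softened targets; the logits and the temperature of 2.0 are assumptions, and full training recipes typically mix this with the ordinary label loss.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher values flatten the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))

# Hypothetical logits for a three-class problem.
teacher_logits = [4.0, 1.0, 0.5]
sharp = softmax(teacher_logits, temperature=1.0)
soft = softmax(teacher_logits, temperature=4.0)
```

The softened distribution keeps the teacher's ranking of classes while giving the less likely classes enough probability mass for the student to learn from.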
Knowledge Graph
A structured representation of real-world entities and the relationships between them, stored as a network of nodes (entities) and edges (relationships). Knowledge graphs capture factual information in a machine-readable format.
KV Cache
Key-Value Cache — a mechanism that stores previously computed attention key and value vectors during autoregressive generation, avoiding redundant computation for tokens already processed.
L
Labeling Platform
Software tools that manage the process of data annotation at scale, including task distribution, quality control, annotator management, and labeling interfaces.
LangChain
A popular open-source framework for building applications powered by language models. It provides tools for prompt management, chains, agents, memory, and integration with external tools and data sources.
Large Language Model
A type of AI model trained on massive amounts of text data that can understand and generate human-like text. LLMs use transformer architecture and typically have billions of parameters, enabling them to perform a wide range of language tasks.
Latency
The time delay between sending a request to an AI model and receiving the response. In ML systems, latency includes data preprocessing, model inference, and network transmission time.
Latent Space
A compressed, lower-dimensional representation of data learned by a model. Points in latent space capture the essential features of the data, and nearby points represent similar data items.
Layer Normalization
A normalization technique that normalizes the inputs across the features for each individual example (rather than across the batch). It stabilizes training in transformers and RNNs.
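A minimal sketch for a single example's feature vector. It omits the learned scale and shift parameters that real layers usually include; `eps` is a small constant assumed here for numerical stability.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one example's features to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```

Because the statistics come from the example itself, the result does not depend on batch size or on the other examples in the batch.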
Leaderboard
A ranking of AI models by performance on specific benchmarks. Leaderboards drive competition and provide quick comparisons but can encourage gaming and narrow optimization.
Learning Rate
A hyperparameter that controls how much the model's weights are adjusted in response to errors during each training step. It determines the size of the steps taken during gradient descent optimization.
LightGBM
Light Gradient Boosting Machine — Microsoft's gradient boosting framework optimized for speed and efficiency. LightGBM uses histogram-based splitting and leaf-wise growth for faster training.
LIME
Local Interpretable Model-agnostic Explanations — a technique that explains individual predictions by approximating the complex model locally with a simple, interpretable model.
Linear Regression
The simplest regression algorithm that models the relationship between input features and a continuous output as a straight line (or hyperplane in multiple dimensions). It minimizes the sum of squared errors.
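For a single input feature, the least-squares line has a closed-form solution. The sketch below computes it directly; the data points are invented and lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```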
Llama
A family of open-weight large language models released by Meta. Llama models are available for download and customization, which has made them one of the most widely adopted open-weight LLM families.
LLM-as-Judge
Using a large language model to evaluate the quality of another model's outputs, replacing or supplementing human evaluators. The judge LLM scores responses on various quality dimensions.
Logistic Regression
A classification algorithm that uses the sigmoid function to predict the probability of a binary outcome. Despite its name containing 'regression,' it is used for classification tasks.
Long Context
The ability of AI models to process very large amounts of input text — typically 100K tokens or more — enabling analysis of entire books, codebases, or document collections.
Long Short-Term Memory
A type of recurrent neural network designed to learn long-term dependencies through special gating mechanisms that control information flow. LSTMs address the vanishing gradient problem of standard RNNs.
LoRA
Low-Rank Adaptation — a parameter-efficient fine-tuning technique that freezes the original model weights and adds small trainable low-rank matrices to selected layers. It dramatically reduces the compute and memory needed for fine-tuning.
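A sketch of the arithmetic, using plain lists in place of tensors. The frozen weight `W` is left untouched and the update is the product of two small matrices `B` and `A`; the shapes, values, and helper names here are assumptions for illustration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x); only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

def trainable_params(d_out, d_in, rank):
    """Parameters in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)

W = [[1, 0, 0], [0, 1, 0]]   # frozen 2 x 3 weight
A = [[1, 1, 1]]              # rank-1 down-projection (1 x 3)
B_zero = [[0], [0]]          # B starts at zero, so the model is unchanged
x = [1, 2, 3]
initial = lora_forward(W, A, B_zero, x)
adapted = lora_forward(W, A, [[1], [2]], x)
```

For a 4096 × 4096 layer, a rank of 8 trains 65,536 values instead of roughly 16.8 million.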
Loss Function
A mathematical function that measures how far a model's predictions are from the actual correct values. The goal of training is to minimize this loss function, making predictions as accurate as possible.
Low-Code AI
AI development platforms that require minimal coding, typically providing visual interfaces with optional code customization for more advanced users.
M
Machine Learning
A subset of AI where systems learn patterns from data and improve their performance over time without being explicitly programmed for every scenario. ML algorithms build mathematical models from training data to make predictions or decisions.
Machine Translation
The use of AI to automatically translate text or speech from one language to another. Modern neural machine translation uses transformer models and achieves near-human quality for many language pairs.
Masked Language Model
A training approach where random tokens in the input are replaced with a special [MASK] token and the model learns to predict the original tokens from context. This is how BERT was pre-trained.
Meta-Learning
An approach where models are trained across many tasks so they can quickly adapt to new tasks with minimal data. Also called learning to learn.
Minimum Viable AI
The simplest AI solution that delivers enough value to validate a use case. It prioritizes fast learning over comprehensive features, following lean startup principles.
Mistral
A French AI company and its family of efficient, high-performance open-weight language models. Mistral models are known for strong performance relative to their size.
Mixed Precision Training
Training neural networks using a combination of 16-bit and 32-bit floating-point numbers to speed up computation and reduce memory usage while maintaining model accuracy.
Mixture of Agents
An architecture where multiple different AI models collaborate on a task, with each model contributing its strengths. A routing or aggregation layer combines their outputs.
Mixture of Depths
A transformer architecture where different tokens use different numbers of layers, allowing the model to spend more computation on complex tokens and less on simple ones.
Mixture of Experts
An architecture where a model consists of multiple specialized sub-networks (experts) and a gating mechanism that routes each input to only the most relevant experts. Only a fraction of the total parameters are active per input.
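The routing step can be sketched as top-k gating: pick the k experts with the highest gate scores and mix their outputs. The experts below are trivial functions and the gate logits are fixed by hand; in a real model both are learned networks.

```python
import math

def top_k_route(gate_logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in top)
    exps = {i: math.exp(gate_logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and combine their weighted outputs."""
    weights = top_k_route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Hypothetical experts; only two of the four run for this input.
experts = [lambda x: x + 1, lambda x: x * 10, lambda x: x - 1, lambda x: 0]
weights = top_k_route([2.0, 0.0, 2.0, -1.0])
output = moe_forward(3, experts, [2.0, 0.0, 2.0, -1.0])
```

The unselected experts are never evaluated, which is how total parameter count can grow without a matching increase in compute per input.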
Mixture of Modalities
AI architectures that natively process and generate multiple data types within a single unified model, rather than using separate models connected together.
MLOps
Machine Learning Operations — the set of practices that combine ML, DevOps, and data engineering to deploy and maintain ML models in production reliably and efficiently.
Model Card
A standardized document that accompanies a machine learning model, describing its intended use, performance metrics, limitations, training data, ethical considerations, and potential biases.
Model Collapse
A phenomenon where AI models trained on AI-generated content progressively lose quality and diversity, eventually producing repetitive, low-quality outputs. Each generation of model degrades further.
Model Context Protocol
An open protocol that standardizes how AI models connect to external tools, data sources, and services. MCP provides a universal interface for LLMs to access context from any compatible system.
Model Distillation Pipeline
An end-to-end workflow for transferring knowledge from a large teacher model to a smaller student model, including data generation, training, evaluation, and deployment.
Model Drift
The gradual degradation of a model's predictive performance over time as the real-world environment changes. Model drift can be caused by data drift, concept drift, or both.
Model Evaluation Pipeline
An automated system that runs a comprehensive suite of evaluations on AI models, generating reports on accuracy, safety, bias, robustness, and other quality dimensions.
Model Governance
The policies, processes, and tools for managing AI models throughout their lifecycle — from development through deployment to retirement. It ensures models remain compliant, fair, and performant.
Model Hub
A platform for hosting, discovering, and sharing pre-trained AI models. Model hubs provide standardized access to thousands of models across different tasks and architectures.
Model Interpretability Tool
Software tools that help understand how ML models make predictions, including feature importance, attention visualization, counterfactual explanations, and decision path analysis.
Model Merging
Combining the weights of multiple fine-tuned models into a single model that inherits capabilities from all source models, without additional training.
Model Monitoring
The practice of continuously tracking an ML model's performance, predictions, and input data in production to detect degradation, drift, or anomalies after deployment.
Model Parallelism
A distributed training approach where the model itself is split across multiple GPUs, with each GPU holding and computing a different portion of the model.
Model Registry
A centralized repository for storing, versioning, and managing trained ML models along with their metadata (metrics, parameters, lineage). It serves as the system of record for models.
Model Serving
The infrastructure and process of deploying trained ML models to production where they can receive requests and return predictions in real time. It includes scaling, load balancing, and version management.
Model Size
The number of parameters in a model, typically expressed in millions (M) or billions (B). Model size correlates loosely with capability but also determines compute and memory requirements.
Model Weights
The collection of all learned parameter values in a neural network. Model weights are what you download when you get a pre-trained model — they encode everything the model learned.
Momentum
An optimization technique that accelerates gradient descent by accumulating a velocity vector in the direction of persistent gradients, helping overcome local minima and noisy gradients.
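A sketch of the classic update, in which the velocity is a decaying sum of past gradients. The loss (w − 3)² and the values of `learning_rate` and `beta` are chosen only for illustration.

```python
def momentum_descent(grad, w0, learning_rate=0.1, beta=0.9, steps=500):
    """Gradient descent with a velocity term that accumulates gradients."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        velocity = beta * velocity + grad(w)
        w -= learning_rate * velocity
    return w

# Loss (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = momentum_descent(lambda w: 2 * (w - 3), w0=0.0)
```

Setting `beta` to 0 recovers plain gradient descent; values near 1 give past gradients more influence.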
Multi-Agent System
An architecture where multiple AI agents collaborate, each with specialized roles or capabilities, to accomplish complex tasks that no single agent could handle alone.
Multi-Armed Bandit
A simplified reinforcement learning problem where an agent must choose between multiple options (arms) with unknown payoffs, balancing exploration of new options with exploitation of known good ones.
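Epsilon-greedy is one simple strategy for this balance: explore a random arm with probability epsilon, otherwise pull the arm with the best estimated payoff. The payout probabilities, step count, and seed below are assumptions for the example.

```python
import random

def epsilon_greedy(payout_probs, steps, epsilon, rng):
    """Simulate a bandit; return (pull counts, estimated values) per arm."""
    n_arms = len(payout_probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental running mean of the rewards seen for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

# Hypothetical arms; the agent does not know these probabilities.
counts, values = epsilon_greedy([0.2, 0.5, 0.8], steps=5000, epsilon=0.1,
                                rng=random.Random(42))
```

Over enough steps the arm with the highest true payoff should receive most of the pulls, while the occasional random pulls keep the other estimates from going stale.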
Multi-Head Attention
An extension of attention where multiple attention mechanisms (heads) run in parallel, each learning to focus on different types of relationships in the data. The outputs are then combined.
Multilingual AI
AI models capable of understanding and generating text in multiple languages. Modern LLMs often support 50-100+ languages, though performance varies significantly across languages.
Multimodal AI
AI systems that can process and generate multiple types of data — text, images, audio, video — within a single model. Multimodal models understand the relationships between different data types.
Multimodal Embedding
Embeddings that map different data types (text, images, audio) into the same vector space, enabling cross-modal search and comparison.
Multimodal RAG
Retrieval-augmented generation that works across multiple data types — retrieving and reasoning over text, images, tables, and charts to answer questions that require multimodal understanding.
Multimodal Search
Search systems that can query across different data types — finding images with text, videos with audio descriptions, or documents that contain specific visual elements.
N
Named Entity Recognition
The NLP task of identifying and classifying named entities in text into predefined categories such as person names, organizations, locations, dates, monetary values, and more.
Narrow AI
AI systems designed and trained for a specific task or narrow set of tasks. All current AI systems are narrow AI — they excel in their domain but cannot generalize outside it.
Natural Language Generation
The AI capability of producing human-readable text from structured data, internal representations, or prompts. NLG is the output side of language AI — turning machine understanding into human words.
Natural Language Inference
The NLP task of determining the logical relationship between two sentences — whether one entails, contradicts, or is neutral with respect to the other.
Natural Language Processing
The branch of AI that deals with the interaction between computers and human language. NLP enables machines to read, understand, generate, and make sense of human language in a useful way.
Natural Language Understanding
The ability of an AI system to comprehend the meaning, intent, and context of human language, going beyond surface-level word matching to grasp semantics, pragmatics, and implied meaning.
Neural Architecture Search
An automated technique for finding optimal neural network architectures by searching through a vast space of possible designs. NAS automates architecture decisions that normally require expert intuition.
Neural Network
A computing system inspired by the biological neural networks in the human brain. It consists of interconnected nodes (neurons) organized in layers that process information and learn to recognize patterns.
Neuro-Symbolic AI
Approaches that combine neural networks (pattern recognition, learning from data) with symbolic AI (logical reasoning, knowledge representation) to get the strengths of both.
No-Code AI
AI platforms that allow users to build, train, and deploy machine learning models without writing any code, using visual interfaces and drag-and-drop tools.
Noise
Random variation or errors in data that do not represent true underlying patterns. In deep learning, noise can also refer to the random input used in generative models.
O
Object Detection
A computer vision task that identifies and locates specific objects within an image or video, providing both the object class and its position (usually as a bounding box).
Observability
The ability to understand the internal state and behavior of an AI system through its external outputs, including logging, tracing, and monitoring of LLM calls and agent actions.
Online Learning
A training paradigm where the model updates continuously as new data arrives, one example at a time (or in small batches), rather than training on a fixed dataset.
ONNX
Open Neural Network Exchange — an open format for representing machine learning models that enables interoperability between different ML frameworks and deployment targets.
Ontology
A formal representation of knowledge within a domain that defines concepts, categories, properties, and the relationships between them. It provides a shared vocabulary and structure for organizing information.
Open Source AI
AI models and tools released with open licenses that allow anyone to use, modify, and distribute them. Open-source AI democratizes access and enables community-driven improvement.
OpenAI
The AI research company that created the GPT series, ChatGPT, DALL-E, and Whisper. Originally founded as a nonprofit in 2015, OpenAI became one of the most prominent AI companies after launching ChatGPT in late 2022.
Optical Character Recognition
Technology that converts images of text (typed, handwritten, or printed) into machine-readable digital text. Modern OCR uses deep learning for high accuracy even on difficult inputs.
Orchestration
The coordination and management of multiple AI components, tools, and services to accomplish complex workflows. Orchestration handles routing, sequencing, error handling, and resource allocation.
Overfitting
When a model learns the training data too well — including its noise and random fluctuations — and performs poorly on new, unseen data. The model essentially memorizes rather than generalizes.
Overfitting Prevention
The collection of techniques used to ensure a model generalizes well to unseen data rather than memorizing training examples. Includes regularization, dropout, early stopping, and data augmentation.
P
Parallel Computing
Processing multiple computations simultaneously rather than sequentially. Parallel computing is fundamental to AI training and inference, which involve massive matrix operations.
Parallel Function Calling
The ability of an LLM to invoke multiple tool calls simultaneously in a single response, rather than sequentially. This enables faster task completion for independent operations.
Parameter
Any learnable value in a machine learning model that is adjusted during training. Parameters include weights and biases in neural networks. Model size is often described by parameter count.
Perceptron
The simplest form of a neural network — a single neuron that takes weighted inputs, sums them, and applies an activation function to produce an output. It is the fundamental building block of neural networks.
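As a minimal sketch of the definition above, the following implements a single perceptron with a step activation. The weights and bias are hand-picked for illustration (an assumption, not learned values) so that the neuron computes logical AND.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Hand-picked parameters (illustrative assumption) that realize logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, perceptron([a, b], AND_WEIGHTS, AND_BIAS))
```

In practice the weights and bias would be learned from labeled examples rather than set by hand.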
Perplexity
A metric that measures how well a language model predicts text. Lower perplexity indicates the model is less 'surprised' by the text, meaning it can predict the next token more accurately.
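One way to make this concrete: perplexity is the exponential of the average negative log-probability the model assigned to each actual token. The sketch below assumes you already have those per-token probabilities; the function name and sample values are illustrative.

```python
import math


def perplexity(token_probs):
    """Exponential of the average negative log-probability per token.

    token_probs: probabilities (each > 0) the model gave to the tokens
    that actually occurred in the text.
    """
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


if __name__ == "__main__":
    confident = [0.9, 0.8, 0.95]   # model predicted the text well
    uncertain = [0.2, 0.1, 0.25]   # model was frequently "surprised"
    print(perplexity(confident), perplexity(uncertain))
```

A model that assigns probability 1/4 to every token has a perplexity of 4, as if it were choosing uniformly among four options at each step.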
Pinecone
A managed vector database service designed for AI applications. Pinecone handles the infrastructure complexity of storing, indexing, and querying high-dimensional vectors at scale.
Planning
An AI agent's ability to break down complex goals into a sequence of steps and determine the best order of actions to accomplish a task. Planning involves reasoning about dependencies, priorities, and contingencies.
Playground
An interactive web interface where users can experiment with AI models by adjusting parameters, testing prompts, and seeing results in real time without writing code.
Positional Encoding
A technique used in transformers to inject information about the position of each token in a sequence. Since transformers process all tokens in parallel, they need explicit position information.
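One widely used scheme is the fixed sinusoidal encoding from the original transformer paper, sketched below. Other schemes (learned or rotary embeddings, for example) exist; the function name here is illustrative.

```python
import math


def sinusoidal_encoding(position, d_model):
    """Return the d_model-dimensional sinusoidal encoding for one position.

    Even indices use sine and odd indices use cosine; each sine/cosine
    pair shares one frequency.
    """
    encoding = []
    for i in range(d_model):
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


if __name__ == "__main__":
    for pos in range(3):
        print(pos, [round(v, 3) for v in sinusoidal_encoding(pos, 4)])
```

The resulting vector is added to the token embedding at the same position, so identical tokens at different positions get different inputs.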
Pre-training
The initial phase of training a model on a large, general-purpose dataset before specializing it for specific tasks. Pre-training gives the model broad knowledge and capabilities.
Precision
Of all the items the model predicted as positive, the proportion that were actually positive. Precision measures how trustworthy the model's positive predictions are.
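A minimal sketch for binary labels (1 = positive, 0 = negative): precision is true positives divided by all predicted positives. The function name and sample labels are illustrative.

```python
def precision(y_true, y_pred):
    """True positives divided by all predicted positives (binary labels)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    # Convention assumed here: return 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 1, 1, 0, 0]
    print(precision(y_true, y_pred))  # 2 true positives out of 3 predicted positives
```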
Preference Optimization
Training techniques that directly optimize models based on human preference data, where humans indicate which of two model outputs they prefer.
Principal Component Analysis
A dimensionality reduction technique that transforms data into a new coordinate system where the first axis captures the most variance, the second axis the next most, and so on.
Privacy-Preserving ML
Machine learning techniques that train models or make predictions while protecting the privacy of individual data points. Includes federated learning, differential privacy, and homomorphic encryption.
Prompt Attack Surface
The total set of potential vulnerabilities in an LLM application that can be exploited through prompt-based attacks, including injection, leaking, and jailbreaking vectors.
Prompt Caching
A technique that stores and reuses the processed form of frequently used prompt prefixes, avoiding redundant computation. It speeds up inference and reduces costs for repeated prompts.
Prompt Chaining
A technique where the output of one LLM call becomes the input for the next, creating a pipeline of prompts that together accomplish a complex task.
Prompt Compression
Techniques for reducing the token count of prompts while preserving their essential meaning, enabling more efficient use of context windows and reducing API costs.
Prompt Engineering
The practice of designing and optimizing input prompts to get the best possible output from AI models. It involves crafting instructions, providing examples, and structuring queries to guide the model toward desired responses.
Prompt Injection
A security vulnerability where malicious input is crafted to override or manipulate an LLM's system prompt or instructions, causing it to behave in unintended ways.
Prompt Injection Defense
Techniques and strategies for protecting LLM applications from prompt injection attacks, including input sanitization, output filtering, and architectural defenses.
Prompt Leaking
When a user extracts an application's hidden system prompt through carefully crafted queries. Prompt leaking can reveal proprietary instructions, business logic, and safety configurations.
Prompt Library
A curated collection of tested, optimized prompts organized by use case. Prompt libraries accelerate development by providing proven starting points for common tasks.
Prompt Management
The practice of versioning, testing, and managing prompts used in LLM applications. It treats prompts as code that needs proper lifecycle management.
Prompt Optimization
Systematic techniques for improving prompt effectiveness, including automated prompt search, A/B testing of prompt variants, and iterative refinement based on output quality metrics.
Prompt Template
A pre-defined structure for formatting prompts to AI models, with placeholders for dynamic content. Templates ensure consistent, optimized prompt formatting across applications.
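As a small illustration, Python's standard-library `string.Template` can serve as a prompt template. The template text, placeholder names, and `render_prompt` helper below are hypothetical examples, not part of any particular framework.

```python
from string import Template

# Placeholders ($doc_type, $max_sentences, $text) are filled in at request time.
SUMMARY_PROMPT = Template(
    "Summarize the following $doc_type in $max_sentences sentences:\n\n$text"
)


def render_prompt(doc_type, max_sentences, text):
    """Fill the template; substitute() raises KeyError if a value is missing."""
    return SUMMARY_PROMPT.substitute(
        doc_type=doc_type, max_sentences=max_sentences, text=text
    )


if __name__ == "__main__":
    print(render_prompt("email", 2, "Hi team, the launch moved to Friday."))
```

Keeping the fixed wording in one place means every call site sends the same tested instructions and only the dynamic content varies.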
Prompt Tuning
A parameter-efficient fine-tuning technique that prepends learnable 'soft prompt' tokens to the input while keeping the main model weights frozen. Only the soft prompt parameters are trained.
Prompt Versioning
Tracking different versions of prompts over time, including changes, performance metrics, and rollback capabilities. Essential for managing prompts in production AI applications.
Pruning
A model compression technique that removes unnecessary or redundant weights, neurons, or layers from a trained neural network. Like pruning a plant, it removes parts that are not contributing to overall health.
Q
QLoRA
Quantized Low-Rank Adaptation — combines LoRA with quantization to further reduce memory requirements for fine-tuning. It quantizes the base model to 4-bit precision while training LoRA adapters in higher precision.
Quantization
The process of reducing the precision of a model's numerical weights (e.g., from 32-bit to 8-bit or 4-bit), making the model smaller and faster while accepting a small trade-off in accuracy.
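The sketch below shows one simple variant, symmetric 8-bit quantization with a single shared scale. Real toolchains use more elaborate schemes (per-channel scales, zero points, calibration); the function names are illustrative.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    print(q, dequantize(q, scale))
```

The difference between the original and dequantized values is the rounding error, which is the accuracy trade-off the definition refers to.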
Quantization-Aware Training
Training a model while simulating the effects of quantization, so the model learns to maintain accuracy even when weights are later reduced to lower precision.
Question Answering
An NLP task where the model provides direct answers to questions, either from a given context passage (extractive QA) or from general knowledge (open-domain QA).
R
RAG Pipeline
The complete end-to-end system for retrieval-augmented generation, including document ingestion, chunking, embedding, indexing, retrieval, reranking, prompt construction, and generation.
Random Forest
An ensemble learning method that builds multiple decision trees during training and outputs the majority vote (classification) or average prediction (regression) of all the trees. The 'forest' of diverse trees is more robust than any single tree.
Reasoning
An AI model's ability to think logically, make inferences, draw conclusions, and solve problems that require multi-step thought. Reasoning tasks demand more than surface pattern matching: the model must chain intermediate steps to reach a conclusion.
Recall
Of all the actually positive items in the dataset, the proportion that the model correctly identified. Recall measures how completely the model finds all relevant items.
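A minimal sketch for binary labels (1 = positive, 0 = negative): recall is true positives divided by all actual positives. The function name and sample labels are illustrative.

```python
def recall(y_true, y_pred):
    """True positives divided by all actual positives (binary labels)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Convention assumed here: return 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 1]
    y_pred = [1, 0, 1, 0, 0]
    print(recall(y_true, y_pred))  # found 2 of the 4 actual positives
```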
Recommendation System
An AI system that predicts and suggests items a user might be interested in based on their behavior, preferences, and similarities to other users.
Recurrent Neural Network
A type of neural network designed for sequential data where the output at each step depends on previous steps. RNNs have a form of memory that allows them to process sequences like text, time series, and audio.
Red Teaming
The practice of systematically testing AI systems by attempting to find failures, vulnerabilities, and harmful behaviors before deployment. Red teamers actively try to break the system.
Regression
A type of supervised learning task where the model predicts a continuous numerical value rather than a discrete category. The output can be any number within a range.
Regularization
Techniques used to prevent overfitting by adding constraints or penalties to the model during training. Regularization discourages the model from becoming too complex or fitting noise in the training data.
Reinforcement Learning
A type of machine learning where an agent learns to make decisions by taking actions in an environment and receiving rewards or penalties. The agent aims to maximize cumulative reward over time through trial and error.
Reinforcement Learning from AI Feedback
A variant of RLHF where AI models (instead of humans) provide the feedback used to train reward models and align language models. RLAIF reduces the cost and scalability constraints of human feedback.
Relation Extraction
The NLP task of identifying and classifying semantic relationships between entities mentioned in text. It extracts structured facts from unstructured text.
ReLU
Rectified Linear Unit — the most commonly used activation function in deep learning. It outputs the input directly if positive, and zero otherwise: f(x) = max(0, x).
Representation Learning
The process of automatically discovering useful features or representations from raw data, rather than manually engineering them. Deep learning excels at learning hierarchical representations.
Reranking
A second-stage ranking process that takes initial search results and reorders them using a more sophisticated model. Reranking improves precision by applying deeper analysis to a smaller candidate set.
Residual Connection
A shortcut that allows the input to a layer to bypass one or more layers and be added directly to the output. This enables training of much deeper networks by giving gradients a direct path back to earlier layers.
Responsible AI
An approach to developing and deploying AI that prioritizes ethical considerations, fairness, transparency, accountability, and societal benefit throughout the entire AI lifecycle.
Responsible AI Framework
A structured set of principles, policies, processes, and tools that guide an organization's AI development and deployment to ensure ethical, fair, and beneficial outcomes.
Responsible Disclosure
The practice of reporting AI vulnerabilities, biases, or safety issues to the appropriate parties before making them public, giving developers time to fix issues before they can be exploited.
Responsible Scaling
A policy framework where AI developers commit to implementing specific safety measures as their models become more capable, with defined capability thresholds triggering additional safeguards.
Retraining
The process of training a model again on updated data to restore or improve its performance. Retraining addresses model drift and incorporates new patterns the original model did not learn.
Retrieval
The process of finding and extracting relevant information from a large collection of documents or data in response to a query. In AI systems, retrieval is often the first step before generation.
Retrieval Evaluation
Methods for measuring how well a retrieval system finds relevant documents. Key metrics include recall at K, mean reciprocal rank, and normalized discounted cumulative gain.
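The three metrics named above can be sketched for a single query as follows, assuming binary relevance (a document is either relevant or not). Mean reciprocal rank is the average of `reciprocal_rank` over many queries. Function names and document IDs are illustrative.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance nDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(retrieved[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    retrieved = ["d3", "d1", "d7", "d2"]
    relevant = {"d1", "d2"}
    print(recall_at_k(retrieved, relevant, 2))
    print(reciprocal_rank(retrieved, relevant))
    print(ndcg_at_k(retrieved, relevant, 4))
```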
Retrieval Latency
The time it takes for a retrieval system to search through stored documents or embeddings and return relevant results. Measured in milliseconds, it is a critical component of RAG system performance.
Retrieval Quality
A measure of how relevant and accurate the documents retrieved by a search or RAG system are relative to the user's query. Poor retrieval quality is a common cause of RAG failures, because the model cannot ground its answer in context it never received.
Retrieval-Augmented Fine-Tuning
Combining fine-tuning with retrieval capabilities, training a model to effectively use retrieved context. RAFT teaches the model when and how to leverage external knowledge.
Retrieval-Augmented Generation
A technique that enhances LLM outputs by first retrieving relevant information from external knowledge sources and then using that information as context for generation. RAG combines the power of search with the fluency of language models.
Retrieval-Augmented Reasoning
An advanced approach where an AI model interleaves retrieval with reasoning steps, fetching new information mid-reasoning rather than retrieving everything upfront.
Reward Hacking
When an AI system finds unintended ways to maximize its reward signal that do not align with the designer's actual goals. The system technically optimizes the metric but violates the spirit of the objective.
Reward Model
A model trained to predict how good a response is based on human preferences. In RLHF, the reward model scores outputs to guide the language model toward responses humans prefer.
Reward Modeling
Training a separate model to predict human preferences, which then serves as the reward signal for reinforcement learning. The reward model learns what humans consider 'good' responses.
Reward Shaping
The practice of designing intermediate rewards to guide a reinforcement learning agent toward desired behavior, rather than only providing reward at the final goal state.
Risk Assessment
The systematic process of identifying, analyzing, and evaluating potential risks associated with an AI system. Risk assessment considers both the likelihood and impact of potential harms.
RLHF
Reinforcement Learning from Human Feedback — a technique used to align language models with human preferences. Human raters rank model outputs, and this feedback trains a reward model that guides further training.
Robustness
The ability of an AI model to maintain reliable performance when faced with unexpected inputs, adversarial attacks, data distribution changes, or edge cases.
Role Prompting
A technique where the model is instructed to adopt a specific persona, expertise, or perspective in its responses. The assigned role shapes tone, depth, terminology, and reasoning approach.
S
Safety Evaluation
Systematic testing of AI models for harmful outputs, dangerous capabilities, and vulnerability to misuse. Safety evaluations assess risks before deployment.
Sampling Strategy
The method used to select the next token during text generation. Different strategies (greedy, top-k, top-p, temperature-based) produce different tradeoffs between quality and diversity.
Scaling Hypothesis
The theory that increasing model size, data, and compute will continue to improve AI capabilities predictably, and may eventually lead to artificial general intelligence.
Scaling Laws
Empirical findings showing predictable relationships between model performance and factors like model size (parameters), dataset size, and compute budget. Loss typically falls as a power law in each factor, so improvements are steady but diminishing.
Self-Attention
A mechanism where each element in a sequence attends to all other elements to compute a representation, determining how much focus to place on each part of the input. It is the core innovation of the transformer.
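The sketch below shows scaled dot-product self-attention in plain Python. As a simplifying assumption it skips the learned query, key, and value projections that real transformers apply (that is, Q = K = V = the input), so it illustrates only the attend-and-mix step.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x: list of token vectors, all of the same dimension d.
    Returns one output vector per token: a weighted mix of all token vectors.
    """
    d = len(x[0])
    outputs = []
    for query in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)]
        )
    return outputs


if __name__ == "__main__":
    print(self_attention([[1.0, 0.0], [0.0, 1.0]]))
```

Each output row is dominated by the tokens most similar to the query token, which is what "how much focus to place on each part of the input" means in practice.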
Self-Consistency
A decoding strategy where the model generates multiple reasoning paths for the same question and selects the answer that appears most frequently across paths. It improves accuracy on reasoning tasks.
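The selection step reduces to a majority vote over the final answers. The sketch below assumes the reasoning paths have already been sampled and their final answers extracted; the sample answers are made up for illustration.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Return the most frequent final answer among sampled reasoning paths."""
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical final answers from five sampled reasoning paths.
    sampled = ["42", "41", "42", "42", "40"]
    print(self_consistent_answer(sampled))
```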
Self-Supervised Learning
A training approach where the model generates its own labels from the data, typically by masking or hiding parts of the input and learning to predict them. No human-annotated labels are needed.
Semantic Caching
Caching LLM responses based on the semantic meaning of queries rather than exact string matching. Semantically similar questions return cached answers, reducing latency and cost.
Semantic Chunking
An intelligent chunking strategy for RAG that splits documents based on semantic meaning rather than fixed character counts, keeping coherent topics together.
Semantic Kernel
Microsoft's open-source SDK for integrating LLMs into applications written in conventional programming languages. It provides a framework for orchestrating AI capabilities alongside ordinary application code.
Semantic Router
A system that routes user queries to appropriate handlers based on semantic meaning rather than keyword matching. It acts as a dispatch layer, sending each query to the prompt, tool, or model best suited to handle it.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
Semantic Similarity
A measure of how similar in meaning two pieces of text are, regardless of the specific words used. Semantic similarity captures conceptual relatedness rather than lexical overlap.
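One common way to compute this, assuming the texts have already been converted to embedding vectors, is cosine similarity. The vectors below are made-up toy values, not output from a real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy embeddings (illustrative assumption).
    car = [0.9, 0.1, 0.0]
    automobile = [0.8, 0.2, 0.1]
    banana = [0.0, 0.1, 0.9]
    print(cosine_similarity(car, automobile), cosine_similarity(car, banana))
```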
Semantic Versioning
A versioning system (MAJOR.MINOR.PATCH) that conveys meaning about the underlying changes. In AI, it applies to model versions, API versions, and prompt versions.
Semantic Web
A vision for extending the World Wide Web so that data is machine-readable and interconnected through shared standards and ontologies. It enables automated reasoning and knowledge discovery.
Semi-Structured Data
Data that has some organizational structure but does not conform to a rigid schema like a relational database. Examples include JSON, XML, and HTML.
Sentence Embedding
A vector representation of an entire sentence or paragraph that captures its overall meaning. Sentence embeddings enable comparing the meanings of text passages.
Sentence Transformers
A framework for computing dense vector representations (embeddings) for sentences and paragraphs. Built on top of transformer models and optimized for semantic similarity tasks.
Sentiment Analysis
The NLP task of identifying and classifying the emotional tone or opinion expressed in text as positive, negative, or neutral. Advanced systems detect nuanced emotions like frustration, excitement, or sarcasm.
Sequence-to-Sequence
A model architecture that transforms one sequence into another, where the input and output can be different lengths. It uses an encoder to process input and a decoder to generate output.
Shadow AI
The use of unauthorized or unvetted AI tools by employees within an organization, without IT or security team knowledge or approval. Similar to shadow IT but specific to AI tools.
SHAP
SHapley Additive exPlanations — a method based on game theory that explains individual predictions by calculating each feature's contribution to the prediction. SHAP values are additive and consistent.
Sigmoid
An activation function that squashes input values into a range between 0 and 1, creating an S-shaped curve. It is commonly used for binary classification outputs and in certain neural network architectures.
Singularity
A hypothetical future point at which AI self-improvement becomes so rapid that it triggers an intelligence explosion, leading to changes so profound they are impossible to predict.
Softmax
A function that converts a vector of numbers into a probability distribution, where each value is between 0 and 1 and all values sum to 1. It is typically used as the final layer in classification models.
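A minimal sketch of the function: exponentiate each value, then divide by the sum. Subtracting the maximum first is a standard numerical-stability step that does not change the result.

```python
import math


def softmax(values):
    """Convert a list of numbers into a probability distribution."""
    m = max(values)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(softmax([2.0, 1.0, 0.1]))
```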
Sparse Attention
A variant of attention where each token only attends to a subset of other tokens rather than all of them, reducing computational cost from O(n²) to O(n√n) or O(n log n).
Sparse Model
A neural network where most parameters are zero or inactive for any given input. Sparse models achieve high capacity with lower computational cost by only using relevant parameters.
Sparse Retrieval
Information retrieval using traditional keyword matching and term frequency methods (like BM25). Called 'sparse' because document representations have mostly zero values.
Speculative Decoding
A technique that uses a small, fast model to draft multiple tokens ahead, then uses the large model to verify them in parallel. It speeds up inference without changing output quality.
Speech-to-Text
AI technology that converts spoken audio into written text (also called automatic speech recognition or ASR). Modern systems handle accents, background noise, and multiple speakers.
Stable Diffusion
An open-source text-to-image diffusion model that generates detailed images from text descriptions. It works in a compressed latent space, making it more efficient than pixel-level diffusion.
Stochastic
Involving randomness or probability. In ML, stochastic processes include random weight initialization, stochastic gradient descent, and probabilistic sampling during text generation.
Stochastic Gradient Descent
A variant of gradient descent that updates model parameters using a single random training example (or small batch) at each step instead of the entire dataset. Each update is much cheaper to compute, and the noise in the updates can help the optimizer escape shallow local minima.
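As a toy illustration, the sketch below fits a single slope parameter w in y ≈ w * x, updating it from one randomly ordered example at a time. The data, learning rate, and epoch count are arbitrary choices for the example.

```python
import random


def sgd_fit_slope(data, lr=0.05, epochs=50, seed=0):
    """Fit y ≈ w * x by squared error, one example per update."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        examples = list(data)
        rng.shuffle(examples)  # visit examples in a random order each epoch
        for x, y in examples:
            grad = 2 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
    print(sgd_fit_slope(data))
```

Full-batch gradient descent would average the gradient over all three examples before each update; here every example triggers its own update.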
Streaming
Delivering LLM output token-by-token as it is generated rather than waiting for the complete response. Streaming dramatically improves perceived latency and user experience.
Structured Data
Data organized in a predefined format with clear rows and columns, like spreadsheets and relational databases. Each field has a defined type and meaning.
Structured Output
The ability of an LLM to generate responses in a specific format like JSON, XML, or a defined schema. Structured output makes AI responses parseable by other software systems.
Summarization
The NLP task of condensing a longer text into a shorter version while preserving the key information and main points. Summarization can be extractive (selecting key sentences) or abstractive (generating new text).
Supervised Learning
A type of machine learning where the model is trained on labeled data — input-output pairs where the correct answer is provided. The model learns to map inputs to outputs and can then predict outputs for new, unseen inputs.
Support Vector Machine
A classification algorithm that finds the optimal hyperplane (decision boundary) that maximizes the margin between different classes. SVMs are effective in high-dimensional spaces.
Swarm Intelligence
Collective behavior emerging from the interaction of multiple simple agents that together produce sophisticated solutions. Inspired by natural swarms like ant colonies, bee hives, and bird flocks.
Symbolic AI
An approach to AI that represents knowledge using symbols and rules, and reasons by manipulating those symbols logically. Symbolic AI dominated before the deep learning era.
Synthetic Benchmark
A benchmark composed of artificially generated or carefully curated evaluation tasks designed to test specific AI capabilities, rather than using naturally occurring data.
Synthetic Data
Artificially generated data that mimics the statistical properties and patterns of real data. It is created using algorithms, simulations, or generative models rather than collected from real-world events.
Synthetic Data Generation
The process of using algorithms, rules, or generative models to create artificial datasets that statistically mirror real data. Used when real data is scarce, sensitive, or biased.
Synthetic Evaluation
Using AI models to evaluate other AI models, generating test cases and scoring outputs automatically. This scales evaluation beyond what human evaluation alone can achieve.
Synthetic Media
AI-generated or AI-manipulated content including images, audio, video, and text that can be difficult to distinguish from authentic content. This includes deepfakes and AI-generated voices.
Synthetic Reasoning Data
Training data specifically generated to improve AI reasoning capabilities, often using techniques like chain-of-thought examples, math problems, and logical puzzles.
System Prompt
Hidden instructions provided to an LLM that define its behavior, personality, constraints, and capabilities for a conversation. System prompts set the rules of engagement before the user interacts.
T
Temperature
A parameter that controls the randomness or creativity of an LLM's output. Lower temperatures (closer to 0) make outputs more deterministic and focused; higher temperatures increase randomness and creativity.
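Mechanically, temperature divides the model's raw scores (logits) before they are turned into probabilities. The sketch below assumes a positive temperature and uses made-up logits for three candidate tokens.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature (must be > 0), then apply softmax."""
    scaled = [logit / temperature for logit in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # illustrative values
    print(softmax_with_temperature(logits, 0.5))  # sharper distribution
    print(softmax_with_temperature(logits, 2.0))  # flatter distribution
```

With a low temperature the top token takes most of the probability mass; with a high temperature the distribution flattens and less likely tokens are sampled more often.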
Tensor
A multi-dimensional array of numbers — the fundamental data structure in deep learning. Scalars are 0D tensors, vectors are 1D, matrices are 2D, and higher-dimensional arrays are nD tensors.
Test Data
A separate portion of data held back from training that is used to evaluate a model's performance on unseen examples. Test data provides an unbiased estimate of how well the model will perform in the real world.
Test-Time Compute
Allocating additional computation during inference (not training) to improve output quality. Techniques include chain-of-thought, self-consistency, and iterative refinement.
Text Classification
The NLP task of assigning predefined categories or labels to text documents. It is one of the most common and commercially important NLP applications.
Text Mining
The process of deriving meaningful patterns, trends, and insights from large collections of text data using NLP and statistical techniques.
Text-to-Image
AI models that generate visual images from natural language text descriptions (prompts). This technology converts written descriptions into original images, illustrations, or photorealistic visuals.
Text-to-Speech
AI technology that converts written text into natural-sounding human speech. Modern TTS systems can generate voices with realistic intonation, emotion, and even clone specific voices.
TF-IDF
Term Frequency-Inverse Document Frequency — a statistical measure that evaluates how important a word is to a document within a collection. Words frequent in one document but rare across documents score high.
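The sketch below uses the basic unsmoothed form: term frequency within the document multiplied by the log of (number of documents / documents containing the term). Libraries often use smoothed variants, so their numbers will differ. The tiny corpus is illustrative.

```python
import math


def tf_idf(term, document, corpus):
    """Basic TF-IDF. document is a list of tokens; corpus is a list of documents."""
    tf = document.count(term) / len(document)
    docs_with_term = sum(1 for doc in corpus if term in doc)
    idf = math.log(len(corpus) / docs_with_term) if docs_with_term else 0.0
    return tf * idf


if __name__ == "__main__":
    corpus = [
        ["the", "cat", "sat"],
        ["the", "dog", "ran"],
        ["the", "cat", "ran"],
    ]
    for term in ("the", "cat", "sat"):
        print(term, tf_idf(term, corpus[0], corpus))
```

A word that appears in every document ("the") scores zero, while a word unique to one document ("sat") scores highest.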
Throughput
The number of requests or predictions a model can process in a given time period. High throughput means the system can serve many users simultaneously.
Token
The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. Text is broken into tokens before being fed into an LLM, and the model generates output one token at a time.
Token Economy
The broader economic ecosystem around AI tokens including pricing models, cost optimization strategies, and the financial dynamics of building AI-powered products.
Token Limit
The maximum number of tokens a model can process in a single request, including both the input prompt and the generated output. Exceeding the limit results in truncated input or errors.
Tokenization
The process of breaking text into smaller units (tokens) for processing by NLP models. Tokenization can split text into words, subwords, or characters depending on the method used.
Tokenization Strategy
The approach and rules for how text is split into tokens. Different strategies (word-level, subword, character-level) make different tradeoffs between vocabulary size and sequence length.
Tokenizer
A component that splits raw text into tokens and maps each token to a numerical ID that a language model can process. Different tokenizers split text differently, affecting model performance and efficiency.
Tokenizer Efficiency
How effectively a tokenizer represents text — measured by the average number of tokens needed to represent a given amount of text. More efficient tokenizers produce fewer tokens for the same content.
Tokenizer Training
The process of building a tokenizer's vocabulary from a corpus of text. The tokenizer learns which subword units to use based on frequency patterns in the training corpus.
Tokenizer Vocabulary
The complete set of tokens (words, subwords, characters) that a tokenizer can recognize and map to numerical IDs. Vocabulary size affects model efficiency and multilingual capability.
Tokenomics
The economic framework around token-based pricing for AI API services, including cost per token, input vs output pricing, and optimization strategies.
Tokenomics of AI
The economics of token-based pricing in AI APIs, including cost per input/output token, strategies for cost optimization, and the financial implications of different model choices.
Tool Use
The ability of an AI model to interact with external tools, APIs, and systems to accomplish tasks beyond text generation. Tools extend the model's capabilities to include search, calculation, code execution, and more.
Top-k Sampling
A text generation method where the model only considers the k most likely next tokens at each step, ignoring all others. This limits the pool of candidates to the most probable options.
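A minimal sketch of the filtering step, assuming the next-token distribution is given as a dictionary from token to probability (made-up values below). After filtering, the remaining probabilities are renormalized and a token is sampled from them.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {token: p / total for token, p in top}


if __name__ == "__main__":
    probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}  # illustrative
    print(top_k_filter(probs, 2))
```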
Top-p Sampling
A text generation method (also called nucleus sampling) where the model considers only the smallest set of tokens whose cumulative probability exceeds the threshold p. This balances diversity and quality.
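A minimal sketch of the filtering step, assuming the next-token distribution is given as a dictionary from token to probability (made-up values below). Tokens are added in order of probability until their cumulative probability reaches p; implementations differ on whether the boundary is inclusive.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


if __name__ == "__main__":
    probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}  # illustrative
    print(top_p_filter(probs, 0.9))
```

Unlike top-k, the number of candidates adapts to the distribution: a confident model keeps few tokens, an uncertain one keeps many.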
Topic Modeling
An unsupervised technique that automatically discovers abstract themes (topics) in a collection of documents. Each document is represented as a mixture of topics.
TPU
Tensor Processing Unit — Google's custom-designed chip specifically optimized for machine learning workloads. TPUs are designed for matrix operations that are fundamental to neural network computation.
Training Data
The dataset used to teach a machine learning model. It contains examples (and often labels) that the model learns patterns from during the training process. The quality and quantity of training data directly impact model performance.
Training-Serving Skew
A discrepancy between how features are computed during model training versus how they are computed during production serving. This is one of the most common and hardest-to-detect causes of model failure.
Transfer Learning
A technique where a model trained on one task is repurposed as the starting point for a model on a different but related task. Instead of training from scratch, you leverage knowledge the model has already acquired.
Transformer
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process all positions of a sequence in parallel rather than one step at a time. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.
Transformer Architecture
The full stack of components that make up a transformer model: multi-head self-attention, feed-forward networks, layer normalization, residual connections, and positional encodings.
Transparency
The principle that AI systems should operate in a way that allows stakeholders to understand how they work, what data they use, and how decisions are made.
Tree of Thought
A prompting framework where the model explores multiple reasoning branches, evaluates intermediate states, and can backtrack from dead ends — like a deliberate tree search through thought space.
Trustworthy AI
AI systems that are reliable, fair, transparent, private, secure, and accountable. Trustworthy AI meets both technical standards and ethical requirements for safe deployment.
U
Underfitting
When a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and new data. The model has not learned enough from the training data.
Unstructured Data
Data without a predefined format or organization, such as text documents, images, videos, audio, and social media posts. Industry estimates commonly put unstructured data at 80% or more of all enterprise data.
Unsupervised Learning
A type of machine learning where the model learns patterns from unlabeled data without being told what the correct output should be. The algorithm discovers hidden structures, groupings, or patterns in the data on its own.
V
Validation Data
A subset of data used during training to tune hyperparameters and monitor model performance without touching the test set. It acts as an intermediate checkpoint between training and final evaluation.
Vanishing Gradient Problem
A training difficulty in deep networks where gradients become exponentially smaller as they are propagated back through many layers, making it nearly impossible for early layers to learn.
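The exponential shrinkage can be seen in a toy calculation. The sketch below assumes a chain of single-neuron sigmoid layers with a shared weight, which is a deliberate simplification; `backprop_scale` is a hypothetical helper, not a library function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def backprop_scale(pre_activations, weight=1.0):
    """Factor by which a gradient is scaled after passing back through every layer.

    Each layer contributes (weight * sigmoid'(z)) by the chain rule.
    """
    scale = 1.0
    for z in pre_activations:
        scale *= weight * sigmoid_grad(z)
    return scale


# Best case for the sigmoid (z = 0 in every layer): the factor is still 0.25 ** depth.
for depth in (5, 10, 20):
    print(depth, backprop_scale([0.0] * depth))
```

Even in this best case the gradient reaching the first layer of a 20-layer chain is scaled by `0.25 ** 20`, which is why early layers barely update.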
Variational Autoencoder
A generative model that learns a compressed, lower-dimensional representation (latent space) of input data and can generate new data by sampling from this learned space.
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
Vector Search
The process of finding the most similar vectors in a vector database to a given query vector. It enables retrieving semantically similar content at scale.
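A minimal brute-force sketch of the idea, using cosine similarity over a small in-memory dictionary. The vectors are hand-made toy values, not real embeddings, and `search` is a hypothetical helper. Production vector databases typically replace this linear scan with an approximate nearest-neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, vectors, top_n=2):
    """Return the keys of the top_n stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:top_n]]


# Toy 3-dimensional "embeddings" (made-up values for illustration).
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
print(search([1.0, 0.0, 0.0], store))  # prints ['cat', 'dog']
```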
Vision-Language Model
An AI model that can process both visual and textual inputs, understanding images and generating text about them. VLMs combine computer vision with language understanding.
Voice Cloning
AI technology that creates a synthetic replica of a specific person's voice from a small sample of their speech. Cloned voices can speak any text in the original person's vocal characteristics.
W
Watermarking
Embedding hidden, detectable signals in AI-generated content to identify its origin and authenticity. Watermarks help distinguish AI-generated content from human-created content.
Weight
A numerical parameter in a neural network that is learned during training. Weights determine the strength of connections between neurons and collectively encode the model's knowledge.
Weights and Biases
A popular MLOps platform for experiment tracking, model monitoring, dataset versioning, and collaboration in machine learning development.
Whisper
OpenAI's open-source automatic speech recognition model that can transcribe speech in many languages and translate it into English with high accuracy.
Word Embedding
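A technique that maps words to dense numerical vectors where semantic relationships are captured. Similar words have similar vectors, and relationships like analogy are encoded in vector arithmetic: in well-trained embeddings, the vector for "king" minus "man" plus "woman" lands close to the vector for "queen".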
Z
Zero-Shot Classification
Classifying text into categories that the model was never explicitly trained on, using only the category names or descriptions as guidance.
Zero-Shot Learning
A model's ability to perform a task it was never explicitly trained on or shown examples of. The model applies its general knowledge and reasoning to handle entirely new task types.
Zero-Shot Prompting
Giving an LLM a task instruction without any examples, relying entirely on the model's pre-trained knowledge and instruction-following ability to perform the task.
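As a small illustration, the sketch below builds a zero-shot prompt for a classification task: it contains an instruction and the candidate labels but no worked examples. The function name `build_zero_shot_prompt` and the prompt wording are assumptions for illustration; sending the prompt to a model is left out because it depends on the provider's API.

```python
def build_zero_shot_prompt(text, labels):
    """Return an instruction-only prompt: no demonstrations are included."""
    label_list = ", ".join(labels)
    return (
        "Classify the following text into exactly one of these categories: "
        f"{label_list}.\n"
        "Reply with the category name only.\n\n"
        f"Text: {text}\n"
        "Category:"
    )


prompt = build_zero_shot_prompt(
    "The battery died after two days.",
    ["complaint", "praise", "question"],
)
print(prompt)
```

A few-shot variant would insert labeled examples between the instruction and the final text; leaving them out is what makes the prompt zero-shot.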