Retrieval-Augmented Generation
A technique that improves the output of a large language model (LLM) by first retrieving relevant information from external knowledge sources and then supplying that information as context for generation. RAG combines the precision of search with the fluency of language models.
Why It Matters
RAG mitigates the knowledge cutoff problem, reduces hallucinations, and lets organizations use LLMs with their proprietary data without fine-tuning. It is one of the most widely adopted enterprise AI patterns.
Example
A customer support chatbot that searches your company's knowledge base for relevant articles before generating an answer, ensuring responses are grounded in actual documentation.
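The retrieve-then-generate flow in this example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the `KNOWLEDGE_BASE` contents, the word-overlap `retrieve` function (a stand-in for real embedding search), and the `llm` callable are all hypothetical names introduced here.

```python
import re

# Hypothetical knowledge base; in practice these would be real support articles.
KNOWLEDGE_BASE = {
    "reset-password": "To reset your password, open Settings and choose Reset Password.",
    "refund-policy": "Refunds are available within 30 days of purchase.",
    "shipping-times": "Standard shipping takes 3 to 5 business days.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, k=1):
    """Rank articles by word overlap with the query.

    Assumption: word overlap is only a stand-in for embedding-based search.
    """
    query_words = tokenize(query)
    ranked = sorted(
        KNOWLEDGE_BASE.items(),
        key=lambda item: len(query_words & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, articles):
    """Place the retrieved articles ahead of the question as context."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in articles)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


def answer(query, llm):
    """Retrieve first, then generate. `llm` is any callable from prompt to text."""
    return llm(build_prompt(query, retrieve(query)))
```

Passing a stub such as `lambda prompt: prompt` as `llm` makes it easy to inspect exactly what context the model would receive before wiring in a real model.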
Think of it like...
A student who is allowed to use their textbook during an exam: they combine their own understanding with referenced material to give more accurate, well-supported answers.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
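One common way to measure how close two embeddings are is cosine similarity. The sketch below uses hand-picked three-dimensional toy vectors, which are assumptions for illustration only; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not produced by any real model.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.05, 0.95],
}

related = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["kitten"])
unrelated = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["invoice"])
```

With these toy values, `related` comes out much higher than `unrelated`, which is the property that lets a system treat "cat" and "kitten" as semantically close.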
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
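The core operations of a vector database, adding vectors and searching for the nearest ones, can be illustrated with a tiny in-memory index. This is a brute-force sketch that compares the query against every stored vector; production systems typically rely on approximate nearest-neighbor indexes to scale to millions or billions of vectors. The class name `TinyVectorIndex` and its methods are hypothetical.

```python
import heapq
import math


class TinyVectorIndex:
    """Brute-force in-memory index; illustrative only, not a real database."""

    def __init__(self):
        self._items = []  # list of (item_id, unit-length vector)

    @staticmethod
    def _normalize(vector):
        """Scale a vector to unit length so a dot product equals cosine similarity."""
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    def add(self, item_id, vector):
        """Store a vector under the given identifier."""
        self._items.append((item_id, self._normalize(vector)))

    def search(self, query, k=2):
        """Return the k most similar items as (item_id, score), best first."""
        q = self._normalize(query)
        scored = (
            (sum(a * b for a, b in zip(q, vec)), item_id)
            for item_id, vec in self._items
        )
        return [(item_id, score) for score, item_id in heapq.nlargest(k, scored)]


index = TinyVectorIndex()
index.add("doc-a", [1.0, 0.0])
index.add("doc-b", [0.0, 1.0])
index.add("doc-c", [0.9, 0.1])
top = index.search([1.0, 0.0], k=2)
```

Here `top` lists `doc-a` first and `doc-c` second, since both point in nearly the same direction as the query while `doc-b` is orthogonal to it.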
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
Chunking
The process of breaking large documents into smaller pieces (chunks) before creating embeddings for use in RAG systems. Chunk size and strategy significantly impact retrieval quality.
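A simple chunking strategy is a fixed-size sliding window with overlap, so that a sentence cut at a boundary still appears intact in a neighboring chunk. The sketch below splits on whitespace and counts words; the function name and default sizes are assumptions, and real systems often chunk by tokens, sentences, or document structure instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_words(sample, chunk_size=4, overlap=1)
```

For the ten-word sample, `pieces` holds three chunks, and each chunk begins with the last word of the previous one. Larger chunks keep more context per retrieved passage, while smaller chunks make matches more precise.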
Grounding
The practice of connecting AI model outputs to verifiable sources of information, ensuring responses are based on factual data rather than the model's potentially unreliable internal knowledge.
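One lightweight way to support grounding is to number the sources in the prompt, ask the model to cite them, and then check that every citation in the answer points to a source that was actually provided. The sketch below assumes a `[n]` citation convention and hypothetical function names. It only validates citation markers and does not verify that the cited source supports the claim.

```python
import re


def build_grounded_prompt(question, sources):
    """Number each source so the model can cite it as [1], [2], and so on."""
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(sources, start=1)
    )
    return (
        "Answer using only the numbered sources and cite them like [1].\n"
        f"Sources:\n{numbered}\n"
        f"Question: {question}"
    )


def citations_are_valid(answer, num_sources):
    """Return True if the answer cites at least one source and every
    cited number refers to a source that was provided."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return bool(cited) and all(1 <= n <= num_sources for n in cited)
```

An answer with no citations, or one that cites `[3]` when only two sources were supplied, fails the check and can be flagged for review or regeneration.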
Hallucination
When an AI model generates information that sounds plausible and confident but is factually incorrect, fabricated, or not grounded in its training data or provided context. The model essentially 'makes things up'.