Chunking
The process of breaking large documents into smaller pieces (chunks) before creating embeddings for use in RAG systems. Chunk size and strategy significantly impact retrieval quality.
Why It Matters
Chunking strategy directly determines RAG quality: chunks that are too large dilute retrieval with irrelevant content, while chunks that are too small strip away the context needed to interpret them. Getting the size and boundaries right is critical.
Example
Splitting a 100-page manual into overlapping 500-token chunks so that each chunk contains enough context to be useful when retrieved for answering questions.
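A minimal sketch of that strategy, assuming the document has already been split into tokens; the function name `chunk_tokens` and the parameter values are illustrative, not from any particular library:

```python
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into fixed-size chunks that overlap,
    so context near a boundary appears in both neighboring chunks."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk reached the end of the document
    return chunks

# Toy "manual" of 1200 pseudo-tokens
doc = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(doc, chunk_size=500, overlap=50)
```

The last 50 tokens of each chunk reappear at the start of the next one, which keeps sentences that straddle a boundary retrievable from either side.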
Think of it like...
Like cutting a pizza: too few large slices and each one is unwieldy; too many tiny pieces and the toppings lose their arrangement. The right size makes every piece easy to handle and still complete.
Related Terms
Retrieval-Augmented Generation
A technique that enhances LLM outputs by first retrieving relevant information from external knowledge sources and then using that information as context for generation. RAG combines the power of search with the fluency of language models.
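The retrieve-then-generate flow can be sketched end to end. This is a toy stand-in, assuming a keyword-overlap retriever in place of embedding search and a hypothetical prompt format; `retrieve` and `build_prompt` are illustrative names, not a specific library's API:

```python
def retrieve(query, corpus, k=2):
    """Rank documents by word overlap with the query (toy retriever;
    a real RAG system would rank by embedding similarity instead)."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, passages):
    """Assemble retrieved passages as context for the generator LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The warranty covers parts for two years.",
    "Reset the router by holding the button for ten seconds.",
    "Shipping takes three to five business days.",
]
query = "How do I reset the router?"
prompt = build_prompt(query, retrieve(query, corpus))
```

The resulting prompt grounds the model's answer in retrieved text, which is the core idea: search supplies the facts, the language model supplies the fluency.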
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
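The "similar items end up closer together" property can be demonstrated with cosine similarity. The 4-dimensional vectors below are made up for illustration (real embedding models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical
    direction, values near 0 mean the vectors are unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings: "cat" and "kitten" point in nearly the
# same direction; "spreadsheet" points somewhere else entirely.
cat = [0.9, 0.1, 0.0, 0.2]
kitten = [0.85, 0.15, 0.05, 0.25]
spreadsheet = [0.0, 0.1, 0.95, 0.1]

cosine_similarity(cat, kitten) > cosine_similarity(cat, spreadsheet)
```

This geometric closeness is what makes semantic search possible: a query embedding lands near the embeddings of relevant chunks.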
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
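At toy scale, the core operation of a vector database is just nearest-neighbor search. The in-memory class below is a hypothetical sketch that searches by brute force; production systems use approximate indexes (such as HNSW) to make this tractable over billions of vectors:

```python
import math

class TinyVectorStore:
    """Toy in-memory stand-in for a vector database: stores (id, vector)
    pairs and returns the ids nearest to a query vector."""

    def __init__(self):
        self.items = []  # list of (item_id, vector) pairs

    def add(self, item_id, vector):
        self.items.append((item_id, vector))

    def search(self, query, k=1):
        # Brute-force scan: rank every stored vector by Euclidean
        # distance to the query, then return the k closest ids.
        ranked = sorted(self.items, key=lambda iv: math.dist(query, iv[1]))
        return [item_id for item_id, _ in ranked[:k]]

store = TinyVectorStore()
store.add("doc-a", [0.1, 0.9])
store.add("doc-b", [0.8, 0.2])
store.search([0.82, 0.18], k=1)  # the nearest stored vector's id
```

In a RAG pipeline, the chunk embeddings from the steps above are what gets stored here, and `search` is what runs at query time.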