Embedding Model
A specialized model designed to convert text, images, or other data into vector embeddings. Embedding models are optimized for producing meaningful numerical representations rather than generating text.
Why It Matters
The choice of embedding model largely determines retrieval quality in RAG, search relevance, and recommendation accuracy. If the embedding model surfaces the wrong passages, even a strong LLM cannot compensate, because it only sees what was retrieved.
Example
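OpenAI's text-embedding-3-large, Cohere's embed-v3, and open-source models like BGE and E5 all convert text passages into fixed-length vectors. The vector length depends on the model: 3072 dimensions by default for text-embedding-3-large, and 1024 for Cohere's embed-english-v3.0 and for the large BGE and E5 variants.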
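To make the input/output contract concrete, here is a minimal sketch in Python. It is not a real embedding model: build_vocab and toy_embed are hypothetical helpers that count words, whereas the models named above are learned neural networks that capture meaning beyond word overlap. The sketch only shows the shape of the task: text goes in, a fixed-length vector comes out, and similar texts score higher under cosine similarity.

```python
import math


def build_vocab(texts):
    """Assign each distinct lowercase token a fixed position (hypothetical helper)."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def toy_embed(text, vocab):
    """Toy stand-in for an embedding model: unit-length bag-of-words vector."""
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity; the inputs are already unit length, so this is a dot product."""
    return sum(x * y for x, y in zip(a, b))


texts = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "quarterly revenue grew sharply",
]
vocab = build_vocab(texts)
a, b, c = (toy_embed(t, vocab) for t in texts)

print(len(a))                  # 10: one dimension per distinct token
print(round(cosine(a, b), 3))  # 0.875: the two cat sentences overlap heavily
print(cosine(a, c))            # 0.0: no shared tokens
```

A learned model would also place "the cat sat on the mat" near a paraphrase that shares no words with it, which this word-counting stand-in cannot do.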
Think of it like...
Like a translator who converts any document into a universal language of numbers: the better the translator, the more meaning is preserved in the conversion.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
Retrieval-Augmented Generation
A technique that enhances LLM outputs by first retrieving relevant information from external knowledge sources and then using that information as context for generation. RAG combines the power of search with the fluency of language models.
Sentence Transformers
A Python framework for computing dense vector representations (embeddings) of sentences and paragraphs. It is built on top of transformer models and optimized for semantic similarity tasks.
Bi-Encoder
A model that independently encodes two texts into separate vectors, then compares them using a similarity metric like cosine similarity. Bi-encoders are fast at query time because document vectors can be pre-computed and stored, so only the query needs to be encoded before comparison.
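The bi-encoder pattern, and the brute-force version of what a vector database does, can be sketched in a few lines of Python. Everything here is illustrative: KEYWORDS, toy_encode, and TinyIndex are hypothetical names, and the keyword counter merely stands in for a learned encoder. The point is the data flow: documents are encoded once up front, and each query is encoded on its own and compared against the stored vectors.

```python
import math

# Assumption for illustration: a four-dimensional "embedding space" of keywords.
KEYWORDS = ("refund", "shipping", "password", "invoice")


def toy_encode(text):
    """Hypothetical encoder: unit-length keyword-count vector."""
    tokens = text.lower().split()
    vec = [float(tokens.count(word)) for word in KEYWORDS]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class TinyIndex:
    """Brute-force stand-in for a vector database."""

    def __init__(self, documents):
        self.documents = list(documents)
        # Document vectors are computed once, before any query arrives.
        self.vectors = [toy_encode(doc) for doc in self.documents]

    def search(self, query, k=2):
        # Only the query is encoded at search time.
        q = toy_encode(query)
        scored = [
            (sum(x * y for x, y in zip(q, vec)), doc)
            for vec, doc in zip(self.vectors, self.documents)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


docs = [
    "Reset your password from the account settings page",
    "Shipping takes three to five business days",
    "Request a refund within 30 days and we will email an invoice",
]
index = TinyIndex(docs)
score, best = index.search("how do I get a refund", k=1)[0]
print(round(score, 3), "|", best)
```

A production vector database replaces the linear scan in search with an approximate nearest-neighbor index, which is what makes similarity search over millions or billions of vectors practical.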