Sentence Embedding
A vector representation of an entire sentence or paragraph that captures its overall meaning. Sentence embeddings enable comparing the meanings of text passages.
Why It Matters
Sentence embeddings are the building blocks of semantic search, retrieval-augmented generation (RAG), clustering, and text similarity: they convert meaning into mathematics.
Example
Embedding 'The weather is nice today' produces a 768-dimensional vector that lies close to the embedding of 'It is a beautiful day outside' but far from the embedding of 'The stock market crashed.'
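Closeness between embeddings is usually measured with cosine similarity. The sketch below uses hypothetical 4-dimensional vectors standing in for real 768-dimensional embeddings (the numbers are made up for illustration); the similarity function itself is the standard formula.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical tiny embeddings standing in for real 768-dim ones.
nice_weather  = [0.80, 0.10, 0.05, 0.05]  # "The weather is nice today"
beautiful_day = [0.75, 0.15, 0.05, 0.05]  # "It is a beautiful day outside"
market_crash  = [0.05, 0.05, 0.10, 0.80]  # "The stock market crashed."

print(cosine_similarity(nice_weather, beautiful_day))  # high (near 1.0)
print(cosine_similarity(nice_weather, market_crash))   # much lower
```

With real embeddings from a trained model, the same comparison would be run over full-length vectors, but the geometry is identical: paraphrases score near 1.0, unrelated sentences score much lower.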
Think of it like...
Like converting a sentence into GPS coordinates on a map of meaning — similar sentences are at nearby coordinates, unrelated ones are far apart.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
Sentence Transformers
A framework for computing dense vector representations (embeddings) for sentences and paragraphs. Built on top of transformer models and optimized for semantic similarity tasks.
Semantic Similarity
A measure of how similar in meaning two pieces of text are, regardless of the specific words used. Semantic similarity captures conceptual relatedness rather than lexical overlap.
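The contrast with lexical overlap can be made concrete. The sketch below computes Jaccard word overlap (shared words divided by total distinct words) on a made-up paraphrase pair: the two sentences mean nearly the same thing yet share no words, so a purely lexical measure scores them zero, which is exactly the gap semantic similarity closes.

```python
def jaccard(a: str, b: str) -> float:
    """Lexical overlap: shared words / total distinct words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

# Near-identical meaning, zero shared words:
paraphrase = jaccard("a canine was sprinting", "the dog ran fast")

# Different meaning, high word overlap:
related = jaccard("the dog ran fast", "the dog ran home")

print(paraphrase)  # 0.0
print(related)     # 0.6
```

A semantic similarity measure built on embeddings would rank the first pair as far more similar than this lexical score suggests.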
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
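At its core, a vector database answers "which stored vectors are most similar to this query vector?" The brute-force sketch below shows that core operation on a toy in-memory store with made-up document ids and vectors; real systems replace the linear scan with approximate indexes so the search stays fast at millions of vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "database": id -> embedding (hypothetical 3-dim vectors).
store = {
    "doc_weather": [0.9, 0.1, 0.0],
    "doc_finance": [0.1, 0.1, 0.9],
    "doc_travel":  [0.7, 0.3, 0.1],
}

def search(query_vec, k=2):
    """Return the k stored ids most similar to the query vector."""
    ranked = sorted(store, key=lambda doc_id: cosine(query_vec, store[doc_id]),
                    reverse=True)
    return ranked[:k]

print(search([0.8, 0.2, 0.05]))  # weather-like query ranks doc_weather first
```

The linear scan here is O(n) per query; dedicated vector databases trade a little accuracy for sublinear approximate search over the same similarity ranking.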
Bi-Encoder
A model that independently encodes two texts into separate vectors, then compares them using a similarity metric like cosine similarity. Bi-encoders are fast because vectors can be pre-computed.
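The bi-encoder pattern can be sketched without a neural network at all. Below, `encode` is a hypothetical stand-in (a bag-of-words counter over a tiny fixed vocabulary, not a transformer); the point is the structure: corpus texts are encoded once offline, and each query is encoded independently at search time, then compared by cosine similarity.

```python
import math

# Tiny made-up vocabulary for the stand-in encoder.
VOCAB = ["weather", "nice", "beautiful", "day", "stock", "market", "crashed"]

def encode(text):
    """Stand-in encoder: word counts over VOCAB.
    A real bi-encoder would run the text through a transformer."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Corpus vectors are computed ONCE and can be cached or indexed...
corpus = ["the weather is nice", "the stock market crashed"]
corpus_vecs = [encode(s) for s in corpus]

# ...then each query is encoded independently at search time.
query_vec = encode("what nice weather today")
scores = [cosine(query_vec, v) for v in corpus_vecs]
print(scores)  # first document scores higher
```

Because the two texts never see each other inside the model, every corpus vector can be pre-computed, which is what makes bi-encoders fast at query time.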