Bi-Encoder
A model that independently encodes two texts into separate vectors, then compares them using a similarity metric like cosine similarity. Bi-encoders are fast at search time because document vectors can be pre-computed and cached.
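The encode-then-compare step can be sketched with a toy encoder — a hashed bag-of-words stand-in for a real neural model; `encode` and `cosine_similarity` are illustrative names here, not library functions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def encode(text: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for a learned encoder: hash each word into a fixed-size
    bag-of-words vector. A real bi-encoder would be a neural network."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

# Each text is encoded INDEPENDENTLY; only then are the vectors compared.
score = cosine_similarity(encode("a fast red fox"), encode("a quick red fox"))
```

Because each side is encoded on its own, either vector can be computed ahead of time — which is exactly what makes the pre-computation trick below possible.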
Why It Matters
Bi-encoders power real-time semantic search — documents are encoded once and stored, and only the query needs encoding at search time.
Example
Pre-compute embeddings for 1 million documents and store them. When a query arrives, encode it once and find the most similar pre-computed document vectors in milliseconds.
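A minimal sketch of this pattern, scaled down to 100,000 random unit vectors standing in for pre-computed document embeddings (`search` and the corpus here are illustrative, not a real API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the one-time offline step: in practice these would be model
# embeddings, computed once per document and stored.
doc_vectors = rng.normal(size=(100_000, 64)).astype(np.float32)
doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)

def search(query_vec: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k most cosine-similar documents, best first."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_vectors @ q                      # one matrix-vector product
    top_k = np.argpartition(scores, -k)[-k:]      # unordered top-k
    return top_k[np.argsort(scores[top_k])[::-1]] # sort just those k

query_vec = rng.normal(size=64).astype(np.float32)
top = search(query_vec, k=5)
```

The query-time cost is a single matrix-vector product plus a partial sort, which is why latency stays in the millisecond range even for large corpora.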
Think of it like...
Like two people independently summarizing a movie and comparing their summaries, versus watching the movie together — independent encoding is faster but less nuanced.
Related Terms
Cross-Encoder
A model that takes two texts as input simultaneously and outputs a relevance or similarity score. Unlike bi-encoders, cross-encoders consider the full interaction between both texts.
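The interface difference can be sketched with toy scorers — simple word-overlap heuristics, not learned models:

```python
def encode(text: str) -> set:
    # Bi-encoder style: each text gets its own representation...
    return set(text.lower().split())

def bi_score(a: str, b: str) -> float:
    # ...and the two representations are compared afterwards (Jaccard here).
    ra, rb = encode(a), encode(b)
    return len(ra & rb) / len(ra | rb)

def cross_score(a: str, b: str) -> float:
    # Cross-encoder style: one function sees both texts together, so it can
    # condition on the pair — here, overlap relative to the first text's
    # length, an asymmetric score a symmetric vector comparison can't express.
    qa, qb = a.lower().split(), b.lower().split()
    return sum(w in qb for w in qa) / len(qa)
```

The trade-off: a cross-encoder's joint pass is more accurate but must be re-run for every candidate pair, so it cannot benefit from pre-computed document vectors.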
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
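A toy illustration of the keyword-vs-meaning gap, with hand-mapped synonym embeddings standing in for what a trained model learns from data (the vocabulary and corpus are invented for the example):

```python
import numpy as np

# Toy "semantic" embeddings: synonyms share a vector by construction.
TOY_EMBEDDINGS = {
    "car": np.array([1.0, 0.0]), "automobile": np.array([1.0, 0.0]),
    "repair": np.array([0.0, 1.0]), "fix": np.array([0.0, 1.0]),
}

def embed(text: str) -> np.ndarray:
    """Average the word vectors we know about."""
    vecs = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    return np.mean(vecs, axis=0)

docs = ["automobile fix", "car wash"]
query = "car repair"

# Keyword match: "car repair" shares no words with "automobile fix".
keyword_hits = [d for d in docs if set(query.split()) & set(d.split())]

# Semantic match: cosine similarity over embeddings still ranks it first.
q = embed(query)
scores = [float(embed(d) @ q / (np.linalg.norm(embed(d)) * np.linalg.norm(q)))
          for d in docs]
best = docs[int(np.argmax(scores))]
```

Keyword search surfaces only "car wash", while the embedding comparison ranks "automobile fix" first — conceptually related despite zero word overlap.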
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
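A minimal brute-force sketch of the core interface — exact cosine search over stored vectors. Real vector databases layer approximate indexes (e.g. HNSW or IVF) on top of this to scale past millions of vectors; the class and method names here are illustrative:

```python
import numpy as np

class BruteForceVectorStore:
    """Toy vector store: add unit-normalized vectors, search by cosine."""

    def __init__(self, dim: int):
        self.dim = dim
        self._vectors = np.empty((0, dim), dtype=np.float32)
        self._ids: list = []

    def add(self, item_id: str, vector: np.ndarray) -> None:
        # Normalize on insert so search is a plain dot product.
        v = vector.astype(np.float32) / np.linalg.norm(vector)
        self._vectors = np.vstack([self._vectors, v])
        self._ids.append(item_id)

    def search(self, query: np.ndarray, k: int = 3) -> list:
        """Return (id, score) pairs for the k nearest vectors, best first."""
        q = query.astype(np.float32) / np.linalg.norm(query)
        scores = self._vectors @ q
        order = np.argsort(scores)[::-1][:k]
        return [(self._ids[i], float(scores[i])) for i in order]
```

Normalizing at insert time is a common design choice: it turns cosine similarity into a dot product, so queries reduce to one matrix-vector multiply.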
Sentence Transformers
A framework for computing dense vector representations (embeddings) for sentences and paragraphs. Built on top of transformer models and optimized for semantic similarity tasks.