Cross-Encoder
A model that takes two texts as input simultaneously and outputs a relevance or similarity score. Unlike bi-encoders, cross-encoders consider the full interaction between both texts.
Why It Matters
Cross-encoders produce much more accurate similarity scores than bi-encoders, but they are slower: every query-document pair must pass through the model together, so nothing can be pre-computed. They are ideal for reranking, where accuracy on a small candidate set matters most.
Example
A cross-encoder takes the query 'best restaurants in NYC' and a passage about dining in New York as a single joint input, and outputs a relevance score such as 0.94.
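The key property is the interface: the model sees the pair as one input and emits one score. A minimal sketch of that shape, using a toy word-overlap heuristic in place of a real transformer forward pass (the function name and scoring rule are illustrative assumptions, not any library's API):

```python
# Toy cross-encoder: scores a (query, passage) PAIR jointly.
# The word-overlap heuristic stands in for a real model's forward pass.

def cross_encode(query: str, passage: str) -> float:
    """Return a relevance score in [0, 1] for the pair as a whole."""
    q_tokens = set(query.lower().split())
    p_tokens = set(passage.lower().split())
    if not q_tokens:
        return 0.0
    # Fraction of query words that also appear in the passage.
    return len(q_tokens & p_tokens) / len(q_tokens)

score = cross_encode(
    "best restaurants in NYC",
    "A guide to the best restaurants in NYC and where to eat",
)
print(score)  # 1.0 -- every query word occurs in the passage
```

Note that, unlike a bi-encoder, there is no per-text vector to cache: the score only exists for a specific pair.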
Think of it like...
Like a teacher who reads both the question and the student's answer together before scoring, versus one who scores the answer without seeing the question.
Related Terms
Reranking
A second-stage ranking process that takes initial search results and reorders them using a more sophisticated model. Reranking improves precision by applying deeper analysis to a smaller candidate set.
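The two-stage pattern can be sketched as follows; both scoring functions are toy stand-ins (a cheap shared-word count for the first stage, a slightly richer coverage ratio for the second), chosen only to show the retrieve-then-rerank structure:

```python
# Two-stage retrieve-then-rerank sketch. fast_score mimics a cheap
# first-stage signal (e.g. a bi-encoder similarity); rerank_score
# mimics a slower, more accurate second-stage model.

def fast_score(query: str, doc: str) -> float:
    # Stage-1 signal: number of words shared with the query.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank_score(query: str, doc: str) -> float:
    # Stage-2 signal: fraction of query words the document covers.
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q) if q else 0.0

def search(query: str, corpus: list[str], shortlist: int = 3) -> list[str]:
    # Stage 1: rank the whole corpus with the cheap scorer.
    candidates = sorted(corpus, key=lambda d: fast_score(query, d), reverse=True)
    # Stage 2: apply the expensive scorer only to the small shortlist.
    top = candidates[:shortlist]
    return sorted(top, key=lambda d: rerank_score(query, d), reverse=True)
```

The point of the structure is cost control: the expensive model runs on `shortlist` documents, not on the whole corpus.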
Bi-Encoder
A model that independently encodes two texts into separate vectors, then compares them using a similarity metric like cosine similarity. Bi-encoders are fast because vectors can be pre-computed.
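A minimal sketch of the bi-encoder pattern, with a bag-of-words count over a tiny fixed vocabulary standing in for a learned encoder (the vocabulary and `encode` function are illustrative assumptions):

```python
import math

# Toy bi-encoder: each text is encoded INDEPENDENTLY into a vector,
# here a bag-of-words count over a small fixed vocabulary.
VOCAB = ["best", "restaurants", "dining", "nyc", "new", "york"]

def encode(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Document vectors can be computed once and stored; only the query
# needs encoding at search time.
doc_vec = encode("best dining in nyc")
query_vec = encode("best restaurants in nyc")
print(cosine(query_vec, doc_vec))
```

Because each vector depends on one text alone, millions of document vectors can be pre-computed and indexed, which is exactly the speed advantage the definition describes.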
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
Retrieval
The process of finding and extracting relevant information from a large collection of documents or data in response to a query. In AI systems, retrieval is often the first step before generation.
Sentence Transformers
A framework for computing dense vector representations (embeddings) for sentences and paragraphs. Built on top of transformer models and optimized for semantic similarity tasks.