Cosine Similarity
A metric that measures the similarity between two vectors by calculating the cosine of the angle between them. Values range from -1 (pointing in opposite directions) to 1 (pointing in the same direction), with 0 meaning the vectors are orthogonal — their directions are unrelated.
Why It Matters
Cosine similarity is the standard metric for comparing embeddings in RAG pipelines, semantic search, and recommendation systems. It is the measure that decides which content counts as 'related'.
Example
Comparing embeddings: 'dog' and 'puppy' might have 0.92 similarity, 'dog' and 'cat' might be 0.75, while 'dog' and 'algebra' might be 0.15.
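A comparison like the one above can be sketched in a few lines of Python. The vectors below are made-up toy values, not real embeddings (real embeddings typically have hundreds or thousands of dimensions, and the similarity scores quoted above are illustrative):

```python
import math

def cosine_similarity(a, b):
    # Dot product measures how aligned the vectors are;
    # dividing by the norms removes the effect of their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings" for illustration only.
dog = [0.9, 0.8, 0.1]
puppy = [0.85, 0.82, 0.15]
algebra = [0.1, 0.05, 0.9]

print(cosine_similarity(dog, puppy))    # high: directions nearly match
print(cosine_similarity(dog, algebra))  # low: directions diverge
```

Note that only the direction of the vectors matters: scaling `dog` by any positive constant leaves both scores unchanged.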
Think of it like...
Like comparing the direction two arrows point, ignoring their length — arrows pointing the same way are similar, regardless of how long they are.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.