Embedding Dimension
The number of numerical values in a vector embedding. Higher dimensions can capture more nuanced relationships but require more storage and computation.
Why It Matters
Embedding dimension is a key architecture choice: too few dimensions lose information, while too many waste storage and compute. Common dimensions range from 384 to 3072.
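The resource side of the trade-off is simple arithmetic. A minimal sketch, assuming vectors are stored as 4-byte float32 values (the most common format; real systems add index overhead on top):

```python
# Raw memory footprint of one float32 embedding at common dimensions.
# Illustrative arithmetic only; actual storage varies with dtype and index overhead.
BYTES_PER_FLOAT32 = 4

def embedding_bytes(dim: int) -> int:
    """Bytes needed to store a single float32 vector of this dimension."""
    return dim * BYTES_PER_FLOAT32

for dim in (384, 768, 1536, 3072):
    print(f"{dim:>5} dims -> {embedding_bytes(dim):>6} bytes per vector")
```

Storage (and the cost of every similarity comparison) grows linearly with dimension, which is why doubling from 1536 to 3072 dimensions roughly doubles both.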
Example
OpenAI's text-embedding-3-small produces 1536-dimensional vectors, while text-embedding-3-large produces 3072-dimensional ones, using the extra dimensions to capture more nuanced semantic distinctions.
Think of it like...
Like the resolution of a photograph — more pixels capture finer detail, but the file gets bigger and takes longer to process.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
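"Closer together" is usually measured with cosine similarity. A toy sketch with made-up 4-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 = very similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented toy "embeddings" for illustration only.
cat = [0.9, 0.1, 0.0, 0.3]
kitten = [0.8, 0.2, 0.1, 0.3]
car = [0.0, 0.9, 0.8, 0.1]

print(cosine_similarity(cat, kitten))  # close to 1.0: semantically similar
print(cosine_similarity(cat, car))     # much lower: semantically distant
```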
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
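The core operation a vector database accelerates can be sketched as a brute-force scan; the document names and vectors below are invented for illustration, and production systems replace the full scan with approximate indexes such as HNSW so they never compare against every vector:

```python
import math

def top_k(query, vectors, k=2):
    """Brute-force nearest-neighbor search by cosine similarity.
    A vector database performs this same job, but uses indexes
    to avoid scanning every stored vector."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))
    ranked = sorted(vectors.items(), key=lambda kv: cos(query, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical document embeddings (toy 3-dimensional values).
docs = {
    "doc_cats": [0.9, 0.1, 0.2],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_cars": [0.1, 0.9, 0.8],
}
print(top_k([0.85, 0.2, 0.15], docs))  # the two pet documents rank first
```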
Embedding Model
A specialized model designed to convert text, images, or other data into vector embeddings. Embedding models are optimized for producing meaningful numerical representations rather than generating text.
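To make the input/output contract concrete, here is a toy stand-in for an embedding model: a hash-based featurizer, not a trained network. Real embedding models learn their representations, but the interface is the same, text in, fixed-size vector out:

```python
def toy_embed(text: str, dim: int = 8) -> list[float]:
    """A toy 'embedding model': hashes character bigrams into a fixed-length,
    normalized vector. Purely illustrative; hash() also varies between Python
    processes, so this is only stable within a single run."""
    vec = [0.0] * dim
    for i in range(len(text) - 1):
        vec[hash(text[i:i + 2]) % dim] += 1.0
    norm = sum(x * x for x in vec) ** 0.5 or 1.0  # avoid dividing by zero
    return [x / norm for x in vec]

# Output dimension is fixed regardless of input length.
print(len(toy_embed("embedding dimension")))
```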
Dimensionality Reduction
Techniques that reduce the number of features (dimensions) in a dataset while preserving the most important information. This makes data easier to visualize, speeds up training, and can improve model performance.
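One simple such technique is a random linear projection, which the Johnson-Lindenstrauss lemma shows approximately preserves pairwise distances. A minimal sketch (the dimensions and seed below are arbitrary; PCA, t-SNE, and UMAP are the more common choices in practice):

```python
import random

def random_projection(vectors, out_dim, seed=0):
    """Reduce dimensionality with one shared random Gaussian projection matrix.
    Distances between vectors are approximately preserved in the smaller space."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    scale = (1.0 / out_dim) ** 0.5
    return [
        [scale * sum(r * x for r, x in zip(row, v)) for row in matrix]
        for v in vectors
    ]

# Three arbitrary 64-dimensional vectors, reduced to 8 dimensions.
high_dim = [[random.Random(i).uniform(-1, 1) for _ in range(64)] for i in range(3)]
low_dim = random_projection(high_dim, out_dim=8)
print(len(low_dim[0]))  # 8 dimensions instead of 64
```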
Representation Learning
The process of automatically discovering useful features or representations from raw data, rather than manually engineering them. Deep learning excels at learning hierarchical representations.