Artificial Intelligence

Transformer

A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.
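The core of that parallel processing is scaled dot-product self-attention: every token's query is compared against every other token's key at once, and the resulting weights mix the value vectors. Below is a minimal NumPy sketch of a single attention head; the function name, weight matrices, and dimensions are illustrative, not taken from any particular library.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X (seq_len x d_model).

    Illustrative single-head sketch: queries, keys, and values are linear
    projections of the same input, so every token attends to every token
    in one matrix multiplication rather than step by step.
    """
    Q = X @ Wq  # queries
    K = X @ Wk  # keys
    V = X @ Wv  # values
    d_k = Q.shape[-1]
    # Similarity of every token to every other token, scaled to keep
    # softmax gradients stable (the sqrt(d_k) factor from the 2017 paper).
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of all value vectors.
    return weights @ V

# Toy usage with random data (seq_len=4, d_model=8).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one output vector per input token
```

Because the whole sequence is handled as one matrix product, the computation over tokens parallelizes on GPUs, which is what lets transformers train on far more data than sequential architectures like RNNs.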

Why It Matters

The transformer architecture revolutionized NLP and has since expanded to vision, audio, and multimodal AI. It is arguably the most influential architecture in modern AI.

Example

GPT-4, Claude, BERT, and virtually every other modern language model are built on the transformer architecture.

Think of it like...

Like a speed reader who can look at an entire page at once and understand how every word relates to every other word, instead of reading one word at a time left to right.

Related Terms