Greedy Decoding
A simple text generation strategy where the model always selects the most probable next token at each step. It is fast, but because locally optimal choices do not guarantee the highest-probability overall sequence, it can produce repetitive or suboptimal outputs.
Why It Matters
Greedy decoding is the fastest generation method but often misses better overall sequences. Understanding its limitations explains why more sophisticated methods exist.
Example
The model always picks the highest-probability word: 'The' → 'cat' → 'is' → 'a' → 'cat' — notice how it can get stuck in repetitive loops.
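The loop above can be sketched in a few lines. This is a minimal illustration, not a real language model: the bigram table below is invented, and deliberately contains a cycle ('a' → 'cat' → 'is' → 'a') so the repetition shows up.

```python
# Toy next-token "model": maps each token to a probability distribution
# over possible next tokens. Invented for illustration only.
NEXT_TOKEN_PROBS = {
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"is": 0.7, "sat": 0.3},
    "is":  {"a": 0.5, "happy": 0.4, "here": 0.1},
    "a":   {"cat": 0.8, "pet": 0.2},
}

def greedy_decode(start, max_steps=6):
    """At each step, append the single most probable next token."""
    tokens = [start]
    for _ in range(max_steps):
        dist = NEXT_TOKEN_PROBS.get(tokens[-1])
        if dist is None:
            break  # no continuation known for this token
        tokens.append(max(dist, key=dist.get))
    return tokens

print(greedy_decode("The"))
# → ['The', 'cat', 'is', 'a', 'cat', 'is', 'a']  — stuck in the loop
```

Because greedy decoding never reconsiders a choice, once it enters the cycle it can never leave it.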
Think of it like...
Like always taking the highway at every junction because it looks fastest — sometimes a side road leads to a much better overall route.
Related Terms
Beam Search
A search algorithm used in text generation that explores multiple possible output sequences simultaneously, keeping the top-scoring candidates at each step. It often finds higher-probability overall sequences than greedy decoding, at the cost of extra computation.
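A minimal sketch of the idea, using the same kind of invented toy bigram table as above (the tokens and probabilities are made up for illustration). Here beam search keeps the top 2 partial sequences ranked by cumulative log-probability, and ends up preferring a sequence whose first step greedy decoding would have rejected.

```python
import math

# Invented toy next-token table. Note "dog" is less likely than "cat"
# at step one, but leads to a much more confident continuation.
NEXT_TOKEN_PROBS = {
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"is": 0.7, "sat": 0.3},
    "dog": {"ran": 0.9, "is": 0.1},
    "is":  {"happy": 0.6, "a": 0.4},
}

def beam_search(start, beam_width=2, max_steps=3):
    """Keep the `beam_width` best partial sequences at every step,
    scored by the sum of log-probabilities of their tokens."""
    beams = [([start], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_steps):
        candidates = []
        for tokens, score in beams:
            dist = NEXT_TOKEN_PROBS.get(tokens[-1])
            if dist is None:
                candidates.append((tokens, score))  # sequence ended
                continue
            for tok, p in dist.items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # prune: keep only the top-`beam_width` candidates
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

print(beam_search("The"))
# → ['The', 'dog', 'ran']
```

In this toy table, 'The dog ran' has higher total probability (0.4 × 0.9) than the greedy path through 'cat' (0.6 × 0.7 × 0.6), so beam search recovers the better sequence by keeping the second-best candidate alive at step one.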
Temperature
A parameter that controls the randomness or creativity of an LLM's output. Lower temperatures (closer to 0) make outputs more deterministic and focused; higher temperatures increase randomness and creativity.
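Mechanically, temperature divides the model's logits before the softmax. A minimal sketch (the logit values here are arbitrary examples):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then apply softmax.
    Low temperature sharpens the distribution toward the top token;
    high temperature flattens it toward uniform."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 0.5))  # sharper: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flatter: more random sampling
```

As temperature approaches 0, sampling from this distribution converges to greedy decoding, which is why low temperatures make outputs more deterministic.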