Top-k Sampling
A text generation method in which the model considers only the k most likely next tokens at each step, ignoring all others. The probabilities of those k candidates are renormalized, and the next token is sampled from this restricted pool.
Why It Matters
Top-k sampling prevents the model from selecting wildly improbable tokens while still allowing creative variation within the top candidates.
Example
With top-k = 50, the model keeps only the 50 most likely next tokens at each generation step and samples among them, no matter how much or how little total probability those 50 tokens actually carry.
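The mechanics can be shown in a minimal sketch. This assumes raw logit scores as a plain Python list (the function name `top_k_sample` and the toy vocabulary are illustrative, not from any particular library):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the k highest-scoring logits.

    `logits` is a hypothetical list of raw scores, one per vocabulary token.
    """
    # Keep only the indices of the k most likely candidates.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving logits (subtract the max for numerical stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]

# Toy vocabulary of 5 tokens; with k=2 only indices 1 and 3 can ever be drawn.
logits = [0.1, 2.0, -1.0, 3.0, 0.5]
samples = {top_k_sample(logits, k=2) for _ in range(200)}
print(samples)  # always a subset of {1, 3}
```

Note that tokens outside the top k are not merely unlikely; they have exactly zero chance of being selected, however many samples are drawn.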
Think of it like...
Like a hiring manager who only interviews the top 50 applicants — it is a simple cutoff that ensures quality while still allowing choice.
Related Terms
Temperature
A parameter that controls the randomness or creativity of an LLM's output. Lower temperatures (closer to 0) make outputs more deterministic and focused; higher temperatures increase randomness and creativity.
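Temperature works by dividing the logits before the softmax. A minimal sketch, assuming plain-float logits (the function name and example values are illustrative):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, scaled by temperature.

    Temperatures below 1 sharpen the distribution toward the top token;
    temperatures above 1 flatten it toward uniform.
    """
    scaled = [score / temperature for score in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.5)  # more peaked: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: more randomness
```

Temperature and top-k are often combined: temperature reshapes the distribution, while top-k truncates it.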
Greedy Decoding
A simple text generation strategy where the model always selects the most probable next token at each step. It is fast but can produce repetitive or suboptimal outputs.
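Greedy decoding is the k = 1 limiting case with no sampling at all. A minimal sketch over hypothetical per-step logits:

```python
def greedy_decode(step_logits):
    """Pick the argmax token at every step -- fully deterministic."""
    return [max(range(len(logits)), key=logits.__getitem__)
            for logits in step_logits]

# Three hypothetical generation steps; greedy always takes the single
# most likely token, so the same input always yields the same output.
steps = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.9], [0.0, 0.0, 3.0]]
print(greedy_decode(steps))  # [1, 0, 2]
```

Because every step commits to the locally best token, greedy decoding can miss sequences whose early tokens are individually less probable but jointly better.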