Artificial Intelligence

Token Limit

The maximum number of tokens a model can process in a single request, including both the input prompt and the generated output. Exceeding the limit results in truncated input or errors.
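Because the prompt and the generated output share one budget, applications often check an estimated token count before sending a request. A minimal sketch, assuming a crude ~4-characters-per-token heuristic for English text and hypothetical limit values (real tokenizers such as tiktoken give exact counts):

```python
# Rough token-budget check before sending a request.
# The limit and reservation below are illustrative, not a specific model's.
TOKEN_LIMIT = 128_000       # hypothetical model's total token limit
MAX_OUTPUT_TOKENS = 4_000   # tokens reserved for the model's reply

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~1 token per 4 characters of English text."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str) -> bool:
    """True if the prompt plus the reserved output stays under the limit."""
    return estimate_tokens(prompt) + MAX_OUTPUT_TOKENS <= TOKEN_LIMIT

print(fits_in_context("Summarize this paragraph."))  # short prompt fits
```

Reserving output tokens up front matters: a prompt that exactly fills the window leaves no room for the model to respond.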

Why It Matters

Token limits constrain how much context an LLM can use at once, and therefore which tasks it can handle: summarizing long documents, sustaining multi-turn conversations, or analyzing large codebases all depend on fitting the relevant text into the window. Understanding and working within these limits is essential for building reliable LLM applications.

Example

GPT-4 Turbo with a 128K token limit can process roughly a 300-page book, while older models with 4K limits could only handle about 10 pages.

Think of it like...

Like the weight limit on an elevator — you can fit more people or cargo, but there is a hard maximum. Going over means something gets left behind.

Related Terms