AI Glossary

The definitive dictionary for AI, Machine Learning, and Governance terminology. From Flash Attention to RAG — look up any term.

T

Temperature

A parameter that controls the randomness or creativity of an LLM's output. Lower temperatures (closer to 0) make outputs more deterministic and focused; higher temperatures increase randomness and creativity.

Artificial Intelligence
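The temperature mechanism above can be sketched in a few lines: logits are divided by the temperature before the softmax, so low temperatures sharpen the distribution and high temperatures flatten it. This is a minimal illustration, not any particular model's implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to a probability distribution, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)     # near-deterministic: top token dominates
hot = softmax_with_temperature(logits, 2.0)      # flatter distribution: more randomness
```

With temperature 0.2 the top token takes almost all of the probability mass; at 2.0 the three options are much closer together.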

Tensor

A multi-dimensional array of numbers — the fundamental data structure in deep learning. Scalars are 0D tensors, vectors are 1D, matrices are 2D, and higher-dimensional arrays are nD tensors.

Machine Learning
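The dimensionality hierarchy can be demonstrated with plain nested lists; the helper below is an illustrative stand-in for the `ndim` attribute that tensor libraries provide.

```python
def ndim(x):
    """Number of dimensions of a nested-list 'tensor'."""
    d = 0
    while isinstance(x, list):
        d += 1
        x = x[0]
    return d

scalar = 3.14                        # 0D: a single number
vector = [1.0, 2.0, 3.0]             # 1D: shape (3,)
matrix = [[1, 2, 3], [4, 5, 6]]      # 2D: shape (2, 3)
tensor3 = [[[0] * 4] * 3] * 2        # 3D: shape (2, 3, 4)
```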

Test Data

A separate portion of data held back from training that is used to evaluate a model's performance on unseen examples. Test data provides an unbiased estimate of how well the model will perform in the real world.

Data Science

Test-Time Compute

Allocating additional computation during inference (not training) to improve output quality. Techniques include chain-of-thought, self-consistency, and iterative refinement.

Artificial Intelligence

Text Classification

The NLP task of assigning predefined categories or labels to text documents. It is one of the most common and commercially important NLP applications.

Artificial Intelligence

Text Mining

The process of deriving meaningful patterns, trends, and insights from large collections of text data using NLP and statistical techniques.

Artificial Intelligence

Text-to-Image

AI models that generate visual images from natural language text descriptions (prompts). This technology converts written descriptions into original images, illustrations, or photorealistic visuals.

Artificial Intelligence

Text-to-Speech

AI technology that converts written text into natural-sounding human speech. Modern TTS systems can generate speech with realistic intonation and emotion, and can even clone specific voices.


Artificial Intelligence

TF-IDF

Term Frequency-Inverse Document Frequency — a statistical measure that evaluates how important a word is to a document within a collection. Words frequent in one document but rare across documents score high.

Machine Learning
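The definition above can be computed directly. This sketch uses raw term frequency and the plain logarithmic inverse document frequency; real libraries offer smoothed and normalized variants.

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF: term frequency in a document times log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)      # documents containing the term
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "slept"],
]
# "the" appears in every document, so its idf (and score) is zero;
# "dog" appears in only one document, so it scores higher.
```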

Throughput

The number of requests or predictions a model can process in a given time period. High throughput means the system can serve many users simultaneously.

Artificial Intelligence

Token

The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. Text is broken into tokens before being fed into an LLM, and the model generates output one token at a time.

Artificial Intelligence

Token Economy

The broader economic ecosystem around AI tokens, including pricing models, cost-optimization strategies, and the financial dynamics of building AI-powered products.

General

Token Limit

The maximum number of tokens a model can process in a single request, including both the input prompt and the generated output. Exceeding the limit results in truncated input or errors.

Artificial Intelligence
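The budget arithmetic implied by the definition is simple: whatever the prompt consumes is no longer available for the response. The 8,192-token limit below is an illustrative figure, not any specific model's limit, and actual token counts always depend on the tokenizer.

```python
def output_budget(prompt_tokens, token_limit):
    """Tokens left for generation after the prompt is counted against the limit."""
    return max(token_limit - prompt_tokens, 0)

# A 6,000-token prompt against a hypothetical 8,192-token limit
# leaves at most 2,192 tokens for the model's response.
remaining = output_budget(6000, 8192)
```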

Tokenization

The process of breaking text into smaller units (tokens) for processing by NLP models. Tokenization can split text into words, subwords, or characters depending on the method used.

Artificial Intelligence
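Two of the splitting granularities mentioned above can be sketched with standard-library tools; production tokenizers (BPE, WordPiece, and the like) learn subword splits rather than using fixed rules like these.

```python
import re

def word_tokenize(text):
    """Word-level: split on word boundaries, keeping punctuation as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokenize(text):
    """Character-level: every non-space character becomes a token."""
    return [c for c in text if not c.isspace()]

text = "Tokenizers split text."
word_tokenize(text)   # ['Tokenizers', 'split', 'text', '.']
```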

Tokenization Strategy

The approach and rules for how text is split into tokens. Different strategies (word-level, subword, character-level) make different tradeoffs between vocabulary size and sequence length.

Artificial Intelligence

Tokenizer

A component that converts raw text into tokens (numerical representations) that a language model can process. Different tokenizers split text differently, affecting model performance and efficiency.

Artificial Intelligence

Tokenizer Efficiency

How effectively a tokenizer represents text — measured by the average number of tokens needed to represent a given amount of text. More efficient tokenizers produce fewer tokens for the same content.

Artificial Intelligence
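One simple way to quantify the efficiency described above is characters covered per token: the higher the number, the fewer tokens the same text costs. The subword split shown is a hypothetical example, not the output of any real tokenizer.

```python
def chars_per_token(text, tokens):
    """Efficiency metric: average characters of text covered by each token."""
    return len(text) / len(tokens)

text = "internationalization"
coarse = ["international", "ization"]   # hypothetical subword split: 2 tokens
fine = list(text)                       # character-level fallback: 20 tokens

chars_per_token(text, coarse)   # 10.0 -> more efficient
chars_per_token(text, fine)     # 1.0  -> less efficient
```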

Tokenizer Training

The process of building a tokenizer's vocabulary from a corpus of text. The tokenizer learns which subword units to use based on frequency patterns in the training corpus.

Artificial Intelligence

Tokenizer Vocabulary

The complete set of tokens (words, subwords, characters) that a tokenizer can recognize and map to numerical IDs. Vocabulary size affects model efficiency and multilingual capability.

Artificial Intelligence

Tokenomics

The economic framework around token-based pricing for AI API services, including cost per token, input versus output pricing, and optimization strategies.

General

Tokenomics of AI

The economics of token-based pricing in AI APIs, including cost per input/output token, strategies for cost optimization, and the financial implications of different model choices.

General

Tool Use

The ability of an AI model to interact with external tools, APIs, and systems to accomplish tasks beyond text generation. Tools extend the model's capabilities to include search, calculation, code execution, and more.

Artificial Intelligence

Top-k Sampling

A text generation method where the model only considers the k most likely next tokens at each step, ignoring all others. This limits the pool of candidates to the most probable options.

Artificial Intelligence
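The filtering step described above can be sketched over a toy distribution: keep the k most probable tokens, drop the rest, and renormalize before sampling. This is a minimal illustration of the candidate-selection step only.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}

probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
top_k_filter(probs, 2)   # {'the': 0.625, 'a': 0.375}
```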

Top-p Sampling

A text generation method (also called nucleus sampling) where the model considers only the smallest set of tokens whose cumulative probability reaches the threshold p. This balances diversity and quality.

Artificial Intelligence
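The nucleus-selection step can be sketched the same way as top-k: rank tokens by probability, accumulate until the threshold p is covered, and renormalize the kept set. Unlike top-k, the number of candidates adapts to how peaked the distribution is.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cum += prob
        if cum >= p:                     # stop once the threshold is covered
            break
    total = sum(pr for _, pr in kept)
    return {tok: pr / total for tok, pr in kept}

probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
top_p_filter(probs, 0.9)   # keeps 'the', 'a', 'cat' (cumulative 0.95 >= 0.9)
```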

Topic Modeling

An unsupervised technique that automatically discovers abstract themes (topics) in a collection of documents. Each document is represented as a mixture of topics.

Machine Learning

TPU

Tensor Processing Unit — Google's custom-designed chip specifically optimized for machine learning workloads. TPUs are designed for matrix operations that are fundamental to neural network computation.

Artificial Intelligence

Training Data

The dataset used to teach a machine learning model. It contains examples (and often labels) that the model learns patterns from during the training process. The quality and quantity of training data directly impact model performance.

Data Science

Training-Serving Skew

A discrepancy between how features are computed during model training versus how they are computed during production serving. This is one of the most common and hardest-to-detect causes of model failure.

Machine Learning

Transfer Learning

A technique where a model trained on one task is repurposed as the starting point for a model on a different but related task. Instead of training from scratch, you leverage knowledge the model has already acquired.

Machine Learning

Transformer

A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.

Artificial Intelligence

Transformer Architecture

The full stack of components that make up a transformer model: multi-head self-attention, feed-forward networks, layer normalization, residual connections, and positional encodings.

Artificial Intelligence

Transparency

The principle that AI systems should operate in a way that allows stakeholders to understand how they work, what data they use, and how decisions are made.

AI Governance

Tree of Thought

A prompting framework where the model explores multiple reasoning branches, evaluates intermediate states, and can backtrack from dead ends — like a deliberate tree search through thought space.

Artificial Intelligence

Trustworthy AI

AI systems that are reliable, fair, transparent, private, secure, and accountable. Trustworthy AI meets both technical standards and ethical requirements for safe deployment.

AI Governance