Instruction Tuning
A fine-tuning approach where a model is trained on a dataset of instruction-response pairs, teaching it to follow human instructions accurately. This transforms a text-completion model into a helpful assistant.
Why It Matters
Instruction tuning is what transforms a raw language model into something like ChatGPT or Claude — a model that understands and follows user requests.
Example
Training a model on thousands of examples like "Summarize this article: [article]" → "[summary]", teaching it to follow the instruction-response format.
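A minimal sketch of what such training data can look like. The prompt template and field names here (instruction / input / output) are illustrative assumptions, not a fixed standard; real datasets use a variety of formats.

```python
# Hypothetical instruction-tuning examples; the field names and the
# "### Instruction / ### Response" template below are assumptions
# for illustration, not a required format.
EXAMPLES = [
    {"instruction": "Summarize this article:", "input": "[article]", "output": "[summary]"},
    {"instruction": "Translate to French:", "input": "Hello", "output": "Bonjour"},
]

def format_example(ex):
    """Concatenate instruction, input, and target response into one training string."""
    prompt = f"### Instruction:\n{ex['instruction']}\n{ex['input']}\n\n### Response:\n"
    return prompt + ex["output"]

for ex in EXAMPLES:
    print(format_example(ex))
```

During fine-tuning, the model is trained to predict the response portion of each formatted string, so it learns to continue any prompt in this shape with an appropriate answer.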
Think of it like...
Like training a new employee not just to know information but to respond appropriately to requests — 'When someone asks for X, do Y.'
Related Terms
Fine-Tuning
The process of taking a pre-trained model and further training it on a smaller, domain-specific dataset to specialize its behavior for a particular task or domain. Fine-tuning adjusts the model's weights to improve performance on the target task.
RLHF
Reinforcement Learning from Human Feedback — a technique used to align language models with human preferences. Human raters rank model outputs, and this feedback trains a reward model that guides further training.
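The reward model in RLHF is commonly trained with a pairwise ranking objective: given a human-preferred ("chosen") response and a dispreferred ("rejected") one, the loss is low when the reward model scores the chosen response higher. A minimal sketch of this Bradley-Terry-style loss, assuming scalar reward scores:

```python
import math

def reward_ranking_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise (Bradley-Terry) ranking loss for reward-model training.

    Shrinks toward 0 as the chosen response's score exceeds the
    rejected one's; equals log(2) when the two scores are tied.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# The wider the margin in favor of the chosen response, the smaller the loss.
print(reward_ranking_loss(2.0, 0.0))  # small loss: ranking is correct
print(reward_ranking_loss(0.0, 2.0))  # large loss: ranking is inverted
```

Minimizing this loss over many ranked pairs teaches the reward model to assign higher scores to outputs humans prefer, and that reward signal then guides further training of the language model.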
Alignment
The challenge of ensuring AI systems behave in ways that match human values, intentions, and expectations. Alignment aims to make AI helpful, honest, and harmless.