Artificial Intelligence

Instruction Hierarchy

A framework for prioritizing instructions from different sources when they conflict: system prompts typically override user prompts, which in turn override content from retrieved documents.

Why It Matters

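Instruction hierarchy helps defend against prompt injection attacks by establishing clear priority rules for whose instructions the model follows. It reduces the risk rather than eliminating it, since a model can still fail to apply the priorities correctly.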

Example

System prompt (highest priority) → developer instructions → user instructions → content from retrieved documents (lowest priority). Under this ordering, instructions injected into a retrieved document sit at the lowest level and should not be able to override safety rules set higher up.
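
The ordering above can be sketched as a small conflict resolver. This is only an illustration under stated assumptions: the names Source, Instruction, and resolve are hypothetical, as are the example settings. In practice a model typically learns these priorities during training rather than applying an explicit lookup like this one.

```python
from dataclasses import dataclass
from enum import IntEnum


class Source(IntEnum):
    """Hypothetical priority levels; a lower value means higher priority."""
    SYSTEM = 0
    DEVELOPER = 1
    USER = 2
    RETRIEVED_DOCUMENT = 3


@dataclass(frozen=True)
class Instruction:
    source: Source
    setting: str  # what the instruction tries to control (illustrative)
    value: str    # what it asks for


def resolve(instructions):
    """For each setting, keep the instruction from the highest-priority source."""
    winners = {}
    for inst in instructions:
        current = winners.get(inst.setting)
        if current is None or inst.source < current.source:
            winners[inst.setting] = inst
    return winners


instructions = [
    Instruction(Source.SYSTEM, "reveal_secrets", "never"),
    Instruction(Source.USER, "language", "French"),
    # Text injected into a retrieved document tries to override both settings.
    Instruction(Source.RETRIEVED_DOCUMENT, "reveal_secrets", "always"),
    Instruction(Source.RETRIEVED_DOCUMENT, "language", "German"),
]

resolved = resolve(instructions)
for setting, inst in resolved.items():
    print(f"{setting}: {inst.value} (from {inst.source.name})")
```

Here the injected document instructions lose both conflicts: the system rule on secrets and the user's language choice both stand, because each comes from a higher-priority source than the retrieved document.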

Think of it like...

Like a chain of command in the military — orders from higher ranks override conflicting orders from lower ranks, maintaining organizational control.
