Inference
The process of using a trained model to make predictions on new, previously unseen data. Inference is what happens when an AI model is deployed and actively serving results to users.
Why It Matters
Inference speed and cost determine the viability of AI applications in production. A model that is accurate but too slow or expensive to run is impractical.
Example
When you type a query into ChatGPT and receive a response, the model is performing inference — applying its learned knowledge to your specific input.
Think of it like...
Like the difference between studying for a test (training) and taking the test (inference) — you use what you learned to answer new questions.
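The training/inference split above can be sketched with a tiny model. This is a minimal illustration, not a real model: the weights and bias are hypothetical values standing in for parameters learned during training, and inference is simply applying them to a new input.

```python
def predict(features, weights, bias):
    """Apply already-learned parameters to one new input (inference)."""
    return sum(f * w for f, w in zip(features, weights)) + bias

# Parameters "learned" during training (illustrative values only).
weights = [0.4, -0.2, 0.1]
bias = 0.5

# Inference: the model has never seen this input before.
score = predict([1.0, 2.0, 3.0], weights, bias)
print(round(score, 2))
```

No parameters change here; that is the defining property of inference as opposed to training.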
Related Terms
Latency
The time delay between sending a request to an AI model and receiving the response. In ML systems, latency includes data preprocessing, model inference, and network transmission time.
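Latency is typically measured by timing a single request end to end. A minimal sketch, with `model_inference` as a hypothetical stand-in for a real model call (the `time.sleep` simulates compute time):

```python
import time

def model_inference(x):
    # Stand-in for a real model call; sleep simulates inference work.
    time.sleep(0.01)
    return x * 2

start = time.perf_counter()
result = model_inference(21)
latency_ms = (time.perf_counter() - start) * 1000
print(f"latency: {latency_ms:.1f} ms")
```

In a real system you would measure from the client side, so that preprocessing and network time are included, and report percentiles (p50, p99) rather than a single number.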
Throughput
The number of requests or predictions a model can process in a given time period. High throughput means the system can serve many users simultaneously.
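Throughput is the inverse view of the same measurement: requests completed divided by wall-clock time. A sketch, again with a hypothetical stand-in model:

```python
import time

def model_inference(x):
    return x * 2  # stand-in; a real model call would take far longer

n_requests = 1000
start = time.perf_counter()
results = [model_inference(i) for i in range(n_requests)]
elapsed = time.perf_counter() - start

throughput = n_requests / elapsed
print(f"{throughput:.0f} requests/sec")
```

Note that latency and throughput can trade off against each other: batching requests raises throughput but adds queueing delay to each individual request.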
Model Serving
The infrastructure and process of deploying trained ML models to production, where they can receive requests and return predictions in real time. This includes autoscaling, load balancing, and model version management.
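At its core, model serving wraps a model behind a network endpoint that accepts requests and returns predictions. A minimal sketch using only the Python standard library; the `/predict` route, the JSON schema, and the averaging "model" are all hypothetical, and a production system would use a dedicated serving framework instead:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

def predict(features):
    # Hypothetical stand-in for a trained model: just averages the input.
    return sum(features) / len(features)

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        payload = json.dumps({"prediction": predict(body["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # silence per-request logging for the demo

# Port 0 asks the OS for any free port; serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Act as a client: send one prediction request to the running server.
req = Request(
    f"http://127.0.0.1:{server.server_port}/predict",
    data=json.dumps({"features": [1.0, 2.0, 3.0]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    result = json.loads(resp.read())
print(result)
server.shutdown()
```

The scaling, load balancing, and versioning mentioned above sit around this core loop: many such workers behind a balancer, each pinned to a specific model version.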
Edge Inference
Running AI models directly on local devices (phones, IoT sensors, cameras) rather than sending data to the cloud. This reduces latency, preserves privacy, and works without internet connectivity.
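Because edge devices have limited memory and compute, models are often quantized before deployment; int8 quantization is one common approach, though the source entry does not prescribe a specific technique. A hedged sketch: float weights are mapped to int8 values, and inference then runs entirely in-process, with no network call, as in edge inference.

```python
def quantize(weights, scale=127.0):
    """Map float weights in [-1, 1] to int8 values (illustrative scheme)."""
    return [round(w * scale) for w in weights]

def quantized_predict(features, q_weights, scale=127.0):
    # Inference happens locally on the device -- no cloud round trip.
    return sum(f * (q / scale) for f, q in zip(features, q_weights))

weights = [0.5, -0.25]          # hypothetical trained weights
q = quantize(weights)           # compact int8 representation
score = quantized_predict([2.0, 1.0], q)
print(round(score, 3))
```

The exact float result would be 0.75; the small deviation in the quantized output is the accuracy cost traded for the memory savings that make on-device inference feasible.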