Softmax
A function that converts a vector of raw scores (logits) into a probability distribution by exponentiating each value and dividing by the sum of the exponentials, so every output is between 0 and 1 and all outputs sum to 1. It is typically applied to the final layer of multi-class classification models.
Why It Matters
Softmax gives you not just a prediction but a confidence level for each possible class, enabling better decision-making in applications like medical diagnosis.
Example
A model outputs raw scores [2.0, 1.0, 0.1] for three classes. Softmax converts these to probabilities of roughly [0.66, 0.24, 0.10], showing 66% confidence in the first class.
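The example above can be reproduced with a short sketch using only the standard library. Subtracting the maximum score before exponentiating is a standard numerical-stability trick and does not change the result:

```python
import math

def softmax(scores):
    # Subtract the max score before exponentiating for numerical
    # stability; this shifts every exponent but leaves the ratios,
    # and therefore the probabilities, unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 2) for p in probs])  # → [0.66, 0.24, 0.1]
```

In practice you would use a vectorized implementation (e.g. from a numerics or deep-learning library) rather than this loop, but the arithmetic is the same.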
Think of it like...
Like converting test scores into percentages of the total — it shows each option's relative likelihood compared to all others.
Related Terms
Activation Function
A mathematical function applied to the output of each neuron in a neural network that introduces non-linearity. Without activation functions, a neural network would just be a series of linear transformations.
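The claim that a network without activation functions collapses into a single linear transformation can be checked directly. This is an illustrative sketch with made-up scalar weights, using ReLU as the example non-linearity:

```python
def relu(x):
    # ReLU, a common activation: zero for negative inputs, identity otherwise
    return max(0.0, x)

# Two stacked linear layers with NO activation between them...
w1, b1, w2, b2 = 2.0, 1.0, 3.0, -0.5

def two_linear_layers(x):
    return w2 * (w1 * x + b1) + b2

# ...are equivalent to one linear layer with combined weights:
def one_linear_layer(x):
    return (w2 * w1) * x + (w2 * b1 + b2)

# Inserting relu() between the layers breaks this equivalence,
# which is what lets deeper networks model non-linear functions.
```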
Classification
A type of supervised learning task where the model predicts which category or class an input belongs to. The output is a discrete label rather than a continuous value.
Cross-Entropy
A loss function commonly used in classification tasks that measures the difference between the predicted probability distribution and the actual distribution. Lower cross-entropy means better predictions.
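A minimal sketch of the idea, with made-up predicted distributions: when the true label is one-hot, cross-entropy reduces to the negative log of the probability assigned to the correct class, so confident correct predictions score lower (better) than uncertain ones:

```python
import math

def cross_entropy(true_dist, pred_dist):
    # -sum(p * log(q)) over classes; a tiny epsilon guards against log(0)
    eps = 1e-12
    return -sum(p * math.log(q + eps) for p, q in zip(true_dist, pred_dist))

# True class is the first class (one-hot encoding)
confident = cross_entropy([1, 0, 0], [0.9, 0.08, 0.02])  # ~0.105
uncertain = cross_entropy([1, 0, 0], [0.4, 0.30, 0.30])  # ~0.916
```

Lower loss for the confident prediction matches the definition above: better predictions give lower cross-entropy.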
Logistic Regression
A classification algorithm that uses the sigmoid function to predict the probability of a binary outcome. Despite its name containing 'regression,' it is used for classification tasks.
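The sigmoid function mentioned above is the two-class counterpart of softmax: it squashes a single score into a probability in (0, 1). A minimal sketch:

```python
import math

def sigmoid(z):
    # Maps any real-valued score to a probability strictly between 0 and 1
    return 1.0 / (1.0 + math.exp(-z))

# A score of 0 means "no evidence either way": probability 0.5.
# Large positive scores approach 1; large negative scores approach 0.
print(sigmoid(0.0))  # → 0.5
```

Applying softmax to two scores [z, 0] gives the same probability for the first class as sigmoid(z), which is why logistic regression can be seen as a two-class special case.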