K-Means
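A clustering algorithm that partitions data into K groups by alternating two steps: assigning each data point to the nearest cluster center (centroid), then recomputing each center as the mean of the points assigned to it. The steps repeat until the assignments stop changing. K must be specified in advance.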
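The assign-then-recompute loop can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a production implementation. The function name `kmeans`, its parameters (`max_iters`, `seed`), the random choice of starting centers, and the sample points are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Basic K-means on a list of equal-length numeric tuples.

    Returns (centers, labels). Hypothetical helper for illustration;
    the name and defaults are not taken from any library.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct input points chosen at random.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None

    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels

        # Update step: move each center to the mean of its points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return centers, labels


# Made-up 2-D points, used only to show how the function is called.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

Because the starting centers are random, different seeds can give different groupings. Libraries typically run the algorithm several times or use a more careful initialization for that reason.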
Why It Matters
K-means is one of the most widely used clustering algorithms because it is simple and computationally efficient. It is a common first step for customer segmentation and exploratory data analysis.
Example
Grouping 10,000 retail customers into 5 segments based on purchase frequency, average order value, and product category preferences.
Think of it like...
Like a teacher dividing students into groups: the teacher picks group centers, assigns each student to the nearest one, then adjusts the centers based on who ended up where, repeating until the groups are stable.
Related Terms
Clustering
An unsupervised learning technique that groups similar data points together based on their characteristics, without predefined labels. The algorithm discovers natural groupings in the data.
Unsupervised Learning
A type of machine learning where the model learns patterns from unlabeled data without being told what the correct output should be. The algorithm discovers hidden structures, groupings, or patterns in the data on its own.