LightGBM
Light Gradient Boosting Machine — Microsoft's gradient boosting framework optimized for speed and memory efficiency. LightGBM uses histogram-based split finding and leaf-wise (best-first) tree growth for faster training.
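The core speed trick can be illustrated with a toy version of histogram-based split finding: instead of evaluating every raw feature value as a split candidate, bucket the feature into a small number of bins and only scan the bin boundaries. This is a minimal stdlib-only sketch of the idea, not LightGBM's actual implementation (which also uses gradient histograms, leaf-wise growth, and many other optimizations):

```python
def best_histogram_split(xs, ys, n_bins=4):
    """Toy histogram-based split: return the bin-boundary threshold
    that minimizes total squared error of the two resulting groups."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0
    # One pass over the data to fill per-bin statistics.
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    sqs = [0.0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int((x - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += y
        sqs[b] += y * y

    def sse(c, s, q):  # sum of squared errors around the group mean
        return q - s * s / c if c else 0.0

    best = None
    # Only n_bins - 1 candidate thresholds, regardless of dataset size.
    for cut in range(1, n_bins):
        lc, ls, lq = sum(counts[:cut]), sum(sums[:cut]), sum(sqs[:cut])
        rc, rs, rq = sum(counts[cut:]), sum(sums[cut:]), sum(sqs[cut:])
        if lc == 0 or rc == 0:
            continue
        err = sse(lc, ls, lq) + sse(rc, rs, rq)
        if best is None or err < best[0]:
            best = (err, lo + cut * width)
    return best[1]

threshold = best_histogram_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
# → 3.75, a bin boundary between the two clusters
```

The payoff is that the per-split scan cost depends on the number of bins (LightGBM's `max_bin`, 255 by default), not on the number of distinct feature values.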
Why It Matters
LightGBM typically trains much faster than XGBoost on large datasets while achieving comparable accuracy, making it a common choice for large-scale production systems.
Example
Processing a dataset with 100 million rows and 500 features in minutes rather than hours, enabling rapid iteration during model development.
Think of it like...
Like a sports car version of gradient boosting — same destination (accurate predictions), but gets there much faster.
Related Terms
XGBoost
Extreme Gradient Boosting — an optimized implementation of gradient boosting that is fast and accurate, and historically among the most successful algorithms in machine learning competitions on tabular data.
Gradient Boosting
An ensemble technique that builds models sequentially, where each new model focuses on correcting the errors made by previous models. It combines many weak learners into a single strong learner.
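The "correcting previous errors" loop can be shown with a toy stdlib-only sketch for squared loss: each round fits a one-split stump to the current residuals (the negative gradient of squared loss) and adds its shrunken output to the ensemble. This is illustrative, not how production libraries implement it:

```python
def fit_stump(xs, residuals):
    """Best single-threshold stump: predicts the residual mean on each side."""
    best = None
    for t in sorted(set(xs))[1:]:  # candidate thresholds
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm

def boost(xs, ys, rounds=20, lr=0.5):
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        # Residuals = negative gradient of squared loss at current predictions.
        residuals = [y - p for y, p in zip(ys, pred)]
        s = fit_stump(xs, residuals)
        stumps.append(s)
        pred = [p + lr * s(x) for x, p in zip(xs, pred)]
    return lambda x: sum(lr * s(x) for s in stumps)

model = boost([1, 2, 3, 4], [0.0, 0.0, 1.0, 1.0])
```

Each weak stump alone is a poor model, but the shrunken sum converges toward the targets: here `model(4)` approaches 1.0 as rounds accumulate.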
CatBoost
A gradient boosting library by Yandex that handles categorical features natively without requiring manual encoding. CatBoost also addresses prediction shift and target leakage.
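The leakage problem CatBoost targets can be sketched with ordered target encoding: if a category is encoded with the mean target computed over *all* rows, each row's own label leaks into its feature. The sketch below, in the spirit of (but much simpler than) CatBoost's approach, encodes each row using only the target values of earlier rows with the same category, smoothed toward a prior; the function name and parameters are illustrative, not CatBoost's API:

```python
def ordered_target_encode(categories, targets, prior=0.5, weight=1.0):
    """Encode each row with the running target mean of *preceding* rows
    sharing its category, smoothed toward `prior` - so a row's own
    label never leaks into its own feature value."""
    sums, counts = {}, {}
    encoded = []
    for c, t in zip(categories, targets):
        s, n = sums.get(c, 0.0), counts.get(c, 0)
        encoded.append((s + weight * prior) / (n + weight))
        sums[c] = s + t   # update statistics *after* encoding this row
        counts[c] = n + 1
    return encoded

codes = ordered_target_encode(["a", "a", "b"], [1, 0, 1])
# → [0.5, 0.75, 0.5]: the first "a" sees only the prior,
#   the second "a" sees the first "a"'s target plus the prior
```

CatBoost additionally averages over multiple random row orderings so that early rows are not systematically under-informed.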
Decision Tree
A supervised learning algorithm that makes predictions by learning a series of if-then-else decision rules from the data. It creates a tree-like structure where each internal node tests a feature and each leaf provides a prediction.
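A learned tree is literally a set of nested if-then-else rules. This hand-written toy (the classic play-tennis example; the feature values and rules are made up for illustration, not learned from data) shows what a small classification tree encodes:

```python
def predict_play_tennis(outlook, humidity, wind):
    """A tiny decision tree written out as the if-then-else
    rules it encodes: each `if` is an internal node testing one
    feature, each `return` is a leaf giving a prediction."""
    if outlook == "sunny":
        return "no" if humidity == "high" else "yes"
    if outlook == "overcast":
        return "yes"
    # remaining branch: outlook == "rainy"
    return "no" if wind == "strong" else "yes"
```

Training a decision tree amounts to choosing, at each node, the feature test that best separates the labels, recursively, until the leaves are (nearly) pure.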
Ensemble Learning
A strategy that combines multiple models to produce better predictions than any single model alone. Ensemble methods leverage the diversity of different models to reduce errors.
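The simplest way to combine classifiers is a majority vote over their predictions, sketched here with the standard library; real ensembles (bagging, boosting, stacking) differ in how the member models are trained, but the combination step can be this simple:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine one predicted label per model into a single prediction
    by taking the most common label."""
    return Counter(predictions).most_common(1)[0][0]

# Three models disagree; the ensemble follows the majority.
combined = majority_vote(["cat", "cat", "dog"])
# → "cat"
```

For regression, the analogous step is averaging the member models' numeric predictions.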