Overview
LightGBM (Light Gradient-Boosting Machine) is a high-performance, distributed gradient-boosting framework created by Microsoft.
It employs a histogram-based decision-tree algorithm with leaf-wise growth: histogram binning cuts training time and memory usage compared with exact split finding, while leaf-wise growth typically reaches a lower loss than level-wise GBDT approaches for the same number of leaves (at the cost of a higher overfitting risk on small datasets).
Key features
- Speed & Efficiency – histogram binning, multi-threading and out-of-core learning substantially reduce training time and memory consumption.
- Advanced Algorithms – supports GBDT, GOSS, DART, Random Forest and LambdaRank.
- Scalability – built-in distributed training (socket- and MPI-based) and GPU acceleration.
- Categorical Support – handles categorical variables natively, no one-hot encoding required.
- Language Bindings – C++ core with first-class Python (scikit-learn API), R, C#, Julia and CLI interfaces.
- Techniques – Gradient-Based One-Side Sampling (GOSS) down-samples instances with small gradients, and Exclusive Feature Bundling (EFB) merges mutually exclusive sparse features; both speed up training with little loss of accuracy.
Typical use-cases
Widely adopted for click-through-rate prediction, recommendation systems, fraud detection, time-series forecasting and Kaggle competitions where fast iteration on large tabular data is critical.
Ecosystem & licensing
Hosted on GitHub under the MIT License, LightGBM is actively maintained, distributed via PyPI / CRAN / conda and documented on Read the Docs.
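For instance, the PyPI and conda-forge distributions can be installed in the usual way (package names as published by the project):

```shell
# From PyPI
pip install lightgbm

# Or from conda-forge
conda install -c conda-forge lightgbm
```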