LightGBM Flashcards
A gradient boosting framework that uses tree-based learning algorithms
What is LightGBM?
A high-performance gradient boosting framework optimized for speed and accuracy, especially on large datasets.
Speed Advantage
Uses histogram-based algorithms and leaf-wise tree growth, making it significantly faster than XGBoost in many cases.
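To make the histogram idea concrete, here is a minimal pure-Python sketch of histogram-based split finding for a single feature. It is an illustration only, not LightGBM's actual implementation: the function names, the equal-width binning, and the simplified gain formula (unit Hessians, no regularization) are all assumptions made for the example.

```python
def build_histogram(values, gradients, n_bins):
    """Bucket one feature into equal-width bins and sum the gradients per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1
    return grad_sum, count


def best_split(grad_sum, count):
    """Scan bin boundaries; gain = G_L^2/n_L + G_R^2/n_R - G^2/n."""
    total_g, total_n = sum(grad_sum), sum(count)
    parent = total_g ** 2 / total_n
    best_bin, best_gain = None, 0.0
    left_g, left_n = 0.0, 0
    for b in range(len(grad_sum) - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_n = total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        right_g = total_g - left_g
        gain = left_g ** 2 / left_n + right_g ** 2 / right_n - parent
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


# Toy data: two clearly separated groups with opposite gradients.
values = [0.1, 0.2, 0.3, 0.4, 5.1, 5.2, 5.3, 5.4]
gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
grad_sum, count = build_histogram(values, gradients, n_bins=4)
split_bin, gain = best_split(grad_sum, count)
```

The scan visits a handful of bin boundaries rather than every distinct feature value, which is where the speed-up comes from.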
Dataset Compatibility
Best suited for large datasets with thousands of features. Handles both categorical and numerical features efficiently.
Use Cases
Fraud detection, click-through rate prediction, ranking tasks, and any tabular supervised learning problem.
Core Features
Supports parallel learning, efficient memory usage, GPU training, and early stopping.
Installation
Install with pip install lightgbm. Advanced builds from source require CMake and a C/C++ compiler.
Tuning Parameters
Key parameters: num_leaves, learning_rate, n_estimators, max_depth.
Overfitting Control
Use max_depth, min_data_in_leaf, feature_fraction, and bagging_fraction to reduce overfitting. Note that bagging_fraction only takes effect when bagging_freq is set to a value above 0.
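File Format
A trained model can be saved to a text file with the booster.save_model() method and reloaded by passing the file path as model_file to lightgbm.Booster. LightGBM's binary file format applies to datasets, via Dataset.save_binary().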
Categorical Features
Use the categorical_feature argument to handle categorical features natively without one-hot encoding. The columns must be integer-encoded or use the pandas category dtype.
GPU Support
Enable with device = 'gpu' for faster training on large datasets.
Integration
Integrates with scikit-learn API and supports use with libraries like Optuna and MLflow for hyperparameter tuning and tracking.