LightGBM Flashcards | Uplatz Blog

⚡ LightGBM Flashcards

A gradient boosting framework that uses tree-based learning algorithms

A high-performance gradient boosting framework optimized for speed and accuracy, especially on large datasets.

Uses histogram-based split finding and leaf-wise (best-first) tree growth, which often makes training faster than XGBoost, particularly compared with XGBoost's exact split-finding method.
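The histogram idea can be illustrated with a toy, pure-Python sketch. This is not LightGBM's actual implementation: the function name best_histogram_split, the equal-width binning, and the simplified squared-loss gain (hessian fixed at 1) are all assumptions made for illustration. The point is that split candidates are evaluated once per bin rather than once per distinct value.

```python
def best_histogram_split(values, gradients, n_bins=8):
    """Toy histogram split search for one feature (illustrative, not LightGBM's code)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return None, 0.0  # constant feature: nothing to split on

    # Build the histogram: per-bin gradient sums and row counts.
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_g, total_n = sum(grad_sum), len(values)
    best_threshold, best_gain = None, 0.0
    left_g, left_n = 0.0, 0
    # Scan bin boundaries instead of every distinct feature value.
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Simplified gain for squared loss with unit hessians (an assumption).
        gain = left_g**2 / left_n + right_g**2 / right_n - total_g**2 / total_n
        if gain > best_gain:
            best_threshold, best_gain = lo + (b + 1) * width, gain
    return best_threshold, best_gain


values = [0, 1, 2, 3, 4, 5, 6, 7]
gradients = [-1, -1, -1, -1, 1, 1, 1, 1]
threshold, gain = best_histogram_split(values, gradients)  # → (3.5, 8.0)
```

With n_bins fixed, the scan costs the same no matter how many rows fall into each bin, which is where the speed-up on large datasets comes from.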

Best suited for large datasets with thousands of features. Handles both categorical and numerical features efficiently.

Fraud detection, click-through rate prediction, ranking tasks, and any tabular supervised learning problem.

Supports parallel learning, efficient memory usage, GPU training, and early stopping.

Install with pip install lightgbm. Building from source, for example for a custom or GPU-enabled build, requires CMake and a C/C++ compiler.

Key parameters: num_leaves, learning_rate, n_estimators, max_depth.

Use max_depth, min_data_in_leaf, feature_fraction, and bagging_fraction to reduce overfitting. Note that bagging_fraction only takes effect when bagging_freq is set to a value greater than 0.
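A parameter dictionary along these lines might look as follows. The values are illustrative starting points, not tuned recommendations, and the helper check_params is a hypothetical sanity check written for this example rather than part of LightGBM.

```python
# Illustrative parameter set for the native lgb.train() API (values are assumptions).
params = {
    "objective": "binary",
    "learning_rate": 0.05,
    "num_leaves": 31,          # main control of tree complexity
    "max_depth": 6,            # caps depth of leaf-wise growth
    "min_data_in_leaf": 20,    # avoids leaves fitted to a handful of rows
    "feature_fraction": 0.8,   # sample 80% of features per tree
    "bagging_fraction": 0.8,   # sample 80% of rows ...
    "bagging_freq": 1,         # ... re-drawn every iteration (0 disables bagging)
}


def check_params(p):
    """Hypothetical helper: flag settings that contradict each other."""
    problems = []
    if p.get("max_depth", -1) > 0 and p["num_leaves"] > 2 ** p["max_depth"]:
        problems.append("num_leaves exceeds what max_depth allows")
    if p.get("bagging_fraction", 1.0) < 1.0 and p.get("bagging_freq", 0) <= 0:
        problems.append("bagging_fraction is ignored while bagging_freq is 0")
    return problems


issues = check_params(params)  # empty list for the settings above
```

Since a tree of depth d can hold at most 2^d leaves, keeping num_leaves below that bound is a common rule of thumb.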

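A trained model can be saved to a text file with the booster.save_model() method and loaded again with lightgbm.Booster(model_file=...). The binary format applies to datasets, which can be written with Dataset.save_binary().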

Use the categorical_feature argument to handle categorical features natively, without one-hot encoding. Categories should be integer-encoded or stored in a pandas category column.

Enable GPU training with device = 'gpu' for faster training on large datasets. This requires a LightGBM build with GPU support.

Offers a scikit-learn-compatible API and works with libraries like Optuna and MLflow for hyperparameter tuning and experiment tracking.