XGBoost Flashcards
Extreme Gradient Boosting for fast and accurate decision tree models
What is XGBoost?
XGBoost is an optimized gradient boosting algorithm used for supervised learning problems, designed for performance and efficiency.
Primary Strength
High speed and accuracy on structured/tabular data; outperforms many models in ML competitions.
Installation
Install via pip install xgboost or conda. The standard pip wheel already includes GPU support on Linux and Windows; pip install xgboost[dask] instead pulls in the optional Dask dependencies for distributed training.
Use Cases
Widely used in classification, regression, ranking, fraud detection, and Kaggle competitions.
Important Parameters
n_estimators, max_depth, eta (learning_rate), gamma, subsample, colsample_bytree.
Evaluation Metrics
Supports AUC, RMSE, MAE, Logloss, and custom evaluation functions.
Regularization
Includes L1 (alpha, reg_alpha in the scikit-learn API) and L2 (lambda, reg_lambda) regularization to help prevent overfitting.
Early Stopping
Stops training when the validation metric shows no improvement for a set number of rounds.
Save & Load
Save a model with model.save_model() and load it with model.load_model().
scikit-learn API
Use XGBClassifier and XGBRegressor seamlessly with scikit-learn pipelines and GridSearchCV.
GPU Support
Accelerate training on large datasets with device="cuda" (XGBoost 2.0 and later) or tree_method="gpu_hist" (earlier versions).
Dask Integration
Distributed training for big-data workloads via the Dask API.