XGBoost Flashcards

Extreme Gradient Boosting for fast and accurate decision tree models

⚡ What is XGBoost?

XGBoost (eXtreme Gradient Boosting) is an optimized gradient boosting algorithm for supervised learning, designed for speed, efficiency, and predictive performance.

🎯 Primary Strength

High speed and accuracy on structured/tabular data; outperforms many models in ML competitions.

📦 Installation

Install with pip install xgboost or from conda-forge. The standard pip wheels already include GPU support; pip install xgboost[dask] adds the optional Dask dependencies for distributed training.

🧠 Use Cases

Widely used for classification, regression, learning-to-rank, fraud detection, and Kaggle competitions.

โš™๏ธ Important Parameters

n_estimators, max_depth, eta, gamma, subsample, colsample_bytree.

๐Ÿ” Evaluation Metrics

Supports AUC, RMSE, MAE, Logloss, and custom evaluation functions.

🧪 Regularization

Includes L1 (alpha / reg_alpha) and L2 (lambda / reg_lambda) regularization to reduce overfitting.

📈 Early Stopping

Stops training when the validation metric has not improved for a given number of rounds (early_stopping_rounds).

💾 Save & Load

Save a model with model.save_model() and load it with model.load_model(); the JSON format is recommended for portability across versions.

🧩 scikit-learn API

Use XGBClassifier and XGBRegressor seamlessly with pipelines and GridSearchCV.

⚡ GPU Support

Accelerate training on large datasets with device="cuda" (XGBoost 2.0+); earlier versions used tree_method="gpu_hist".

🔗 Dask Integration

Distributed training for big-data workloads via the xgboost.dask API.