Scikit-learn Flashcards

🔍 What is Scikit-learn?
A popular Python library for machine learning, offering simple and efficient tools for data mining and analysis.

📊 Which algorithms does it support?
It supports classification, regression, clustering, and dimensionality reduction algorithms, plus tools for preprocessing and model selection.

📦 What data format does it accept?
Primarily uses NumPy arrays, Pandas DataFrames, and SciPy sparse matrices for input and output.
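
As a rough illustration of the point above, the sketch below fits the same classifier on a NumPy array, a Pandas DataFrame, and a SciPy sparse matrix. The toy data and the column names f1 and f2 are made up for this example.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.linear_model import LogisticRegression

# Toy data invented for illustration: 6 samples, 2 features.
X_array = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.9],
                    [0.9, 0.1], [0.1, 0.8], [0.8, 0.2]])
y = np.array([0, 1, 0, 1, 0, 1])

# The same values in two other accepted containers.
X_frame = pd.DataFrame(X_array, columns=["f1", "f2"])  # hypothetical column names
X_sparse = sparse.csr_matrix(X_array)

# Fit and predict with each input type; predictions come back as NumPy arrays.
preds = {}
for name, X in [("numpy", X_array), ("pandas", X_frame), ("sparse", X_sparse)]:
    model = LogisticRegression().fit(X, y)
    preds[name] = model.predict(X)
```

Not every estimator accepts sparse input, so it is worth checking the documentation of the one you plan to use.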

🧪 How is it different from TensorFlow?
Scikit-learn is ideal for traditional ML algorithms; TensorFlow is better for deep learning and neural networks.

βš™οΈ What is Pipeline?
A feature to chain preprocessing and modeling steps into a single workflow object, enabling cleaner code and cross-validation.
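
A minimal sketch of such a workflow, assuming the bundled iris dataset and a scaler followed by logistic regression. The step names "scale" and "clf" are arbitrary labels chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Each step is a (name, estimator) pair; the names are arbitrary labels.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# fit() runs the scaler's fit_transform, then fits the classifier on the scaled data.
pipe.fit(X, y)

# predict() applies the already-fitted scaler before calling the classifier.
predictions = pipe.predict(X[:5])
train_accuracy = pipe.score(X, y)
```

Because the pipeline behaves like any other estimator, it can be passed directly to helpers such as cross_val_score or GridSearchCV.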

🔄 What is cross-validation?
A model evaluation method that splits the data into several train/test folds and averages the scores, giving a more reliable performance estimate and helping to detect overfitting.
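
One possible way to run it is sketched below, assuming 5 shuffled folds on the iris dataset with a decision tree. The model and the fold count are illustrative choices only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 5 folds: each sample lands in the test fold exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
model = DecisionTreeClassifier(random_state=0)

# One accuracy score per fold; the model is refit from scratch each time.
scores = cross_val_score(model, X, y, cv=cv)
mean_score = scores.mean()
```

A large gap between the training score and mean_score is a common sign of overfitting.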

📌 What are estimators?
Any object in scikit-learn that learns from data by implementing fit(). Predictors such as classifiers and regressors also implement predict(), while transformers implement transform().
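
To make the contract concrete, here is a small sketch of a custom estimator. MajorityClassifier is a hypothetical class written for this card, not part of scikit-learn; it only relies on the BaseEstimator and ClassifierMixin helpers from sklearn.base.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical toy classifier: always predicts the most frequent training label."""

    def fit(self, X, y):
        # Learned attributes end with an underscore by scikit-learn convention.
        self.classes_, counts = np.unique(y, return_counts=True)
        self.majority_ = self.classes_[np.argmax(counts)]
        return self  # fit() returns the estimator itself

    def predict(self, X):
        # Ignore the features and repeat the majority label once per sample.
        return np.full(len(X), self.majority_)


X_train = [[0], [1], [2], [3]]
y_train = ["a", "b", "b", "b"]

clf = MajorityClassifier().fit(X_train, y_train)
preds = clf.predict([[5], [6]])
accuracy = clf.score(X_train, y_train)  # score() is provided by ClassifierMixin
```

Built-in estimators such as LogisticRegression or KMeans follow the same fit-then-predict pattern.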

🏷️ What is Label Encoding?
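Converts categorical labels into integers from 0 to n_classes - 1 using scikit-learn's LabelEncoder. It is intended for the target y; for input features, OrdinalEncoder or OneHotEncoder is the usual choice.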
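
A short sketch of the encoding round trip, using made-up animal labels:

```python
from sklearn.preprocessing import LabelEncoder

# Made-up target labels for illustration.
labels = ["cat", "dog", "bird", "dog", "cat"]

encoder = LabelEncoder()

# Classes are sorted before numbering: bird=0, cat=1, dog=2.
encoded = encoder.fit_transform(labels)

# inverse_transform maps the integers back to the original labels.
decoded = encoder.inverse_transform(encoded)
```

The integers carry no real ordering, which is one reason this encoder is usually kept for targets rather than features.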

📏 What is StandardScaler?
A preprocessing technique to scale features by removing the mean and scaling to unit variance.
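
The sketch below compares StandardScaler with the equivalent manual calculation, z = (x - mean) / std, on a tiny made-up matrix. It assumes the population standard deviation (ddof=0), which is what NumPy's std computes by default.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two features on very different scales.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# The same transformation by hand, column by column (population std, ddof=0).
manual = (X - X.mean(axis=0)) / X.std(axis=0)

column_means = X_scaled.mean(axis=0)
column_stds = X_scaled.std(axis=0)
```

In practice the scaler is fit on the training data only and then reused to transform the test data.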

🧩 Is it open-source?
Yes! Scikit-learn is free and open-source, under the BSD license, and widely supported by the community.