Scikit-learn Flashcards
What is Scikit-learn?
A popular Python library for machine learning, offering simple and efficient tools for data mining and analysis.
Which algorithms does it support?
It supports classification, regression, clustering, and dimensionality reduction algorithms, along with tools for preprocessing and model selection.
📦 What data format does it accept?
It accepts NumPy arrays, pandas DataFrames, and SciPy sparse matrices as input. By default, results come back as NumPy arrays or sparse matrices.
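A minimal sketch of the same model being fitted on each input type. The toy data and the choice of LogisticRegression are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.linear_model import LogisticRegression

# Toy data, made up for illustration: 4 samples, 2 features.
X_array = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
y = np.array([0, 1, 0, 1])

# The same values in two other accepted containers.
X_frame = pd.DataFrame(X_array, columns=["a", "b"])
X_sparse = sparse.csr_matrix(X_array)

predictions = {}
for name, X in [("array", X_array), ("frame", X_frame), ("sparse", X_sparse)]:
    model = LogisticRegression().fit(X, y)
    predictions[name] = model.predict(X)  # a NumPy array in every case
```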
🧪 How is it different from TensorFlow?
Scikit-learn is ideal for traditional ML algorithms; TensorFlow is better for deep learning and neural networks.
What is a Pipeline?
A class that chains preprocessing and modeling steps into a single estimator, so the whole workflow can be fitted, cross-validated, and tuned as one object.
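A small sketch of a two-step pipeline. The iris dataset, the step names "scale" and "clf", and the choice of LogisticRegression are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Each step is a (name, estimator) pair; the last step does the modeling.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# fit() runs the scaler and then the classifier, in order.
pipe.fit(X, y)
train_accuracy = pipe.score(X, y)
```

Because the scaler lives inside the pipeline, passing `pipe` to a cross-validation helper refits the scaler on each training fold, which keeps test-fold statistics out of preprocessing.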
What is cross-validation?
A model evaluation method that splits the data into several train/test folds and scores the model on each one, giving a more reliable estimate of how it generalizes and helping to detect overfitting.
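As a rough illustration, 5-fold cross-validation with cross_val_score might look like this. The dataset and the decision tree are assumptions chosen only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# cv=5 trains on 4 folds and scores on the held-out fold, 5 times over.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
mean_score = scores.mean()
```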
What are estimators?
Any object in scikit-learn that learns from data through a fit() method. Predictors such as classifiers and regressors also implement predict(), while transformers implement transform().
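A short sketch contrasting two kinds of estimator: a classifier that offers predict() after fit(), and a scaler that offers transform() instead. The toy data and the specific classes are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Toy data, made up for illustration.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# A classifier: fit() learns from (X, y), predict() labels new samples.
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
labels = clf.predict(np.array([[0.2], [2.8]]))

# A transformer: fit() learns the data range, transform() rescales to [0, 1].
scaler = MinMaxScaler().fit(X)
X_scaled = scaler.transform(X)
```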
🏷️ What is Label Encoding?
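Converts categorical labels into integers from 0 to n_classes - 1 using scikit-learn's LabelEncoder. It is intended for target values (y); for input features, OrdinalEncoder or OneHotEncoder is the usual choice.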
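A minimal sketch with made-up labels. LabelEncoder sorts the distinct classes and numbers them in that order.

```python
from sklearn.preprocessing import LabelEncoder

# Made-up target labels for illustration.
labels = ["cat", "dog", "bird", "dog"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)      # array([1, 2, 0, 2])
decoded = encoder.inverse_transform(encoded)  # back to the original strings
```

The learned mapping is available as `encoder.classes_`, here `['bird', 'cat', 'dog']`.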
What is StandardScaler?
A preprocessing transformer that standardizes each feature by subtracting its mean and dividing by its standard deviation, so every feature ends up with zero mean and unit variance.
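A quick sketch on made-up numbers, showing that each column ends up with mean 0 and standard deviation 1 after scaling.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two features on very different scales.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Per-column statistics learned during fit().
column_means = scaler.mean_
```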
🧩 Is it open-source?
Yes! Scikit-learn is free and open-source, released under the 3-Clause BSD license, and maintained by an active community.