Data Science Flashcards

Data Science
An interdisciplinary field combining statistics, computer science, and domain knowledge to extract insights from data.

Data Wrangling
The process of cleaning and transforming raw data into a usable format for analysis.
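
Example (a minimal sketch, not a definitive pipeline): the rows, field names, and the `wrangle` helper below are hypothetical. It trims whitespace, normalizes case, casts types, and drops incomplete or duplicate rows.

```python
# Hypothetical raw records, as they might arrive from a CSV export.
raw_rows = [
    {"name": "  Alice ", "age": "34", "city": "paris"},
    {"name": "Bob", "age": "", "city": "London"},        # missing age
    {"name": "carol", "age": "29", "city": " berlin "},
    {"name": "  Alice ", "age": "34", "city": "paris"},  # exact duplicate
]


def wrangle(rows):
    """Trim whitespace, normalize case, cast age to int, drop bad or duplicate rows."""
    cleaned, seen = [], set()
    for row in rows:
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        age_text = row["age"].strip()
        if not age_text.isdigit():
            continue  # drop rows whose age is missing or non-numeric
        record = (name, int(age_text), city)
        if record in seen:
            continue  # drop duplicates
        seen.add(record)
        cleaned.append({"name": name, "age": record[1], "city": city})
    return cleaned


clean_rows = wrangle(raw_rows)
```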

Exploratory Data Analysis (EDA)
A preliminary step in data analysis that summarizes a dataset's main characteristics with statistics and visualizations.
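
Example (a minimal sketch using only Python's `statistics` module): the `values` sample is made up. Comparing the mean with the median is one quick way to spot a possible outlier.

```python
import statistics

# Hypothetical sample: daily page views for one week.
values = [120, 135, 128, 410, 131, 125, 138]

summary = {
    "count": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "stdev": statistics.stdev(values),  # sample standard deviation
    "min": min(values),
    "max": max(values),
}

# A mean well above the median hints at a high outlier (here, 410).
mean_exceeds_median = summary["mean"] > summary["median"]
```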

Feature Engineering
Creating new features or modifying existing ones to improve the performance of machine learning models.
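
Example (a minimal sketch): the order records and the `add_features` helper are hypothetical. It derives three new features (`total`, `is_weekend`, `month`) from existing fields.

```python
from datetime import date

# Hypothetical raw order records.
orders = [
    {"order_date": date(2024, 3, 2), "price": 20.0, "quantity": 3},
    {"order_date": date(2024, 3, 6), "price": 5.5, "quantity": 10},
]


def add_features(order):
    """Return a copy of the order with derived features added."""
    enriched = dict(order)
    # Interaction feature: combine two existing columns.
    enriched["total"] = order["price"] * order["quantity"]
    # Date-derived features: weekday() returns 5 for Saturday, 6 for Sunday.
    enriched["is_weekend"] = order["order_date"].weekday() >= 5
    enriched["month"] = order["order_date"].month
    return enriched


features = [add_features(order) for order in orders]
```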

Statistical Inference
Drawing conclusions about populations from sample data using hypothesis testing and confidence intervals.
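
Example (a minimal sketch): a confidence interval for a population mean, assuming a normal approximation. The sample values and the `mean_confidence_interval` helper are hypothetical; for small samples a t-distribution is usually the better choice.

```python
import math
import statistics

# Hypothetical sample of measurements.
sample = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.1, 5.0, 4.9, 5.2]


def mean_confidence_interval(data, confidence=0.95):
    """Return (low, high) for the mean using a normal (z) approximation."""
    n = len(data)
    mean = statistics.mean(data)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(data) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


low, high = mean_confidence_interval(sample)
```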

Data Visualization
Creating visual representations of data, with tools like Matplotlib, Seaborn, and Plotly, to surface insights.

Machine Learning
A subset of AI that enables systems to learn from data and improve without being explicitly programmed.
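
Example (a minimal sketch): simple linear regression fitted by ordinary least squares, as one small instance of learning a rule from data instead of hard-coding it. The data and the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)

# Predict for an input the model has not seen.
predicted_score = slope * 6 + intercept
```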

Model Evaluation
Assessing model performance using metrics such as accuracy, precision, recall, and F1 score for classification tasks.
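
Example (a minimal sketch): the four metrics computed from true and predicted labels for a binary classifier. The labels and the `classification_metrics` helper are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
```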

Dimensionality Reduction
Techniques like PCA and t-SNE used to reduce the number of features while preserving essential information.
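
Example (a minimal sketch): PCA restricted to the two-feature case, where the principal axis of the 2x2 covariance matrix has a closed form. The points and the `pca_2d_to_1d` helper are hypothetical; real projects would normally use a numerical library.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the sample covariance matrix [[var_x, cov_xy], [cov_xy, var_y]].
    var_x = sum(x * x for x, _ in centered) / (n - 1)
    var_y = sum(y * y for _, y in centered) / (n - 1)
    cov_xy = sum(x * y for x, y in centered) / (n - 1)

    # Angle of the eigenvector with the largest eigenvalue (symmetric 2x2 case).
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))

    # One coordinate per point instead of two.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected


# Hypothetical data lying exactly on the line y = 2x, so one dimension keeps everything.
points = [(0, 0), (1, 2), (2, 4), (3, 6)]
direction, projected = pca_2d_to_1d(points)
```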

Data Ethics
Practicing responsible use of data by considering privacy, fairness, and transparency in analysis and modeling.

SQL for Data Science
Using SQL to query, filter, and aggregate structured data from relational databases.
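
Example (a minimal sketch using Python's built-in `sqlite3` module with an in-memory database): the `sales` table and its rows are hypothetical. The query filters with WHERE and aggregates with GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Widget", 120.0),
        ("North", "Gadget", 80.0),
        ("South", "Widget", 250.0),
        ("South", "Gadget", 15.0),
        ("East", "Widget", 40.0),
    ],
)

query = """
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    WHERE amount >= 20            -- filter: ignore small sales
    GROUP BY region               -- aggregate: one row per region
    ORDER BY total_amount DESC
"""
rows = conn.execute(query).fetchall()
conn.close()
```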

Big Data Tools
Frameworks like Hadoop and Spark used for processing and analyzing large-scale datasets efficiently.