Data Wrangling
Cleaning, transforming, and organizing raw data into a usable format.
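A minimal sketch of what cleaning and organizing can look like in practice, using only the Python standard library. The records, field names, and the helpers clean_row and wrangle are made up for illustration; real projects often use a library such as pandas for this.

```python
# Hypothetical raw records: stray whitespace, inconsistent casing,
# a missing age, and one exact duplicate.
raw_rows = [
    {"name": "  Alice ", "age": "34", "city": "paris"},
    {"name": "BOB", "age": "", "city": "London "},
    {"name": "  Alice ", "age": "34", "city": "paris"},
]


def clean_row(row):
    """Trim whitespace, normalize casing, and convert age to int (None if missing)."""
    age_text = row["age"].strip()
    return {
        "name": row["name"].strip().title(),
        "age": int(age_text) if age_text else None,
        "city": row["city"].strip().title(),
    }


def wrangle(rows):
    """Clean every row and drop exact duplicates while preserving order."""
    seen = set()
    cleaned = []
    for row in rows:
        tidy = clean_row(row)
        key = (tidy["name"], tidy["age"], tidy["city"])
        if key not in seen:
            seen.add(key)
            cleaned.append(tidy)
    return cleaned


clean_rows = wrangle(raw_rows)
print(clean_rows)
```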
Exploratory Data Analysis (EDA)
Analyzing datasets to summarize key characteristics using statistics and visualizations.
Data Visualization
Using visual tools like graphs and charts to communicate data insights.
Statistics
Understanding mean, median, standard deviation, distributions, and hypothesis testing.
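As a small illustration, the descriptive measures named above can be computed with Python's built-in statistics module. The sample values are invented for the example.

```python
import statistics

# Invented sample values, chosen so the results are easy to check by hand.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = statistics.mean(samples)        # arithmetic average
median_value = statistics.median(samples)    # middle of the sorted values
population_sd = statistics.pstdev(samples)   # divides by n
sample_sd = statistics.stdev(samples)        # divides by n - 1

print(mean_value, median_value, population_sd, sample_sd)
```

The sample standard deviation is slightly larger than the population one because it divides by n - 1 rather than n.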
Probability
The foundation of inferential statistics and machine learning; it quantifies uncertainty in data.
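One simple way to see probability at work is to count outcomes. The sketch below, with two fair six-sided dice as an assumed example, computes the exact chance of rolling a total of 7.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes whose faces add up to 7.
favorable = [roll for roll in outcomes if sum(roll) == 7]

# Probability = favorable outcomes / total outcomes, kept as an exact fraction.
p_seven = Fraction(len(favorable), len(outcomes))
print(p_seven)
```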
Machine Learning
Training algorithms to find patterns in data and make predictions or decisions.
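To make "training" concrete, here is a minimal sketch that fits a straight line to a few points with ordinary least squares and then predicts a new value. The data and the helper names fit_line and predict are illustrative assumptions; libraries such as scikit-learn provide full implementations.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new input."""
    return slope * x + intercept


# Invented training data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
prediction = predict(5, slope, intercept)
print(slope, intercept, prediction)
```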
Feature Engineering
Creating new features or transforming existing ones to improve model performance.
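A short sketch of deriving new features from existing fields. The order record, its field names, and the add_features helper are hypothetical and only serve the example.

```python
from datetime import date


def add_features(order):
    """Return a copy of the order with two derived features added."""
    day = date.fromisoformat(order["order_date"])
    return {
        **order,
        # Ratio feature: average price per item.
        "unit_price": order["total"] / order["quantity"],
        # Calendar feature: weekday() is 5 for Saturday and 6 for Sunday.
        "is_weekend": day.weekday() >= 5,
    }


# Hypothetical raw record; 2024-03-09 fell on a Saturday.
order = {"order_date": "2024-03-09", "total": 30.0, "quantity": 4}
enriched = add_features(order)
print(enriched)
```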
Model Evaluation
Using metrics like accuracy, precision, recall, and ROC-AUC to assess model performance.
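The sketch below computes accuracy, precision, and recall by hand for a binary classifier, using invented labels and predictions. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
# Invented ground-truth labels and model predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct
precision = tp / (tp + fp)          # share of predicted positives that are right
recall = tp / (tp + fn)             # share of actual positives that were found

print(accuracy, precision, recall)
```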
SQL for Data
Querying databases to extract, manipulate, and analyze structured data efficiently.
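As an illustration, the following sketch runs an aggregate query against an in-memory SQLite database through Python's built-in sqlite3 module. The sales table and its rows are made up for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 60.0)],
)

# Total sales per region, largest first.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```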
Big Data Tools
Technologies like Hadoop, Spark, and Hive used to process large-scale data.
Python & R
Popular programming languages for data analysis, visualization, and modeling.
Data Ethics
Ensuring data privacy, fairness, and transparency in data science projects and models.