Ssaadia/Exploratory-Data-Analysis

Exploratory Data Analysis

Professional EDA Portfolio

This repository is a curated portfolio of exploratory data analysis (EDA) projects in Python. Together they demonstrate practical data investigation, cleaning, visualization, pattern discovery, and insight communication.

The projects span four domains: health risk analysis, income and demographic profiling, smartphone market analysis, and cricket performance analytics. Each project shows how raw data can be turned into structured insights that support analytical reasoning, decision-making, and portfolio-level storytelling.


Repository Focus

This repository demonstrates core exploratory data analysis capabilities, including:

  • data loading and inspection,
  • data cleaning and preprocessing,
  • missing value review,
  • categorical and numerical feature analysis,
  • grouped summaries and aggregations,
  • trend and distribution analysis,
  • correlation and relationship exploration,
  • visual storytelling,
  • and analytical conclusion writing.
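
As a rough illustration of several of these capabilities (missing value review, feature type separation, grouped summaries, and correlation exploration), the sketch below uses pandas on a small in-memory table. The data, the column names, and the helper functions `review_missing` and `split_feature_types` are hypothetical and are not taken from the project notebooks.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [34, 51, None, 67, 45, 29],
    "gender": ["F", "M", "F", None, "M", "F"],
    "glucose": [88.0, 140.5, 99.2, 180.3, None, 92.1],
    "outcome": [0, 1, 0, 1, 0, 0],
})


def review_missing(frame):
    """Return missing-value counts and percentages per column."""
    counts = frame.isna().sum()
    report = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(1),
    })
    return report.sort_values("missing", ascending=False)


def split_feature_types(frame):
    """Separate numerical column names from all other (categorical) ones."""
    numeric = frame.select_dtypes(include="number").columns.tolist()
    categorical = frame.select_dtypes(exclude="number").columns.tolist()
    return numeric, categorical


missing_report = review_missing(df)
numeric_cols, categorical_cols = split_feature_types(df)

# Grouped summary: non-missing glucose readings and their mean per gender.
grouped = df.groupby("gender")["glucose"].agg(["count", "mean"])

# Pairwise Pearson correlations between the numerical columns.
correlations = df[numeric_cols].corr()

print(missing_report)
print(grouped)
print(correlations)
```

Rows with a missing `gender` are left out of the grouped summary because `groupby` drops missing keys by default; whether that is appropriate depends on the dataset at hand.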

Projects Included

Project Folder	Domain	Analytical Focus
01_stroke_risk_analysis	Health Analytics	Stroke risk factors, patient characteristics, and health-related patterns
02_adult_income_and_demographic_analysis	Socioeconomic Analytics	Income distribution, demographic profiling, education, occupation, and work-related patterns
03_smart_phone_market_analysis	Market Analytics	Smartphone pricing, specifications, performance features, and value segmentation
04_cricket_performance_analysis	Sports Analytics	Team performance, wins, toss impact, rankings, and comparative cricket insights

Technical Stack

Category	Tools and Libraries
Programming	Python
Data Analysis	pandas, NumPy
Visualization	Matplotlib, Seaborn
Notebook Environment	Jupyter Notebook, VS Code
Version Control	Git, GitHub

Repository Structure

Exploratory-Data-Analysis/
├── 01_stroke_risk_analysis/
├── 02_adult_income_and_demographic_analysis/
├── 03_smart_phone_market_analysis/
├── 04_cricket_performance_analysis/
├── .gitignore
└── README.md


Methodological Approach

The projects generally follow a structured EDA workflow:

  1. Define the analytical context and project objective
  2. Load and inspect the dataset
  3. Review data structure, dimensions, and feature types
  4. Identify missing values, duplicates, and data quality issues
  5. Clean and transform variables where required
  6. Analyze numerical and categorical features
  7. Generate grouped summaries and comparative views
  8. Create visualizations to identify patterns and relationships
  9. Interpret findings in plain analytical language
  10. Prepare conclusions and possible next steps
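
The loading, quality-check, cleaning, and grouped-summary steps of this workflow could look roughly like the sketch below. It is a minimal example under stated assumptions: the inline CSV, its column names, and the `run_eda` helper are invented for illustration, and median imputation is just one possible cleaning choice, not necessarily the one used in the notebooks.

```python
import io

import pandas as pd

# Hypothetical match records standing in for a raw project dataset.
RAW_CSV = """team,toss_won,result,runs
Alpha,yes,win,250
Alpha,no,loss,180
Beta,yes,win,
 beta ,no,loss,210
Alpha,yes,win,250
Beta,no,win,230
"""


def run_eda(csv_text):
    """Load, inspect, clean, and summarize a small dataset."""
    # Load and inspect the dataset.
    df = pd.read_csv(io.StringIO(csv_text))
    shape = df.shape

    # Identify duplicates and missing values.
    n_duplicates = int(df.duplicated().sum())
    n_missing = int(df.isna().sum().sum())

    # Clean: drop exact duplicates, normalize team labels,
    # and fill missing runs with the median (an assumed strategy).
    clean = df.drop_duplicates().copy()
    clean["team"] = clean["team"].str.strip().str.title()
    clean["runs"] = clean["runs"].fillna(clean["runs"].median())
    clean["won"] = clean["result"].eq("win")

    # Grouped summary for comparing teams.
    summary = clean.groupby("team").agg(
        matches=("result", "size"),
        win_rate=("won", "mean"),
        avg_runs=("runs", "mean"),
    )
    return {
        "shape": shape,
        "n_duplicates": n_duplicates,
        "n_missing": n_missing,
        "clean": clean,
        "summary": summary,
    }


report = run_eda(RAW_CSV)
print(report["summary"])
```

Returning the intermediate counts alongside the cleaned frame keeps the data quality findings available for the interpretation and conclusion steps.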

Portfolio Value

This repository is designed to demonstrate applied analytical skills relevant to:

Role Type	Skills Demonstrated
Data Analyst	Data cleaning, EDA, visualization, and insight generation
Business Intelligence Analyst	Aggregation, segmentation, and reporting-ready analysis
Junior Data Scientist	Feature understanding, pattern discovery, and model preparation
Research Analyst	Evidence-based interpretation and structured analytical reporting
Sports or Market Analyst	Domain-focused exploratory analysis and comparative insights

Key Strengths Demonstrated

  • Ability to work across multiple datasets and domains
  • Structured notebook-based analytical workflows
  • Clear use of pandas for data manipulation and grouping
  • Visual analysis using Python plotting libraries
  • Practical interpretation of patterns and relationships
  • Preparation of datasets and insights for dashboards or further modeling
  • Consistent project organization for portfolio presentation

Data Management Note

Large raw datasets and temporary files are intentionally excluded where necessary to keep the repository lightweight and suitable for public portfolio review.

The repository focuses on analytical process, cleaned workflows, visual outputs, and insight communication rather than storing unnecessary large files.

Future Enhancements

Planned improvements may include:

  • Power BI dashboard versions of selected projects
  • More polished executive summaries inside each project folder
  • Additional visual storytelling outputs
  • Feature engineering extensions for machine learning
  • Cleaned datasets prepared for dashboard consumption
  • Project-level README files for each analysis

Author
Samina Saadia
Senior IT and Data Analytics Professional
Python | SQL | Power BI | Machine Learning | Data Analytics | Business Intelligence

This repository is part of a broader analytics portfolio focused on data science, exploratory analysis, machine learning, and evidence-based decision support.
