PranavRoy07/CodeAlpha_DataAnalytics

CodeAlpha Data Analytics Internship Project

Overview

This repository contains the tasks completed as part of the CodeAlpha Data Analytics Internship. The project demonstrates practical skills in data collection, data analysis, visualization, and sentiment analysis using Python.


Tasks Completed

Task 1: Web Scraping

  • Extracted data from a public website using Python and BeautifulSoup.
  • Collected book information such as title, price, rating, and availability.
  • Created a custom dataset and stored it in CSV format.
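
As a rough illustration of the steps above, the sketch below parses book cards with BeautifulSoup and serializes them to CSV. The repository's web_scraping_books.py is not reproduced here, so everything in the sketch is an assumption: the sample markup (modelled on a typical book-listing page), the CSS selectors, and the helper names parse_books and to_csv are hypothetical. It parses an inline HTML string instead of fetching a live page, so it runs offline.

```python
import csv
import io
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup; a real script would download the page HTML first.
SAMPLE_HTML = """
<article class="product_pod">
  <p class="star-rating Three"></p>
  <h3><a href="#" title="Example Book One">Example Book ...</a></h3>
  <p class="price_color">&pound;51.77</p>
  <p class="instock availability">In stock</p>
</article>
<article class="product_pod">
  <p class="star-rating Five"></p>
  <h3><a href="#" title="Example Book Two">Example Book ...</a></h3>
  <p class="price_color">&pound;20.50</p>
  <p class="instock availability">In stock</p>
</article>
"""

# Assumed convention: the rating is encoded as a word in the tag's class list.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}
FIELDS = ["title", "price", "rating", "availability"]


def parse_books(html):
    """Return one dict per book card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for card in soup.select("article.product_pod"):
        # "class" is a multi-valued attribute, so BeautifulSoup returns a list.
        rating_classes = card.select_one("p.star-rating")["class"]
        rating = next(
            (RATING_WORDS[c] for c in rating_classes if c in RATING_WORDS), None
        )
        price_text = card.select_one("p.price_color").get_text(strip=True)
        books.append({
            "title": card.h3.a["title"],
            # Keep only digits and the decimal point, dropping the currency sign.
            "price": float(re.sub(r"[^\d.]", "", price_text)),
            "rating": rating,
            "availability": card.select_one("p.availability").get_text(strip=True),
        })
    return books


def to_csv(books):
    """Serialize the parsed books to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(books)
    return buffer.getvalue()


books = parse_books(SAMPLE_HTML)
csv_text = to_csv(books)
print(csv_text)
```

To write a file such as books_data.csv instead of a string, the same DictWriter can be pointed at open("books_data.csv", "w", newline="", encoding="utf-8").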

Files:

  • web_scraping_books.py
  • books_data.csv

Task 2: Exploratory Data Analysis (EDA)

  • Performed data exploration to understand structure and data types.
  • Cleaned and prepared data for analysis.
  • Identified patterns, trends, and anomalies using statistics and visuals.
  • Asked meaningful questions and validated assumptions.
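
The steps above could look roughly like the following pandas sketch. The notebook itself is not shown here, so the rows, the column names (taken from the fields listed under Task 1), and the choice of the 1.5 × IQR rule for flagging anomalies are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for books_data.csv.
raw = pd.DataFrame({
    "title": ["Book A", "Book B", "Book B", "Book C", "Book D", "Book E"],
    "price": [12.5, 15.0, 15.0, 14.2, None, 95.0],
    "rating": [3, 4, 4, 5, 2, 1],
    "availability": ["In stock"] * 6,
})

# 1. Explore structure and data types.
print(raw.dtypes)
print(raw.isna().sum())

# 2. Clean: drop exact duplicate rows and rows with a missing price.
clean = raw.drop_duplicates().dropna(subset=["price"]).reset_index(drop=True)

# 3. Summarize: descriptive statistics and mean price per rating.
print(clean["price"].describe())
mean_price_by_rating = clean.groupby("rating")["price"].mean()
print(mean_price_by_rating)

# 4. Flag anomalies: prices outside 1.5 * IQR from the quartiles.
q1 = clean["price"].quantile(0.25)
q3 = clean["price"].quantile(0.75)
iqr = q3 - q1
is_outlier = (clean["price"] < q1 - 1.5 * iqr) | (clean["price"] > q3 + 1.5 * iqr)
outliers = clean[is_outlier]
print(outliers)
```

With so few rows the quartiles are only indicative. On a full dataset the same four steps apply unchanged.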

File:

  • eda_books.ipynb

Task 3: Data Visualization

  • Created multiple visualizations using Matplotlib and Seaborn.
  • Designed charts to clearly communicate insights.
  • Explained each visualization and crafted a data story to support decision-making.
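
A minimal sketch of one such chart follows, using Matplotlib alone. The counts are made up for illustration, and the chart type, labels, and colors are assumptions and do not reproduce the notebook's figures. Seaborn's countplot or barplot could be substituted where it is installed.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical number of books per star rating.
rating_counts = {1: 4, 2: 6, 3: 9, 4: 7, 5: 5}

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(list(rating_counts.keys()), list(rating_counts.values()),
              color="steelblue")

# A title and axis labels let the chart stand on its own.
ax.set_title("Books per star rating (illustrative data)")
ax.set_xlabel("Star rating")
ax.set_ylabel("Number of books")
ax.set_xticks(list(rating_counts.keys()))
fig.tight_layout()

# Render to an in-memory PNG; pass a file path to savefig to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
png_bytes = buffer.getvalue()
plt.close(fig)
```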

File:

  • eda_books.ipynb

Task 4: Sentiment Analysis

  • Performed sentiment analysis on the IMDB Movie Reviews dataset.
  • Classified the sentiment of each review as positive, negative, or neutral.
  • Visualized the sentiment distribution and compared predicted labels against the dataset's actual labels.
  • Extracted insights to understand audience perception.
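
The classification and comparison steps can be sketched as below. To keep the example runnable with the standard library alone, it uses a tiny word-list scorer as a stand-in for a real polarity model. The word lists, the 0.1 threshold, the sample reviews, and the function names are all hypothetical and do not reflect the notebook's actual code.

```python
from collections import Counter

# Tiny hypothetical word lists; a stand-in for a real scorer such as TextBlob.
POSITIVE_WORDS = {"great", "wonderful", "loved", "brilliant"}
NEGATIVE_WORDS = {"boring", "awful", "hated", "dull"}


def toy_polarity(text):
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def label_from_polarity(polarity, threshold=0.1):
    """Map a polarity score to a label; scores near zero count as neutral."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Hypothetical (review text, actual label) pairs.
reviews = [
    ("A wonderful film, I loved it.", "positive"),
    ("Boring plot and awful acting.", "negative"),
    ("It was released in 1999.", "positive"),
    ("Brilliant cast but a dull script.", "negative"),
]

predicted = [label_from_polarity(toy_polarity(text)) for text, _ in reviews]
actual = [label for _, label in reviews]

# Count (actual, predicted) pairs, i.e. a confusion matrix in dictionary form.
confusion = Counter(zip(actual, predicted))

# Accuracy over the reviews where the model committed to a non-neutral label.
decided = [(a, p) for a, p in zip(actual, predicted) if p != "neutral"]
accuracy = sum(a == p for a, p in decided) / len(decided)

print(confusion)
print(f"Accuracy on non-neutral predictions: {accuracy:.2f}")
```

With TextBlob installed, TextBlob(text).sentiment.polarity also returns a float in [-1, 1] and can replace toy_polarity. Because the example's actual labels are only positive or negative while predictions may be neutral, neutral predictions are counted separately and excluded from the accuracy figure.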

Dataset Source: IMDB Movie Reviews Dataset (Kaggle) https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews

Note: The dataset file is not included in this repository due to GitHub file size limitations.

Files:

  • sentiment_analysis_imdb.ipynb
  • IMDB Dataset.csv (not included in this repository; download it from the Kaggle link above)

Tools and Technologies Used

  • Python
  • Pandas
  • NumPy
  • BeautifulSoup
  • Matplotlib
  • Seaborn
  • TextBlob
  • Jupyter Notebook
  • VS Code

Conclusion

This project provided hands-on experience in real-world data analytics tasks, including data collection, analysis, visualization, and sentiment analysis. The completed tasks demonstrate the ability to extract insights from data and present them in a meaningful and understandable way.

About

Data analytics internship project: web scraping with BeautifulSoup, exploratory data analysis, visualization, and NLP sentiment analysis on the IMDB reviews dataset
