Electric Vehicle Data Analysis

This project focuses on analysing Electric Vehicle (EV) datasets using Python. The aim is to gain insights into EV adoption trends, performance, and statistical patterns through data cleaning, filtering, visualisation, and hypothesis testing.
Features
- Data filtering and preprocessing
- Outlier detection and handling
- Statistical analysis & hypothesis testing
- Visualisation with Matplotlib and Seaborn

Tech Stack
- Python
- pandas: data manipulation
- matplotlib / seaborn: data visualisation
- scipy / numpy: statistical analysis
Machine Learning & NLP Project Portfolio

A collection of three data-driven projects covering Machine Learning, Natural Language Processing, and Statistical Data Analysis, implemented using Python. Each project demonstrates an end-to-end workflow including data preprocessing, feature engineering, model development, evaluation, and insights.
Part A: IMDb Movie Review Sentiment Analysis

This project classifies IMDb movie reviews as positive or negative using Natural Language Processing. The goal is to analyze review text and build ML models that infer sentiment from language patterns.

Overview
Sentiment analysis is a key NLP task used in marketing, product reviews, and social media analytics. This project builds a classification model using TF-IDF, Bag-of-Words, and machine learning algorithms to predict sentiment.
Features
- Text cleaning & preprocessing
- Tokenization, stopword removal, stemming, lemmatization
- Feature extraction: BoW, TF-IDF
- ML models: Logistic Regression, Naive Bayes, SVM, Random Forest, LSTM/BERT (optional)
- Model evaluation using accuracy, precision, recall, F1-score
- Visualizations: word clouds, confusion matrix, bar graphs

Tech Stack
Python, pandas, NumPy, scikit-learn, NLTK, spaCy, TensorFlow/Keras, Matplotlib, Seaborn
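
As a rough illustration of the TF-IDF plus Logistic Regression approach listed above, here is a minimal sketch using scikit-learn. The six reviews and their labels are invented for this example and are not taken from the IMDb dataset; the project's actual preprocessing and model settings may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy reviews invented for illustration; the real project would load IMDb data.
reviews = [
    "A wonderful, moving film with great acting",
    "Great story and wonderful performances",
    "I loved this movie, truly great",
    "A terrible, boring film with awful acting",
    "Awful story and boring performances",
    "I hated this movie, truly terrible",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

# TF-IDF turns each review into a weighted term vector;
# Logistic Regression then learns a weight per term.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(),
)
model.fit(reviews, labels)

# Score two unseen snippets built from the training vocabulary.
predictions = model.predict(["wonderful great movie", "boring awful movie"])
print(predictions)
```

Wrapping the vectorizer and classifier in one pipeline means the same text transformation is applied at training and prediction time.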
Part B: News Article Classification

This NLP project classifies news articles into categories such as sports, politics, and technology.

Overview
With large volumes of digital news, automated classification improves content organization and recommendations. The goal is to preprocess news text and build ML models that categorize articles accurately.

Features
- Text cleaning & preprocessing
- Feature extraction: TF-IDF, Bag-of-Words, embeddings (Word2Vec, GloVe)
- Exploratory data analysis (EDA)
- Classification models: Logistic Regression, Naive Bayes, SVM
- Cross-validation
- Performance metrics (Accuracy, F1-score, Precision, Recall)
- Visualization of results

Tech Stack
Python, pandas, scikit-learn, NLTK, Matplotlib, Seaborn
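
The cross-validation step listed above could look roughly like the following sketch, which evaluates a TF-IDF plus Naive Bayes pipeline with stratified 3-fold cross-validation. The nine headlines and their categories are made up for illustration; the fold count and the accuracy metric are assumptions, not settings taken from the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy articles invented for illustration, three per category.
articles = [
    "The team won the championship match after a late goal",
    "The striker scored twice as the team won the league match",
    "The coach praised the team after the final match",
    "The senate passed the budget bill after a long debate",
    "The president signed the bill as parliament debated the election",
    "Voters head to the polls in the national election",
    "The company released a new smartphone with a faster chip",
    "Engineers unveiled software that speeds up the new chip",
    "The startup launched a new app for the smartphone",
]
categories = ["sports"] * 3 + ["politics"] * 3 + ["technology"] * 3

# Keeping the vectorizer inside the pipeline refits TF-IDF on each
# training fold, so no vocabulary statistics leak from the test fold.
pipeline = make_pipeline(TfidfVectorizer(), MultinomialNB())

# Stratified folds keep one article per category in every test split.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, articles, categories, cv=cv, scoring="accuracy")

print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```

Passing `scoring="f1_macro"` instead would report a macro-averaged F1-score per fold.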
Part C: Electric Vehicle (EV) Data Analysis

This project explores EV datasets to uncover trends in adoption, performance, and statistical behavior. It includes data cleaning, outlier detection, visualization, and hypothesis testing.

Overview
The objective is to analyze EV data to understand how factors such as vehicle type, range, and charging affect EV usage patterns.
Features
- Data filtering and preprocessing
- Outlier detection and handling
- Statistical tests (t-test, chi-square, correlations)
- Visualizations using Matplotlib & Seaborn
- Insights into EV adoption and performance
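
To make the outlier handling and t-test steps concrete, here is a minimal sketch using pandas and SciPy. The column names (`vehicle_type`, `electric_range`), the values, and the 1.5 × IQR rule are all assumptions chosen for illustration; the project's dataset and outlier criterion may differ.

```python
import pandas as pd
from scipy import stats

# Hypothetical EV records; column names and values are invented.
ev = pd.DataFrame({
    "vehicle_type": ["BEV"] * 6 + ["PHEV"] * 6,
    "electric_range": [215, 238, 250, 220, 245, 900, 25, 32, 38, 21, 30, 35],
})

def iqr_mask(series, k=1.5):
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return series.between(q1 - k * iqr, q3 + k * iqr)

# Select the range column for each vehicle type.
bev = ev.loc[ev["vehicle_type"] == "BEV", "electric_range"]
phev = ev.loc[ev["vehicle_type"] == "PHEV", "electric_range"]

# Drop outliers within each group separately.
bev_clean = bev[iqr_mask(bev)]
phev_clean = phev[iqr_mask(phev)]

# Welch's t-test: do the two vehicle types differ in mean range?
t_stat, p_value = stats.ttest_ind(bev_clean, phev_clean, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

Outliers are filtered per group because a single threshold over the whole column would be dominated by the difference between vehicle types. `equal_var=False` selects Welch's variant, which does not assume the two groups share a variance.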
Tech Stack
Python, pandas, NumPy, SciPy, Matplotlib, Seaborn

📂 Project Structure
├── data/        # Dataset(s)
├── notebooks/   # Jupyter notebooks
├── scripts/     # Python scripts
├── results/     # Graphs, plots, or reports
└── README.md    # Documentation

How to Run the Projects
- Clone the repository:
    git clone https://github.com/Nehakp8842/Electric-Vehicles-Analysis.git
- Navigate to the project folder:
    cd Electric-Vehicles-Analysis
- Install dependencies:
    pip install -r requirements.txt
- Run the Jupyter notebooks:
    jupyter notebook

Sample Outputs
- Sentiment prediction for IMDb movie reviews
- News article categorization results
- EV adoption trend graphs
- Outlier detection plots
- Hypothesis testing results

Contributions
Contributions, issues, and feature requests are welcome! Feel free to fork the repo and submit a pull request.