🧠 Data Science Portfolio — Lewis

Turning raw data into insight, one notebook at a time.

👋 About This Repository

Welcome to my personal Data Science learning and project hub. This repository documents my journey through data analysis, machine learning, statistical modeling, and real-world problem-solving with data.

Whether you're here to learn, collaborate, or explore — make yourself at home.

📂 Repository Structure

Data-Science/
│
├── 📊 EDA/                    # Exploratory Data Analysis notebooks
│   ├── titanic_eda.ipynb
│   ├── world_happiness_eda.ipynb
│   └── retail_sales_eda.ipynb
│
├── 🤖 Machine-Learning/       # Supervised & unsupervised ML projects
│   ├── house_price_prediction/
│   ├── customer_churn/
│   └── spam_classifier/
│
├── 📈 Visualization/          # Charts, dashboards, and storytelling
│   ├── matplotlib_showcase.ipynb
│   └── plotly_interactive.ipynb
│
├── 🧹 Data-Cleaning/          # Messy data → clean data pipelines
│   └── cleaning_pipeline.ipynb
│
├── 📝 Notes/                  # Study notes & reference sheets
│   ├── statistics_101.md
│   ├── pandas_cheatsheet.md
│   └── sklearn_reference.md
│
└── README.md

⚠️ This structure is a roadmap — projects are added progressively.

🔥 Featured Projects

1. 🏠 House Price Prediction

Goal: Predict housing prices using regression models.
Tools: pandas, scikit-learn, XGBoost, matplotlib, seaborn
Highlights:

Feature engineering on 80+ columns
Compared Linear Regression, Ridge, Lasso, and XGBoost
Final RMSE: ~18,000 (top 15% on the Kaggle leaderboard)
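For readers new to the metric, here is a minimal sketch of how an RMSE figure like the one above is computed. It uses plain Python rather than the notebook's scikit-learn code, and the prices are made up for illustration, not taken from the project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices and model predictions (illustrative values only).
actual_prices = [200_000, 150_000, 320_000, 275_000]
predicted_prices = [210_000, 140_000, 300_000, 275_000]

error = rmse(actual_prices, predicted_prices)
print(f"RMSE: {error:,.0f}")
```

Because errors are squared before averaging, one large miss (the 20,000 error here) pulls the score up more than several small ones.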

2. 📉 Customer Churn Analysis

Goal: Identify customers likely to cancel their subscription.
Tools: pandas, scikit-learn, imbalanced-learn, SHAP
Highlights:

Handled severe class imbalance with SMOTE
Random Forest + SHAP for explainability
Precision: 87% | Recall: 82%
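With imbalanced classes, plain accuracy can look good while the model misses most churners, which is why precision and recall are reported here. The sketch below shows how both are derived from predictions. It is plain Python with made-up labels (1 = churned), not the project's actual evaluation code.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 4 churners among 10 customers (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"Precision: {precision:.2f} | Recall: {recall:.2f}")
```

Precision answers "of the customers flagged as churners, how many really churned?", while recall answers "of the customers who churned, how many were flagged?".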

3. 🌍 World Happiness EDA

Goal: Deep dive into factors driving happiness across nations.
Tools: pandas, plotly, seaborn, statsmodels
Highlights:

Correlation heatmaps and regression analysis
Interactive choropleth world map
Insight: GDP per capita explains ~63% of happiness variance
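A note on reading the "~63% of variance" figure: assuming it comes from a regression with GDP per capita as the only predictor (the README does not say so explicitly), the share of variance explained is the squared Pearson correlation. The sketch below checks that identity on made-up numbers. The data and variable names are hypothetical, not values from the World Happiness dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r_squared(x, y):
    """R^2 of a one-predictor least-squares line: 1 - SS_res / SS_tot."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical GDP index and happiness scores (illustrative only).
gdp = [0.5, 0.8, 1.0, 1.2, 1.4, 1.6]
happiness = [4.1, 4.9, 5.0, 6.2, 6.0, 7.1]

r = pearson_r(gdp, happiness)
print(f"r^2 = {r ** 2:.3f}, R^2 from the fitted line = {r_squared(gdp, happiness):.3f}")

# Under the single-predictor assumption, R^2 = 0.63 implies |r| of about 0.79.
implied_r = math.sqrt(0.63)
```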

4. 📧 Spam Classifier

Goal: Binary classification of emails as spam or not spam.
Tools: scikit-learn, NLTK
Methods: TF-IDF, Naive Bayes
Highlights:

Full NLP pipeline: tokenization → vectorization → classification
Accuracy: 98.4% on test set
False positive rate kept below 1%
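The tokenization → vectorization → classification flow can be sketched in a few lines of plain Python. This is a simplified stand-in, not the project's code: it uses a regex tokenizer instead of NLTK and raw word counts instead of TF-IDF weights, and the class name `NaiveBayesSketch` and the training messages are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSketch:
    """Multinomial Naive Bayes over raw word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the project notes",
]
train_labels = ["spam", "spam", "not_spam", "not_spam"]

model = NaiveBayesSketch().fit(train_texts, train_labels)
prediction_spam = model.predict("claim your free prize")
prediction_not_spam = model.predict("notes for the monday meeting")
print(prediction_spam, prediction_not_spam)
```

Working in log space avoids multiplying many small probabilities together, and add-one smoothing keeps an unseen word from zeroing out a class.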

🛠️ Tech Stack

Category	Tools
Languages	Python 3.x
Data Manipulation	pandas, NumPy
Visualization	matplotlib, seaborn, Plotly
Machine Learning	scikit-learn, XGBoost, LightGBM
NLP	NLTK, spaCy, TF-IDF
Statistics	statsmodels, SciPy
Notebooks	Jupyter, Google Colab
Version Control	Git, GitHub

📚 Learning Path

Here's the roadmap I'm following to level up:

🚀 Getting Started

Clone the repo and install dependencies:

git clone https://github.com/CreepyLewis/Data-Science.git
cd Data-Science
pip install -r requirements.txt

Open any notebook with Jupyter:

jupyter notebook

Or open directly in Google Colab by clicking the badge at the top of each notebook.

📦 requirements.txt

numpy
pandas
matplotlib
seaborn
plotly
scikit-learn
xgboost
lightgbm
nltk
spacy
statsmodels
scipy
imbalanced-learn
shap
jupyter
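After running `pip install -r requirements.txt`, a quick check like the one below can confirm that everything is actually installed before opening a notebook. It is an optional sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `missing_packages` is made up here and is not part of the repository.

```python
from importlib import metadata

# Package names copied from the requirements list above.
REQUIRED = [
    "numpy", "pandas", "matplotlib", "seaborn", "plotly",
    "scikit-learn", "xgboost", "lightgbm", "nltk", "spacy",
    "statsmodels", "scipy", "imbalanced-learn", "shap", "jupyter",
]


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

The check looks up installed distributions by their pip names, so it sidesteps the fact that some packages import under a different name (for example, `scikit-learn` imports as `sklearn`).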

🤝 Contributing

This is a personal portfolio repo, but PRs and issues are very welcome!

🐛 Found a bug in a notebook? Open an issue.
💡 Have a dataset or project idea? Drop it in Discussions.
🌟 Liked the work? Give it a star — it helps a lot!

📬 Contact

Platform	Link
GitHub	@CreepyLewis
Email	coming soon
LinkedIn	coming soon

📄 License

This project is licensed under the MIT License — feel free to use, remix, and build on it.

Made with 💻, ☕, and a lot of .head() calls.

⭐ Star this repo if you find it useful! ⭐
