FinApprove - Loan Approval Prediction

A machine learning project that predicts whether a loan application will be approved, based on the applicant's financial and demographic data.

About the Project

FinApprove uses supervised classification algorithms to analyze loan application data and predict approval outcomes. The project covers the full ML pipeline: data cleaning, exploratory analysis, encoding, scaling, model training, evaluation, and feature engineering.

Dataset

File: loan_approval_data.csv

The dataset includes the following features:

| Feature | Description |
| --- | --- |
| Applicant_ID | Unique identifier for each applicant (dropped before training) |
| Gender | Gender of the applicant |
| Marital_Status | Marital status of the applicant |
| Education_Level | Education background |
| Employment_Status | Whether the applicant is employed |
| Employer_Category | Type of employer |
| Applicant_Income | Monthly income of the applicant |
| Coapplicant_Income | Monthly income of the co-applicant |
| Credit_Score | Credit score of the applicant |
| DTI_Ratio | Debt-to-Income ratio |
| Savings | Savings amount |
| Loan_Purpose | Purpose of the loan |
| Property_Area | Area type of the property |
| Loan_Approved | Target variable (Yes / No) |

Project Workflow

1. Data Loading

Loaded dataset using pandas
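
As a rough illustration of this step, the sketch below reads a CSV into a DataFrame and checks its shape and missing values. The helper name `load_loans` and the tiny in-memory sample are assumptions made for illustration; the notebook itself may load the file differently.

```python
import io

import pandas as pd


def load_loans(source):
    """Read the loan data from a path or file-like object (hypothetical helper)."""
    return pd.read_csv(source)


# Made-up two-row sample standing in for loan_approval_data.csv
sample_csv = io.StringIO(
    "Applicant_ID,Gender,Credit_Score,Loan_Approved\n"
    "1,Male,720,Yes\n"
    "2,Female,,No\n"
)

df = load_loans(sample_csv)
print(df.shape)           # rows and columns
print(df.isnull().sum())  # missing values per column
```

In the notebook, the same helper would be called as `load_loans("loan_approval_data.csv")`.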

2. Handling Missing Values

Numerical columns filled using mean imputation
Categorical columns filled using most frequent value imputation
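
The two imputation rules above can be sketched with plain pandas as follows. This README does not say which tool the notebook uses, so treat this as one possible approach; the sample values are made up. scikit-learn's `SimpleImputer` with `strategy="mean"` and `strategy="most_frequent"` would be an equivalent alternative.

```python
import numpy as np
import pandas as pd

# Made-up sample with one gap in each column
df = pd.DataFrame({
    "Applicant_Income": [4000.0, np.nan, 6000.0],
    "Credit_Score": [700.0, 650.0, np.nan],
    "Gender": ["Male", np.nan, "Male"],
})

num_cols = df.select_dtypes(include="number").columns
cat_cols = df.select_dtypes(exclude="number").columns

# Numerical columns: replace missing values with the column mean
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

# Categorical columns: replace missing values with the most frequent value
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

print(df)
```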

3. Exploratory Data Analysis (EDA)

Class distribution of loan approval (pie chart)
Gender and education level distribution (bar plots)
Income distribution for applicant and co-applicant (histograms)
Outlier analysis using box plots (Income, Credit Score, DTI Ratio, Savings)
Relationship between Credit Score and Loan Approval
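
A minimal sketch of three of these plots, using matplotlib only. The sample values, figure layout, and bin count are assumptions for illustration; the notebook's own plots (which may use seaborn) are not reproduced here.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample; the real notebook plots the full dataset
df = pd.DataFrame({
    "Applicant_Income": [3000, 4200, 5100, 2500, 15000, 4800],
    "Credit_Score": [610, 700, 745, 580, 690, 720],
    "Loan_Approved": ["No", "Yes", "Yes", "No", "Yes", "Yes"],
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Class distribution of the target as a pie chart
counts = df["Loan_Approved"].value_counts()
axes[0].pie(counts.values, labels=counts.index, autopct="%1.0f%%")
axes[0].set_title("Loan_Approved distribution")

# Histogram of applicant income
axes[1].hist(df["Applicant_Income"], bins=5)
axes[1].set_title("Applicant_Income")

# Box plot of credit score for each approval outcome (groupby sorts the keys)
labels = sorted(df["Loan_Approved"].unique())
groups = [g["Credit_Score"].values for _, g in df.groupby("Loan_Approved")]
axes[2].boxplot(groups)
axes[2].set_xticks([1, 2])
axes[2].set_xticklabels(labels)
axes[2].set_title("Credit_Score by Loan_Approved")

fig.tight_layout()
```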

4. Encoding

LabelEncoder applied to Education_Level and Loan_Approved
OneHotEncoder (drop first) applied to: Marital_Status, Employment_Status, Loan_Purpose, Gender, Employer_Category, Property_Area

5. Correlation Analysis

Heatmap generated to identify relationships between features
Top correlated features with Loan_Approved identified
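
One way to rank features by their correlation with the target is sketched below, assuming all columns are already numeric (that is, after encoding). The sample values are made up, and ranking by absolute Pearson correlation is an assumption about how "top correlated" is defined.

```python
import pandas as pd

# Made-up, already-encoded sample
df = pd.DataFrame({
    "Credit_Score": [580, 620, 690, 720, 760],
    "DTI_Ratio": [0.52, 0.45, 0.30, 0.35, 0.20],
    "Savings": [1000, 8000, 3000, 2000, 9000],
    "Loan_Approved": [0, 0, 1, 1, 1],
})

# Pairwise Pearson correlation matrix (the input for a heatmap)
corr = df.corr()

# Features ranked by strength of correlation with the target, ignoring sign
top = (
    corr["Loan_Approved"]
    .drop("Loan_Approved")
    .abs()
    .sort_values(ascending=False)
)
print(top)
```

The heatmap itself can be drawn from `corr`, for example with `seaborn.heatmap(corr, annot=True)`.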

6. Feature Scaling

StandardScaler applied to training and test sets
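
A common way to apply StandardScaler is to fit it on the training set only and reuse the fitted statistics on the test set, as sketched below. The random data, the 80/20 split, and `random_state=42` are assumptions; the README does not state the notebook's split settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up features on very different scales (income-like and score-like)
rng = np.random.default_rng(0)
X = rng.normal(loc=[5000, 700], scale=[1500, 50], size=(100, 2))
y = rng.integers(0, 2, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

Fitting on the training set alone keeps information about the test set out of the preprocessing step.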

7. Model Training and Evaluation

Three classification models were trained and evaluated:

| Model | Metrics |
| --- | --- |
| Logistic Regression | Accuracy, Precision, Recall, F1, Confusion Matrix |
| K-Nearest Neighbors (K=5) | Accuracy, Precision, Recall, F1, Confusion Matrix |
| Gaussian Naive Bayes | Accuracy, Precision, Recall, F1, Confusion Matrix |
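
The training and evaluation loop for the three models might be sketched as follows. Synthetic data from `make_classification` stands in for the loan dataset, and `max_iter=1000`, the stratified 80/20 split, and `random_state` values are assumptions rather than the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data standing in for the loan dataset
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors (K=5)": KNeighborsClassifier(n_neighbors=5),
    "Gaussian Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, metrics in results.items():
    print(name, round(metrics["accuracy"], 3), round(metrics["f1"], 3))
```

Keeping the models in a dictionary means all three are scored on the same split with the same metrics, so the numbers are directly comparable.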

8. Feature Engineering

Added squared features: DTI_Ratio_sq and Credit_Score_sq
Original DTI_Ratio and Credit_Score columns dropped after engineering
Gaussian Naive Bayes retrained on engineered features
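
The feature engineering step can be sketched as below. The helper name `add_squared_features` and the four-row sample are assumptions made for illustration; the column names come from this README.

```python
import pandas as pd
from sklearn.naive_bayes import GaussianNB


def add_squared_features(X):
    """Add squared DTI and credit score columns, then drop the originals (hypothetical helper)."""
    X = X.copy()
    X["DTI_Ratio_sq"] = X["DTI_Ratio"] ** 2
    X["Credit_Score_sq"] = X["Credit_Score"] ** 2
    return X.drop(columns=["DTI_Ratio", "Credit_Score"])


# Made-up sample; y uses 1 for approved and 0 for not approved
X = pd.DataFrame({
    "DTI_Ratio": [0.20, 0.50, 0.30, 0.60],
    "Credit_Score": [700, 600, 750, 580],
    "Savings": [5000, 1000, 8000, 500],
})
y = [1, 0, 1, 0]

X_eng = add_squared_features(X)

# Retrain Gaussian Naive Bayes on the engineered features
model = GaussianNB().fit(X_eng, y)
preds = model.predict(X_eng)
print(X_eng.columns.tolist())
```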

Tech Stack

Language: Python 3
Notebook: Jupyter Notebook
Libraries:
- pandas — data manipulation
- numpy — numerical operations
- matplotlib & seaborn — data visualization
- scikit-learn — preprocessing, model training, and evaluation

Installation and Setup

Clone the repository

git clone https://github.com/your-username/FinApprove.git
cd FinApprove

Install required libraries

pip install pandas numpy matplotlib seaborn scikit-learn

Place the dataset in the project folder

Make sure loan_approval_data.csv is in the same directory as the notebook.

Run the notebook

jupyter notebook FinApprove.ipynb

Project Structure

FinApprove/
│
├── FinApprove.ipynb          # Main Jupyter Notebook
├── loan_approval_data.csv    # Dataset (add manually)
└── README.md                 # Project documentation

Models Used

Logistic Regression — baseline linear classifier
K-Nearest Neighbors — distance-based classifier
Gaussian Naive Bayes — probabilistic classifier, also tested with engineered features

License

This project is open source and available under the MIT License.
