eCornell-census-workclass-model

A machine learning pipeline to predict a person’s employment category (workclass) using U.S. Census data.

The project predicts an individual's workclass (employment category) from the 1994 U.S. Census Income dataset. It applies a complete machine learning workflow:

- Data cleaning and preprocessing (handling missing values, outliers, and encoding categorical variables)
- Exploratory data analysis (EDA) to understand feature relationships and distributions
- Feature engineering and winsorization
- Model selection and training using Decision Trees, Random Forest, and Gradient Boosted Decision Trees
- Hyperparameter tuning via GridSearchCV
- Evaluation with accuracy, confusion matrices, and feature importance
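Winsorization, one of the preprocessing steps listed above, caps extreme values at chosen percentiles instead of dropping the rows. The sketch below is a minimal illustration using only the Python standard library. The function name `winsorize`, the 1st/99th percentile cutoffs, and the sample values are assumptions made for this example, not the notebook's actual code or settings.

```python
from statistics import quantiles


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the range between two percentiles.

    Hypothetical helper for illustration; the cutoffs are assumed defaults.
    """
    if not 0 < lower_pct < upper_pct < 100:
        raise ValueError("percentiles must satisfy 0 < lower < upper < 100")

    # quantiles(..., n=100) returns the 99 percentile cut points,
    # so index p - 1 holds the p-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]

    # Values below `low` are raised to it; values above `high` are lowered to it.
    return [min(max(v, low), high) for v in values]


# Made-up numeric column with one extreme outlier (not real census data).
sample = list(range(1, 101)) + [10_000]
capped = winsorize(sample)
print(min(capped), max(capped))
```

Capping rather than removing outliers keeps every row available for training, which matters when some employment categories have few examples.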

The final model achieves approximately 80% accuracy and identifies the features most predictive of employment type. These results could help government agencies or labor economists better understand workforce patterns.
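The tuning and evaluation steps can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the notebook's code: it uses synthetic data from `make_classification` as a stand-in for the encoded census features, tunes only a Random Forest, and the parameter grid values are arbitrary examples. The accuracy it prints says nothing about the roughly 80% figure reported for the real model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the preprocessed census features (assumption).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid; the notebook's actual search space is not shown here.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the best configuration on the held-out test split.
best_model = search.best_estimator_
y_pred = best_model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

# Rank features by impurity-based importance, highest first.
importances = best_model.feature_importances_
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)

print("best params:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))
print("confusion matrix:\n", cm)
print("feature indices by importance:", [index for index, _ in ranked])
```

The same pattern applies to a Decision Tree or Gradient Boosted model by swapping the estimator and its grid; comparing `search.best_score_` across estimators is one way to choose among them.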

Repository files:

- .gitignore
- README.md
- censusData.csv
- workclass-model.ipynb

Repository: ariansbahram/eCornell-census-workclass-model
