Citrics

You can find the project here. The API docs for the DS portion can be found here.

High level overview presentation.

Deep dive into cleaning the data.

Datasets used

Contributors:

Scott Maxwell	Matthew Sessions	Luke Townsend	Jimmy 'Zeb' Smith

Steven Reiss	Stephanie Miller	Amy NLe	Robert Tom

Project Overview

Citrics provides statistics on 28,925 locations in the United States. It was built by a team of web developers and data engineers. The statistics cover housing prices, employment, industry, lifestyle, and much more; sources are listed below.

Links to team documents:

Trello Board

Product Canvas

Live Front End/URL

Tech Stack

Python
Flask
Docker
Jupyter Notebooks
MongoDB
AWS Elastic Beanstalk/Amplify/S3/Route 53
AWS PostgreSQL

Predictions

The following models use a K-Nearest Neighbors algorithm with a KD-tree from the scikit-learn Python library.

Clicking on the link for each model will take you to the .py file for that model within this repo.
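To make the idea concrete, here is a minimal sketch of a KD-tree nearest-neighbor lookup with scikit-learn. It is not the code from the model files in this repo: the city names, feature values, and the `similar_cities` helper are made up for illustration, and scaling the features with `StandardScaler` is an assumption rather than a confirmed step in the project's pipeline.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Hypothetical cities and housing features (made-up values, illustration only).
cities = ["City A", "City B", "City C", "City D", "City E"]
# Columns: median rent (USD), vacancy rate (%), occupants per room
features = np.array([
    [1200.0, 5.0, 0.50],
    [1250.0, 5.5, 0.52],
    [2400.0, 3.0, 0.70],
    [800.0, 9.0, 0.45],
    [2350.0, 3.2, 0.68],
])

# Assumption: standardize columns so rent does not dominate the distance.
scaled = StandardScaler().fit_transform(features)

# K-Nearest Neighbors backed by a KD-tree.
model = NearestNeighbors(n_neighbors=3, algorithm="kd_tree")
model.fit(scaled)


def similar_cities(city, k=2):
    """Return the k cities closest to `city` in scaled feature space."""
    idx = cities.index(city)
    # Ask for k + 1 neighbors because the city is its own nearest neighbor.
    _, neighbors = model.kneighbors(scaled[idx:idx + 1], n_neighbors=k + 1)
    return [cities[i] for i in neighbors[0] if i != idx][:k]


print(similar_cities("City A", k=1))
```

Each model would use its own feature columns (housing, industry, or culture), but the lookup pattern stays the same.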

Housing Model:

Features & Metrics Used:

Median Rent
Occupants per room
Housing by bedrooms
Vacancy Rate
Rent Pricing
Historical Property Value
Historical Property Value Growth by %

Industry Model:

Features & Metrics Used:

Industry Types
Health insurance
Salary
Commute & travel time
Retirement
Unemployment

Culture Model:

Features & Metrics used:

Education
Language
Ethnicity
Birth Rate
Population

Reverse User:

A recommendation questionnaire that supplies the user with a recommended city.

Features & Metrics used:

Population
Income
Monthly Housing
Temperature Preference
Industry*

*Note that Industry is optional on the website, due to lack of adequate data for all cities within the database.
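One way to handle the optional Industry answer is sketched below. This is an illustration under stated assumptions, not the project's implementation: the `CITIES` records, the `recommend_city` function, and the rule that a city without industry data is simply not penalized are all hypothetical. For brevity it uses a plain linear scan over the cities instead of the KD-tree lookup.

```python
import math

# Hypothetical city records (made-up values, illustration only).
CITIES = {
    "City A": {"population": 50_000, "income": 45_000,
               "monthly_housing": 900, "temperature": 55,
               "industry": "Manufacturing"},
    "City B": {"population": 800_000, "income": 70_000,
               "monthly_housing": 2000, "temperature": 75,
               "industry": "Technology"},
    "City C": {"population": 200_000, "income": 55_000,
               "monthly_housing": 1300, "temperature": 65,
               "industry": None},  # no industry data for this city
}
NUMERIC = ["population", "income", "monthly_housing", "temperature"]


def _feature_ranges():
    """Min and max of each numeric feature, used for min-max scaling."""
    ranges = {}
    for feature in NUMERIC:
        values = [city[feature] for city in CITIES.values()]
        ranges[feature] = (min(values), max(values))
    return ranges


def recommend_city(answers, industry=None):
    """Return the city closest to the questionnaire answers.

    `industry` is optional. It only adds to the distance when the user
    supplied it AND the city has industry data (an assumed rule).
    """
    ranges = _feature_ranges()
    best_name, best_distance = None, math.inf
    for name, city in CITIES.items():
        squared = 0.0
        for feature in NUMERIC:
            low, high = ranges[feature]
            span = (high - low) or 1.0  # avoid division by zero
            squared += ((answers[feature] - city[feature]) / span) ** 2
        if industry is not None and city["industry"] is not None:
            squared += 0.0 if city["industry"] == industry else 1.0
        distance = math.sqrt(squared)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


answers = {"population": 60_000, "income": 48_000,
           "monthly_housing": 1000, "temperature": 58}
print(recommend_city(answers))
print(recommend_city(answers, industry="Technology"))
```

Skipping the industry term for cities with no data keeps them eligible for recommendation, which matches the intent of making the field optional.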

Time series-housing:

Features & Metrics used:

Education
Language
Ethnicity
Birth Rate
Population

Time series-Industry:

Features & Metrics used:

Education
Language
Ethnicity
Birth Rate
Population

Note for Library Conflicts:

AWS Elastic Beanstalk has a hard time running NumPy and SciPy, the libraries that power scikit-learn. The joblib library also had trouble loading models that were trained on a different operating system. Once we found models that worked, we exported the code to a Python script and ran it on a Linux-based machine running Python 3.6. We then used Docker/Docker Hub to containerize and ship our Flask app, and connected it to AWS to test in Elastic Beanstalk. These steps allowed us to seamlessly deploy predictive models.
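The cross-platform joblib problem described above can be made easier to spot by saving a small environment tag next to the model and checking it at load time. The sketch below is an assumption about how one might do this, not the script used in this repo: `environment_tag`, `train_and_save`, `load_checked`, and the placeholder training data are all hypothetical names and values.

```python
import platform
import sys
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.neighbors import NearestNeighbors


def environment_tag():
    """Describe the OS and Python version the current process runs on."""
    return {
        "system": platform.system(),
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


def train_and_save(path):
    """Fit a small KD-tree model and save it together with its environment."""
    data = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])  # placeholder data
    model = NearestNeighbors(n_neighbors=1, algorithm="kd_tree").fit(data)
    joblib.dump({"model": model, "environment": environment_tag()}, str(path))


def load_checked(path):
    """Load the model, refusing files written on a different OS or Python."""
    bundle = joblib.load(str(path))
    if bundle["environment"] != environment_tag():
        raise RuntimeError(
            f"Model was trained on {bundle['environment']}, "
            f"but this process runs on {environment_tag()}; retrain here."
        )
    return bundle["model"]


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.joblib"
    train_and_save(model_path)
    loaded = load_checked(model_path)
    _, nearest = loaded.kneighbors(np.array([[0.9, 0.9]]))
    print(nearest[0][0])
```

Running the training script inside the same Docker image that serves the Flask app would make the tag match by construction.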

Data Sources

Python Notebooks

Fixing City Names:

Different types of data and sources

Housing Data

Models (for suggesting similar cities):

Models (Time series):

Model (Questionnaire):

How to connect to the data API
