Exploratory data analysis and visualization of the current data science job market using R.
Final project for the EDAV class (Columbia University, Fall 2018).
Authors: Yu Han Huang (yh3093), Deepak Ravishankar (dr2998), Jong Hyuk Lee (jl5261)
As students currently majoring in Data Science at Columbia, we are interested in the swelling demand for data scientists and the evident skills gap that accompanies it. We want to understand the demographics of the current data science job market: who is in the market, what skills the market needs, and what prospects we can expect from it. We evaluate each variable from both the demand side and the supply side: we look not only at the requirements stated in job listings, but also at how people in the industry meet those requirements.
The questions we are interested in are:
- Who’s in the data science job market: Age, Gender, Country, Education, Industry Backgrounds
- What skills are in the data science job market (both demand and supply): General skills, Technical skills
- What prospects can we expect from the market: Prospective salary
Three datasets are used in this study, two representing the supply side and one representing the demand side:
- Stack Overflow 2018 Developer Survey (https://insights.stackoverflow.com/survey)
- Kaggle ML and Data Science Survey, 2018 (https://www.kaggle.com/kaggle/kaggle-survey-2017#multipleChoiceResponses.csv)
- Rachel’s Mail - Columbia University Data Science Career Opportunities
Files in this project:
- Analysis.RMD : R Markdown file of analysis with description
- Insights Of The Current Data Science Job Market.pdf : Final report of findings