Skip to content

gracemshea/ga_da_finalproject

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Best Places for Outdoor Enthusiasts

Table of Contents

  1. Introduction
  2. Excel
  3. SQL
  4. Tableau

alt text

“Keep close to Nature’s heart… and break clear away, once in a while, and climb a mountain or spend a week in the woods. Wash your spirit clean.” –John Muir

Introduction

This project, which performs a ranking of the best American cities for outdoors enthusiasts, showcases my SQL skills using a combination of data sources. These files consist of Zillow's Home Value Index, U.S. city population estimates, NOAA's registered recreational fishing catches for 2017, as well as the locations of city bike lanes, campgrounds, climbing routes, and ski resorts across the United States.

From data collection to the final product, I utilized Excel to explore, clean and format the raw data, SQL to load, organize and transform it within Postgres, and Tableau to analyze the data and enable user input for an individualized final ranking of locations.

alt text

Excel

Data Exploration Process

When making sense of potential data sources, Pivot Tables proved enormously useful in understanding the data quickly.

Pivot Tables

  • Example 1: When exploring the directory of ski resorts, which I'd narrowed down to the US only, I was interested in how altitude filtered the states apart. By creating a quick pivot table, I was able to select the altitude values of 3000+ and see which states had resorts (and how many) within that range of altitudes. See pivot table here.

  • Example 2: The Zillow Home Value Index (ZHVI) is an important piece of data in understanding home values across a state, or county. I wanted to see, based on Zillow's data from 2018, which state's average ZHVI values were highest. See pivot table here.

Data Cleaning Process

When cleaning the datasets, I eliminated columns of data unnecessary to my project. For example, in the fish catch dataset, there were many columns that were not important to my question; Where were people catching fish recreationally in the U.S. and what were the catch totals? However, not all data cleaning comes from elimination, but rather adding columns to make the data easier to understand.

VLOOKUP

  • Example 1: The fishing data referred to the US states by numbers, using the FIPS (Federal Information Processing Standard) state codes. I added a second tab with these codes and used a VLOOKUP function to fill in the state names into the dataset. See here.

  • Example 2: The camping data referred to the campsite type with a code. I found these codes on the main campground info website, and created a second tab in Excel with the codes and corresponding meanings. By using VLOOKUP, I integrated the campground types into the dataset. See here.

Index & Match

  • Example 1: This Index & Match allows for the exploration of population, by typing in the city and state. See here.

alt text

Lesson Learned: During the data cleaning process, I discovered that the fishing data source was not representative of the entire country, and have so removed it from the data to be used in the visualization.

SQL

Data Loading

There are 3 methods I explored for loading data into pgAdmin & Postgres:

  1. Copy File Path
  2. Temp Table
  3. CSV to SQL Converter

Ultimately, I found that having the information embedded in the code, using the third method, resolved my permission issues. You can find all the SQL data-loading files here.

SQL Commands & Operators

A wide variety of SQL statements were utilized throughout this project. Click each command or operator to navigate to an example of the most used commands.

  • COUNT() provides a count of rows where the value is not NULL.
  • CREATE TABLE establishes a new table within a schema.
  • FULL JOIN returns all records when there is a match in either left or right tables.
  • GROUP BY arranges returned data within aggregate functions into groups.
  • INSERT adds new rows to a table.
  • ORDER BY used to sort the results, paired with a column for an ascending (ASC) or descending (DESC) order.
  • OUTER JOIN returns unmatched rows in one or both tables.

alt text

Tableau

Please visit Tableau Public here to view my data visualization.

The key focus of my project is center the data visualization around user engagement. The maps and charts respond, due to the use of calculated fields, to the user's choice in parameter dropdown and a filter ranges.

Users can narrow the following aspects of the data to narrow down and rank the states:

  • Average Temperature
  • Average Precipitation
  • Priority Level of Recreational Activities from "Very Important", "Somewhat Important", and "Not Important" of the following:
    • Biking
    • Camping
    • Climbing
    • Skiing

Users can narrow the following aspects of the data to narrow down and explore the cities:

  • Median Home Value
  • Population

This concludes my project. I appreciate any suggestions to further improve.

“Afoot and lighthearted I take to the open road, healthy, free, the world before me.” –Walt Whitman

alt text

About

Grace Shea's Final Project Repo for DA class @ Galvanize

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published