Data Science Course 3
- The purpose of this project is to demonstrate the ability to collect, work with, and clean a data set. The goal is to prepare tidy data that can be used for later analysis. A code book called CodeBook.md describes the variables, the data, and any transformations or other work performed to clean up the data.
- One of the most exciting areas in all of data science right now is wearable computing. Companies like Fitbit, Nike, and Jawbone Up are racing to develop the most advanced algorithms to attract new users. The data linked from the course website were collected from the accelerometers of the Samsung Galaxy S smartphone.
- A full description is available at the following URL, from which the data were obtained:
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
- The input data set can be downloaded from the following URL:
https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
run_analysis.R: this script takes the input data and creates the output file
- The R script downloads and unzips the dataset from the URL given above.
- The test and training sets are read and merged.
- The mean and standard deviation (std) features are extracted.
- The activity names are merged in so that each observation carries a descriptive activity label.
- Descriptive column labels are built from the feature names so that each column represents a single variable.
- The average of each variable is calculated and written to the tidy dataset tidy_data.txt.
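
The steps above can be sketched roughly as follows. This is a minimal illustration in base R, not the actual contents of run_analysis.R. The toy data frames, the helper names fetch_dataset and make_set, the cleaned column naming, and the choice to average per subject and activity are all assumptions made for the example.

```r
# Hypothetical helper for the download step; it is defined but not called here
# because it needs network access.
fetch_dataset <- function(url, zip_path = "dataset.zip") {
  if (!file.exists(zip_path)) download.file(url, zip_path, mode = "wb")
  unzip(zip_path)
}

# Toy stand-ins for the real feature list and activity labels (made-up values).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X")
activity_labels <- data.frame(id = 1:2, activity = c("WALKING", "SITTING"))

# Hypothetical helper: build one small data set with subject, activity id,
# and one column per feature.
make_set <- function(subject, activity_id, values) {
  measurements <- as.data.frame(matrix(values, nrow = length(subject), byrow = TRUE))
  names(measurements) <- features
  cbind(subject = subject, id = activity_id, measurements)
}

train <- make_set(c(1, 1), c(1, 2), c(0.2, 0.1, 5, 0.4, 0.3, 6))
test  <- make_set(c(1, 2), c(1, 2), c(0.4, 0.3, 7, 0.6, 0.5, 8))

# Merge the training and test sets by stacking their rows.
merged <- rbind(train, test)

# Keep the identifier columns plus the mean() and std() features.
keep <- grepl("mean\\(\\)|std\\(\\)", names(merged))
selected <- merged[, c("subject", "id", names(merged)[keep])]

# Replace the numeric activity id with its descriptive name.
selected$activity <- activity_labels$activity[match(selected$id, activity_labels$id)]
selected$id <- NULL

# Build cleaner column labels (this naming scheme is an assumption).
names(selected) <- gsub("-", "_", gsub("[()]", "", names(selected)))

# Average every variable for each subject and activity.
tidy <- aggregate(. ~ subject + activity, data = selected, FUN = mean)

# Write the result; a temporary directory is used here to avoid clutter.
write.table(tidy, file.path(tempdir(), "tidy_data.txt"), row.names = FALSE)
```

With the real data, the toy data frames would be replaced by tables read from the unzipped archive, and the output path by tidy_data.txt in the working directory.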
CodeBook.md describes the variables, the data, and any transformation or work performed to clean up the original data.