PA1_template.md

Peer Assignment 1

setwd("C:/Users/ece.o/OneDrive/Coursera/Johns Hopkins University Data Science/Reproducible Research/RepData_PeerAssessment1")

Loading and preprocessing the data

data <- read.csv("activity.csv")
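The file is read with default settings, so the date column arrives as text. A minimal sketch of one way to parse it as a `Date` while loading, assuming the three columns are `steps`, `date` and `interval`; the temporary file and the name `toy` are made up here so the example runs without `activity.csv`:

```r
# Write a tiny hypothetical file so the sketch is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c('"steps","date","interval"',
             'NA,"2012-10-01",0',
             '7,"2012-10-01",5'), tmp)

# colClasses asks read.csv to convert each column while reading;
# "Date" parses the ISO-formatted strings with as.Date().
toy <- read.csv(tmp, colClasses = c("integer", "Date", "integer"))

str(toy)
```

Parsing the dates up front is optional for the analysis that follows, which also works with the text column.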

What is mean total number of steps taken per day?

  1. Total number of steps taken per day
# Sum steps by date; with na.rm = TRUE, a day with no recorded values totals 0
dailySteps <- tapply(data$steps, data$date, sum, na.rm = TRUE)
  2. Histogram of the total number of steps taken each day
hist(dailySteps)

plot of chunk unnamed-chunk-4

  3. The mean of the total number of steps taken per day is 9354 and the median is 10395.
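The code that produced the mean and median is not shown. A minimal sketch of how such figures could be computed from the daily totals, using a small made-up data frame (`toy`, hypothetical) in place of `activity.csv`:

```r
# Hypothetical toy data with the same three columns as activity.csv.
toy <- data.frame(
  steps    = c(10, 20, NA, NA, 5, 15, 30, 40),
  date     = rep(c("2012-10-01", "2012-10-02", "2012-10-03", "2012-10-04"), each = 2),
  interval = rep(c(0, 5), times = 4)
)

# Total steps per day; the all-NA day (2012-10-02) sums to 0 under na.rm = TRUE.
toyDaily <- tapply(toy$steps, toy$date, sum, na.rm = TRUE)

toyMean   <- mean(toyDaily)
toyMedian <- median(toyDaily)
```

Because an all-missing day counts as 0 here, it pulls the mean down, which is worth keeping in mind when comparing against imputed data.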

What is the average daily activity pattern?

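  1. Time series plot of the 5-minute interval and the average number of steps taken, averaged across all days
# Mean steps for each interval across all days, ignoring missing values
intSteps <- tapply(data$steps, data$interval, mean, na.rm = TRUE)
# names(intSteps) are character labels, so convert them to numbers for the x-axis
plot(as.numeric(names(intSteps)), intSteps, type = "l", xlab = "5-Minute Interval", ylab = "# of Steps")

plot of chunk unnamed-chunk-5

  2. Interval 835 has the highest average number of steps.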
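The peak interval can be read off the named vector of interval means. A minimal sketch, where `toyIntSteps` and its values are invented stand-ins for `intSteps`:

```r
# Hypothetical interval averages, named by interval label like intSteps.
toyIntSteps <- c("0" = 2, "5" = 1, "830" = 150, "835" = 180, "840" = 160)

# which.max() gives the position of the largest value;
# the name at that position is the interval label.
peak <- names(toyIntSteps)[which.max(toyIntSteps)]
peakValue <- max(toyIntSteps)
```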

Imputing missing values

  1. The total number of rows with NAs is 2304.
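  2. Fill missing data with the interval mean
filled <- data
# Replace each missing step count with the mean for that row's interval
for (i in seq_len(nrow(filled))) {
  if (is.na(filled$steps[i])) {
    filled$steps[i] <- intSteps[names(intSteps) == filled$interval[i]]
  }
}
# Daily totals after imputation (no NAs remain, so na.rm is not needed)
dailySteps2 <- tapply(filled$steps, filled$date, sum)
hist(dailySteps2)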

plot of chunk unnamed-chunk-7

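After filling in the missing values, the mean of the total number of steps taken per day is 10766, and the median is also 10766.

The frequencies in the middle of the histogram increased because the missing values are now replaced with interval averages. The mean and median are somewhat higher than before (10766 versus 9354 and 10395). Part of the increase comes from how the first calculation treated missing data: with na.rm = TRUE, any day whose values were all missing had a total of 0, whereas after imputation such a day has a total equal to the sum of the interval means. The increase is not large enough to distort the analysis.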
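The imputation can also be written without a row-by-row loop. A minimal sketch of a vectorized variant, assuming the same column names and using a made-up data frame (`toy`, hypothetical); it also counts the missing rows:

```r
# Hypothetical toy data with the same three columns as activity.csv.
toy <- data.frame(
  steps    = c(10, NA, 30, 40, NA, NA),
  date     = rep(c("2012-10-01", "2012-10-02", "2012-10-03"), each = 2),
  interval = rep(c(0, 5), times = 3)
)

# Number of rows with a missing step count.
nMissing <- sum(is.na(toy$steps))

# Mean steps per interval, ignoring NAs (same idea as intSteps).
toyIntSteps <- tapply(toy$steps, toy$interval, mean, na.rm = TRUE)

# Look up each NA row's interval mean by name and overwrite only those rows.
toyFilled <- toy
missing <- is.na(toyFilled$steps)
toyFilled$steps[missing] <-
  unname(toyIntSteps[as.character(toyFilled$interval[missing])])
```

Indexing the named vector by `as.character(interval)` does the same lookup as the loop, once for all missing rows.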

Are there differences in activity patterns between weekdays and weekends?

  1. Convert to POSIXct first, then apply weekdays().
# weekdays() returns names in the current locale; the comparison below assumes English names
filled$weekday <- as.factor(weekdays(as.POSIXct(filled$date)))
filled$weekDE <- as.factor(ifelse(filled$weekday %in% c("Saturday", "Sunday"), "weekend", "weekday"))
  2. Group the data using data.table.
library(data.table)
filled <- data.table(filled)
# Average only the steps column; mean() of the date and weekday columns is not meaningful
weekMean <- filled[, .(steps = mean(steps)), by = .(interval, weekDE)]

  3. Create a line chart.

library(lattice)
xyplot(steps ~ interval|weekDE,
     data = weekMean,
     type = "l",
     xlab = "Interval",
     ylab = "Number of steps",
     layout=c(1,2))

plot of chunk unnamed-chunk-10
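The weekday/weekend split above depends on `weekdays()` returning English day names. A minimal base R sketch of a locale-independent alternative, using the `wday` field of `POSIXlt` and `aggregate()` in place of data.table; the data frame `toy` is made up for illustration:

```r
# Hypothetical toy data: Friday, Saturday, Sunday, Monday (5-8 October 2012).
toy <- data.frame(
  steps    = c(10, 20, 30, 40, 50, 60, 90, 100),
  date     = rep(c("2012-10-05", "2012-10-06", "2012-10-07", "2012-10-08"), each = 2),
  interval = rep(c(0, 5), times = 4)
)

# POSIXlt$wday runs from 0 (Sunday) to 6 (Saturday), whatever the locale.
wday <- as.POSIXlt(as.Date(toy$date))$wday
toy$weekDE <- factor(ifelse(wday %in% c(0, 6), "weekend", "weekday"))

# Mean steps for each interval within each day type.
toyWeekMean <- aggregate(steps ~ interval + weekDE, data = toy, FUN = mean)
```

The resulting data frame has the same `steps`, `interval` and `weekDE` columns that the `xyplot()` call expects.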