Loading and preprocessing the data
1. Load the data (i.e. read.csv())
unzip("./activity.zip")
activityData <- read.csv("./activity.csv")
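Before going further, it can help to confirm how the columns were parsed. The sketch below uses a few made-up rows in place of activity.csv (the values are illustrative, not taken from the dataset) and assumes the file has the three columns used throughout this report: steps, date, and interval.

```r
# Toy stand-in for activity.csv; these rows are invented for illustration.
csvText <- 'steps,date,interval
NA,2012-10-01,0
NA,2012-10-01,5
12,2012-10-02,0
30,2012-10-02,5'

toyData <- read.csv(text = csvText, stringsAsFactors = FALSE)

str(toyData)               # column types as parsed by read.csv
colSums(is.na(toyData))    # number of NAs in each column
```

The same two calls can be run on activityData to check that steps and interval are numeric and that the missing values are confined to steps.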
What is the mean total number of steps taken per day?
1. Calculate the total number of steps taken per day
stepsPerDay <- aggregate(steps ~ date, data = activityData, sum, na.rm=TRUE)
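One detail worth knowing: with the formula interface, aggregate() drops rows containing NA before the summary function is called (its default na.action is na.omit), so a day with no recorded steps is absent from the result rather than reported as 0. A minimal sketch on made-up data compares this with tapply(), which keeps every date:

```r
# Invented toy data: the first day has no recordings at all, the second has two.
toyData <- data.frame(
  steps    = c(NA, NA, 12, 30),
  date     = c("2012-10-01", "2012-10-01", "2012-10-02", "2012-10-02"),
  interval = c(0, 5, 0, 5)
)

# Formula interface: NA rows are removed first, so 2012-10-01 disappears.
viaAggregate <- aggregate(steps ~ date, data = toyData, sum)

# tapply keeps every date; with na.rm = TRUE an all-NA day sums to 0.
viaTapply <- tapply(toyData$steps, toyData$date, sum, na.rm = TRUE)

print(viaAggregate)
print(viaTapply)
```

Which behaviour is preferable depends on whether an unrecorded day should count as zero steps. The code in this report uses the first form, so unrecorded days do not contribute to the daily totals.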
2. If you do not understand the difference between a histogram and a barplot, research the difference between them. Make a histogram of the total number of steps taken each day
hist(stepsPerDay$steps)
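The question in the heading can then be answered by summarising the daily totals with mean() and median(). A sketch, using invented totals in place of stepsPerDay (no figures from the real dataset are implied):

```r
# Invented daily totals standing in for stepsPerDay.
toyStepsPerDay <- data.frame(
  date  = c("2012-10-02", "2012-10-03", "2012-10-04"),
  steps = c(100, 400, 250)
)

meanStepsPerDay   <- mean(toyStepsPerDay$steps)
medianStepsPerDay <- median(toyStepsPerDay$steps)

meanStepsPerDay
medianStepsPerDay
```

Applied to the real data, the same calls would take stepsPerDay$steps as their argument.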

What is the average daily activity pattern?
1. Make a time series plot (i.e. type = "l") of the 5-minute interval (x-axis) and the average number of steps taken, averaged across all days (y-axis)
stepsPerInterval <- aggregate(steps ~ interval, data = activityData, mean, na.rm=TRUE)
plot(steps ~ interval, stepsPerInterval, type="l")

2. Which 5-minute interval, on average across all the days in the dataset, contains the maximum number of steps?
intervalwithMaxNbSteps <- stepsPerInterval[which.max(stepsPerInterval$steps),]$interval
intervalwithMaxNbSteps
## [1] 835
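To read this result as a time of day: the interval identifiers in this dataset appear to encode the clock time as hours and minutes, so 835 would correspond to 08:35. Treat that as an assumption about the data's encoding rather than something stated in this report. Under that assumption, a small hypothetical helper, intervalToClock, converts identifiers to HH:MM labels:

```r
# Hypothetical helper (not part of the original analysis).
# Assumes interval identifiers are HHMM integers such as 0, 5, 100, 835, 2355.
intervalToClock <- function(interval) {
  padded <- sprintf("%04d", as.integer(interval))   # e.g. 835 -> "0835"
  paste0(substr(padded, 1, 2), ":", substr(padded, 3, 4))
}

intervalToClock(c(0, 835, 2355))
```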
Imputing missing values
1. Calculate and report the total number of missing values in the dataset (i.e. the total number of rows with NAs)
totalValuesMissing <- sum(is.na(activityData$steps))
totalValuesMissing
## [1] 2304
3. Create a new dataset that is equal to the original dataset but with the missing data filled in.
activityDataNoNA <- activityData
# Replace each missing steps value with the value returned for its interval
for (i in seq_len(nrow(activityDataNoNA))) {
  if (is.na(activityDataNoNA$steps[i])) {
    activityDataNoNA$steps[i] <- getMeanStepsPerInterval(activityDataNoNA$interval[i])
  }
}
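The helper getMeanStepsPerInterval is called above, but its definition is not shown in this report. A plausible version, offered only as an assumption, looks up the interval's mean in stepsPerInterval. The sketch below uses invented data so that it runs on its own, and also shows a vectorised alternative to the row-by-row loop.

```r
# Invented toy data; the real analysis uses activityData.
toyData <- data.frame(
  steps    = c(NA, 10, 20, NA, 40, 30),
  date     = rep(c("2012-10-01", "2012-10-02", "2012-10-03"), each = 2),
  interval = rep(c(0, 5), times = 3)
)

# Mean steps per interval (NA rows are dropped by the formula interface).
stepsPerInterval <- aggregate(steps ~ interval, data = toyData, mean)

# Assumed definition: mean steps for one interval, across all days.
getMeanStepsPerInterval <- function(interval) {
  stepsPerInterval$steps[stepsPerInterval$interval == interval]
}

# Vectorised imputation: look up every missing row's interval mean at once.
toyDataNoNA <- toyData
missing <- is.na(toyDataNoNA$steps)
toyDataNoNA$steps[missing] <- stepsPerInterval$steps[
  match(toyDataNoNA$interval[missing], stepsPerInterval$interval)
]

toyDataNoNA$steps
```

The match() lookup avoids indexing the data frame once per row, which matters for a dataset with thousands of rows; the result should be the same as the loop's if the helper is defined as assumed here.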
Are there differences in activity patterns between weekdays and weekends?
1. Create a new factor variable in the dataset with two levels, "weekday" and "weekend", indicating whether a given date is a weekday or weekend day.
activityDataNoNA$date <- as.Date(strptime(activityDataNoNA$date, format = "%Y-%m-%d"))
activityDataNoNA$day <- weekdays(activityDataNoNA$date)
# Relabel each day name as "weekend" or "weekday" (assumes English day names)
for (i in seq_len(nrow(activityDataNoNA))) {
  if (activityDataNoNA$day[i] %in% c("Saturday", "Sunday")) {
    activityDataNoNA$day[i] <- "weekend"
  } else {
    activityDataNoNA$day[i] <- "weekday"
  }
}
stepsByDay <- aggregate(activityDataNoNA$steps ~ activityDataNoNA$interval + activityDataNoNA$day, data = activityDataNoNA, mean)
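Two caveats about the step above: weekdays() returns day names in the current locale, so the comparison with "Saturday" and "Sunday" only works in an English locale, and the resulting day column is a character vector rather than a factor. A locale-independent sketch, on invented dates, uses the wday field of as.POSIXlt (0 is Sunday, 6 is Saturday):

```r
# Invented dates: 2012-10-05 is a Friday, 06 a Saturday, 07 a Sunday, 08 a Monday.
toyDates <- as.Date(c("2012-10-05", "2012-10-06", "2012-10-07", "2012-10-08"))

# wday is numeric (0 = Sunday, ..., 6 = Saturday), so no locale is involved.
isWeekend <- as.POSIXlt(toyDates)$wday %in% c(0, 6)

# Build a genuine two-level factor, as the task asks for.
dayType <- factor(ifelse(isWeekend, "weekend", "weekday"),
                  levels = c("weekday", "weekend"))
dayType
```

In the analysis itself, the same expression could be applied to activityDataNoNA$date and assigned to activityDataNoNA$day in place of the loop.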
2. Make a panel plot containing a time series plot (i.e. type = "l") of the 5-minute interval (x-axis) and the average number of steps taken, averaged across all weekday days or weekend days (y-axis). See the README file in the GitHub repository to see an example of what this plot should look like using simulated data.
names(stepsByDay) <- c("interval", "day", "steps")
library(lattice)
xyplot(steps ~ interval | day, data = stepsByDay, type = "l", layout = c(1, 2), xlab = "Interval", ylab = "Average Number of Steps")
