Hayes Chow
Oct 18, 2021
The purpose of this presentation is to introduce you to the Step Interval Analysis application.
The application will allow you to explore the activity data, collected through a personal monitoring device, in an interactive format.
The application provides several plots, input controls for adjusting their content and formatting, and tables that summarize and list the underlying data.
library(ggplot2)
library(plotly)
library(curl)
Download the data from the Reproducible Research course site: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2Factivity.zip
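# download the zipped data set and extract the CSV file
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2Factivity.zip"
zip <- curl_download(url, "repdata_data_activity.zip")
csv <- unzip(zip, "activity.csv")
# read the activity data (columns used below: steps, date, interval)
df <- read.csv(csv, header = TRUE)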
# calculate mean steps per interval (the formula interface drops rows with NA steps)
mean_interval_steps <- aggregate(steps ~ interval, df, mean)
# merge mean interval steps with data frame to create new data frame
df2 <- merge(df, mean_interval_steps, by = "interval", all.x = TRUE)
# replace NAs with mean interval steps value
df2$steps <- ifelse(is.na(df2$steps.x), df2$steps.y, df2$steps.x)
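The imputation step above can be checked on a small example. The sketch below repeats the same aggregate, merge, and ifelse sequence on a made-up data frame (`toy` and its values are illustrative assumptions, not taken from the activity data) so the effect on a single missing reading is easy to follow.

```r
# Toy data (made-up values): two days, two intervals, one missing reading
toy <- data.frame(
  steps    = c(NA, 10, 4, 20),
  date     = c("2012-10-01", "2012-10-01", "2012-10-02", "2012-10-02"),
  interval = c(0, 5, 0, 5)
)

# Mean steps per interval; the formula interface drops NA rows by default
toy_means <- aggregate(steps ~ interval, toy, mean)

# Join the means back; the clashing "steps" columns become steps.x and steps.y
toy2 <- merge(toy, toy_means, by = "interval", all.x = TRUE)

# Keep the observed value where present, otherwise use the interval mean
toy2$steps <- ifelse(is.na(toy2$steps.x), toy2$steps.y, toy2$steps.x)

# Show the result in date/interval order (merge sorts by interval)
print(toy2[order(toy2$date, toy2$interval), c("date", "interval", "steps")])
```

The missing reading for interval 0 on 2012-10-01 is filled with 4, the mean of the only observed value for that interval. Note that `merge` returns rows sorted by the key, so the row order of the result differs from the input.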
# create new column for days of the week
# (weekdays() returns locale-dependent names; the matching below assumes an English locale)
df2$days <- weekdays(as.Date(df2$date))
# group days into weekend/weekday
df2$days <- replace(df2$days, df2$days %in% c("Monday", "Tuesday", "Wednesday",
                                              "Thursday", "Friday"), "weekday")
df2$days <- replace(df2$days, df2$days %in% c("Saturday", "Sunday"), "weekend")
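Because `weekdays()` returns day names in the session's locale, matching on English names can silently fail elsewhere. A possible locale-independent alternative is sketched below; `day_type` is a hypothetical helper introduced here for illustration, not part of the application's code. It relies on the `wday` field of `POSIXlt`, which numbers days from 0 (Sunday) to 6 (Saturday).

```r
# Hypothetical helper (an assumption for illustration, not the original code):
# classify dates as "weekday" or "weekend" without relying on day names
day_type <- function(dates) {
  wday <- as.POSIXlt(as.Date(dates))$wday  # 0 = Sunday, ..., 6 = Saturday
  ifelse(wday %in% c(0, 6), "weekend", "weekday")
}

# 2012-10-01 was a Monday; 2012-10-06 and 2012-10-07 were Saturday and Sunday
day_type(c("2012-10-01", "2012-10-06", "2012-10-07"))
# returns "weekday" "weekend" "weekend"
```

With this helper, the grouping could be written as `df2$days <- day_type(df2$date)`.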
# calculate mean steps for each interval and day type (weekday/weekend)
mean_df2a <- aggregate(steps ~ interval + days, df2, mean)
# create a factor variable so that weekend is ordered before weekday in the plot
mean_df2a$days_f <- factor(mean_df2a$days, levels = c("weekend", "weekday"))
# histogram of total steps per day (after replacing NAs)
total_df2a <- aggregate(steps ~ date, df2, sum)
hist(total_df2a$steps, main = "Total number of steps taken each day",
     xlab = "Total steps per day")
# line plot of mean steps per interval, coloured by weekday/weekend
ggplot(data = mean_df2a, aes(x = interval, y = steps, colour = days_f)) +
  geom_line() +
  ggtitle("Average Daily Steps") +
  theme(plot.title = element_text(hjust = 0.5))
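The code shown here loads plotly but never calls it. One plausible way to make the line plot interactive, offered as an assumption rather than the application's confirmed implementation, is to pass the ggplot object to `ggplotly()`. The sketch uses a small made-up data frame (`toy_means`) in place of `mean_df2a` so it runs on its own.

```r
library(ggplot2)
library(plotly)

# Made-up values standing in for mean_df2a (illustrative only)
toy_means <- data.frame(
  interval = rep(c(0, 5, 10), times = 2),
  steps    = c(2, 5, 3, 1, 4, 6),
  days_f   = factor(rep(c("weekend", "weekday"), each = 3),
                    levels = c("weekend", "weekday"))
)

# Same plot construction as above, stored in a variable instead of printed
p <- ggplot(toy_means, aes(x = interval, y = steps, colour = days_f)) +
  geom_line() +
  ggtitle("Average Daily Steps") +
  theme(plot.title = element_text(hjust = 0.5))

# ggplotly() converts a ggplot object into an interactive plotly widget
p_interactive <- ggplotly(p)
p_interactive
```

The resulting widget supports hovering over points, zooming, and toggling series from the legend.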
Thanks for your time and consideration of the Step Interval Analysis application!