Load the libraries.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.3
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.2.3
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(Hmisc)
## Warning: package 'Hmisc' was built under R version 3.2.3
## Loading required package: grid
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Warning: package 'Formula' was built under R version 3.2.3
## 
## Attaching package: 'Hmisc'
## 
## The following objects are masked from 'package:dplyr':
## 
##     combine, src, summarize
## 
## The following objects are masked from 'package:base':
## 
##     format.pval, round.POSIXt, trunc.POSIXt, units

Load the Activity data.

activity <- read.csv("activity.csv")
head(activity)
##   steps       date interval
## 1    NA 2012-10-01        0
## 2    NA 2012-10-01        5
## 3    NA 2012-10-01       10
## 4    NA 2012-10-01       15
## 5    NA 2012-10-01       20
## 6    NA 2012-10-01       25

What is the mean total number of steps taken per day?

1- Calculate the total number of steps taken per day.
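The code for this step is not shown. A minimal sketch of one way to compute the totals, assuming the same `aggregate` call with `sum` and `na.rm = TRUE` that the imputed version uses later in this document. The helper name `total_steps_by_day` and the small `demo` data frame are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not in the original): total steps per day.
# With na.rm = TRUE, a day that contains only NA values sums to 0.
total_steps_by_day <- function(df) {
  out <- aggregate(df$steps, by = list(df$date), FUN = sum, na.rm = TRUE)
  names(out) <- c("date", "steps")
  out
}

# Made-up data with the same columns as activity.csv
demo <- data.frame(
  steps    = c(NA, NA, 10, 20, 5, NA),
  date     = c("2012-10-01", "2012-10-01", "2012-10-02",
               "2012-10-02", "2012-10-03", "2012-10-03"),
  interval = c(0, 5, 0, 5, 0, 5)
)
demo_totals <- total_steps_by_day(demo)

# With the real data, the equivalent call would be:
# stepsByDay <- total_steps_by_day(activity)
```

Because days with only missing values count as 0 under this approach, they pull the mean of the daily totals down.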

2- Make a histogram of the total number of steps taken each day.

ggplot(stepsByDay, aes(x = steps)) + geom_histogram(color = 'red', fill = 'green') +
  scale_x_continuous("Total steps per day", limits = c(0, max(stepsByDay$steps)))
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
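The `stat_bin` message above appears because no bin width was given. A minimal sketch of setting it explicitly follows. The daily totals in `demo_totals` and the 2500-step width are made-up values for illustration, not results from this analysis.

```r
library(ggplot2)

# Made-up daily totals, for illustration only
demo_totals <- data.frame(
  steps = c(0, 126, 8500, 10395, 11352, 12116, 13294, 15420, 21194)
)

# binwidth is in the units of the x variable (steps per day)
p <- ggplot(demo_totals, aes(x = steps)) +
  geom_histogram(binwidth = 2500, color = "red", fill = "green") +
  xlab("Total steps per day")
```

Printing `p` draws the histogram. A width that is a round number of steps tends to make the bars easier to read than the default of 30 bins.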

3- Calculate and report the mean and median of the total number of steps taken per day.

mean(stepsByDay$steps,na.rm = TRUE)
## [1] 9354.23
median(stepsByDay$steps,na.rm = TRUE)
## [1] 10395

What is the average daily activity pattern?

average_step_interval <- aggregate(activity$steps, by = list(activity$interval), FUN = mean, na.rm = TRUE)
names(average_step_interval) <- c("interval", "meanSteps")

1- Make a time series plot of the mean number of steps in each 5-minute interval.

ggplot(average_step_interval, aes(x = interval, y = meanSteps)) + geom_line(color = 'red') +
  xlab("5-minute interval") + ylab("Mean steps per interval")

2- Which 5-minute interval, on average across all the days in the data set, contains the maximum number of steps?

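which.max(average_step_interval$meanSteps)
## [1] 104

which.max returns the row position of the maximum (row 104), not the interval identifier. The identifier is found by indexing the interval column with that position:

average_step_interval$interval[which.max(average_step_interval$meanSteps)]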

Imputing missing values

1- Calculate and report the total number of missing values in the dataset (i.e. the total number of rows with NAs).

sum(is.na(activity$steps))
## [1] 2304
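The count above looks only at the steps column. If other columns could also contain missing values, the number of rows with at least one NA can differ. A small sketch on a made-up data frame (`demo` is hypothetical) shows the two counts side by side:

```r
# Made-up data: row 4 has a missing date but a valid step count
demo <- data.frame(
  steps    = c(NA, 3, NA, 7),
  date     = c("2012-10-01", "2012-10-01", "2012-10-02", NA),
  interval = c(0, 5, 0, 5)
)

na_in_steps  <- sum(is.na(demo$steps))       # NAs in one column: 2
rows_with_na <- sum(!complete.cases(demo))   # rows with any NA: 3
```

When only the steps column has gaps, the two counts are the same.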

2- Devise a strategy for filling in all of the missing values in the dataset. The strategy does not need to be sophisticated. For example, you could use the mean/median for that day, or the mean for that 5-minute interval, etc.

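3- Create a new dataset that is equal to the original dataset but with the missing data filled in. Here every missing value is replaced with the overall mean of the steps column, using impute from Hmisc.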

activityDataImputed <- activity
activityDataImputed$steps <- impute(activity$steps, fun=mean)
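The call above fills every gap with a single overall mean. As an alternative, the per-interval strategy mentioned in step 2 can be sketched in base R with `ave`. The `demo` data frame and the `steps_filled` column are hypothetical and used only for illustration. This is not the method used in this analysis.

```r
# Made-up data: two intervals, three observations each
demo <- data.frame(
  steps    = c(NA, 4, 10, NA, 20, 6),
  interval = c(0, 5, 0, 5, 0, 5)
)

# Mean of each interval (ignoring NAs), repeated for every row of that interval
interval_mean <- ave(demo$steps, demo$interval,
                     FUN = function(x) mean(x, na.rm = TRUE))

# Keep observed values; use the interval mean only where steps is missing
demo$steps_filled <- ifelse(is.na(demo$steps), interval_mean, demo$steps)
```

Filling by interval keeps the daily shape of the activity pattern, whereas a single overall mean flattens it.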

4- Make a histogram of the total number of steps taken each day.

stepsByDay <- aggregate(activityDataImputed$steps, by = list(activityDataImputed$date), FUN = sum, na.rm = TRUE)
names(stepsByDay) <- c("date", "steps")
ggplot(stepsByDay, aes(x = steps)) + geom_histogram(color = 'red', fill = 'blue') +
  scale_x_continuous("Total steps per day", limits = c(0, max(stepsByDay$steps)))
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

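Calculate and report the mean and median total number of steps taken per day.

mean(activityDataImputed$steps)
## [1] 37.3826
median(activityDataImputed$steps)
## [1] 0

These two values describe steps per 5-minute interval, not per day. The per-day figures come from the aggregated stepsByDay data frame built in the previous step:

mean(stepsByDay$steps)
median(stepsByDay$steps)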

Are there differences in activity patterns between weekdays and weekends?

1- Create a new factor variable in the dataset with two levels - “weekday” and “weekend” indicating whether a given date is a weekday or weekend day.

activityDataImputed$dateType <- factor(ifelse(as.POSIXlt(activityDataImputed$date)$wday %in% c(0, 6), 'weekend', 'weekday'), levels = c('weekday', 'weekend'))
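The `wday` field of a `POSIXlt` value numbers the days from 0 (Sunday) to 6 (Saturday), so testing for 0 and 6 picks out the weekend without depending on locale-specific day names. A small sketch on four consecutive dates (the `dates` vector is chosen only for illustration):

```r
# Friday, Saturday, Sunday, Monday
dates <- c("2012-10-05", "2012-10-06", "2012-10-07", "2012-10-08")

wday <- as.POSIXlt(dates)$wday   # 5, 6, 0, 1

day_type <- factor(
  ifelse(wday %in% c(0, 6), "weekend", "weekday"),
  levels = c("weekday", "weekend")
)
```

Wrapping the result in `factor` gives the two-level factor the task asks for. `ifelse` alone returns a character vector.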

2- Make a panel plot containing a time series plot of the mean steps per 5-minute interval, with one panel for weekdays and one for weekends.

averagedActivityDataImputed <- aggregate(steps ~ interval + dateType, data = activityDataImputed, mean)
ggplot(averagedActivityDataImputed, aes(interval, steps)) + geom_line(aes(color = dateType)) + facet_grid(dateType ~ .)