STA 032 - R HW 1

Problem I

(a) The statistics for number of steps per day are…

The mean:

## [1] 9622.533

The median:

## [1] 9385.5

The standard deviation:

## [1] 3736.228

(b) The statistics for total hours of sleep per day are…

The mean:

## [1] 7.402049

The median:

## [1] 8.02

The standard deviation:

## [1] 2.097509

(c) The quartiles for the total miles traveled per day are:

Q1=

##    25% 
## 2.9925

Q2=

##   50% 
## 4.105

Q3=

##  75% 
## 5.27

(d) The quartiles for the total number of steps per day are:

Q1=

##     25% 
## 6809.25

Q2=

##    50% 
## 9385.5

Q3=

##      75% 
## 12065.25
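The three quartiles reported above can also be obtained from a single `quantile()` call. The following is a minimal sketch using made-up step counts (the `steps` vector is hypothetical and is not the Fitbit data):

```r
# Made-up daily step counts, for illustration only (not the Fitbit data).
steps <- c(4200, 6100, 7300, 8800, 9400, 10100, 11900, 12500, 15000)

# One call returns Q1, Q2 (the median), and Q3 together as a named vector.
quartiles <- quantile(steps, probs = c(0.25, 0.50, 0.75))
print(quartiles)

# Q2 is the median by definition, so the two must agree.
print(quartiles[["50%"]] == median(steps))
```

Passing all three probabilities at once keeps the quartiles in one named vector, which is convenient when they are reused later, for example to compute the interquartile range.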

Problem II

(a) The summary statistics for total hours of sleep per day are

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   7.222   8.020   7.402   8.528   9.780

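(b) The cutoffs for outliers are based on the quartiles

Q1=

##    25% 
## 7.2225

and Q3=

##    75% 
## 8.5275

Using the usual 1.5 × IQR rule, the IQR is 8.5275 - 7.2225 = 1.305, so the lower cutoff is 7.2225 - 1.5(1.305) = 5.265 hours and the upper cutoff is 8.5275 + 1.5(1.305) = 10.485 hours. Any day with less than 5.265 or more than 10.485 hours of sleep would count as an outlier.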
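Outlier cutoffs are commonly taken to be 1.5 interquartile ranges below Q1 and above Q3. The sketch below shows that rule on made-up sleep values; the `sleep` vector and the variable names are assumptions for illustration, not the Fitbit data, and the 1.5 multiplier is the conventional choice rather than something stated in the assignment.

```r
# Hypothetical hours of sleep, for illustration only (not the Fitbit data).
sleep <- c(0, 6.5, 7.1, 7.3, 7.9, 8.0, 8.2, 8.4, 8.6, 9.8)

# Quartiles and interquartile range.
q <- quantile(sleep, probs = c(0.25, 0.75))
iqr <- q[["75%"]] - q[["25%"]]   # same value as IQR(sleep)

# Conventional 1.5 * IQR fences.
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Values outside the fences are flagged as outliers.
outliers <- sleep[sleep < lower_fence | sleep > upper_fence]
print(c(lower = lower_fence, upper = upper_fence))
print(outliers)
```

With real data, the same steps would be applied to `fitbit$Asleep` in place of the toy vector.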

(c) The standard deviation for the total miles traveled per day is

## [1] 1.631983

which means that the distance traveled on a typical day differs from the mean daily distance by about 1.63 miles.
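To make the "typical deviation from the mean" reading concrete, the sample standard deviation can be rebuilt by hand and compared with `sd()`. This is a small sketch on made-up distances (the `miles` vector is hypothetical):

```r
# Made-up daily distances in miles, for illustration only.
miles <- c(2, 4, 4, 5, 5)

# Deviation of each day from the mean distance.
deviations <- miles - mean(miles)

# Sample standard deviation: square root of the sum of squared
# deviations divided by n - 1.
manual_sd <- sqrt(sum(deviations^2) / (length(miles) - 1))

print(manual_sd)
print(sd(miles))   # built-in result, should agree with manual_sd
```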

(d) The coefficient of variation for the total miles traveled per day is

## [1] 38.76822

which is the standard deviation expressed as a percentage of the mean. It is a unit-free measure of relative variability: the daily distance typically varies by about 39% of its mean.
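Because the coefficient of variation is a ratio of two quantities in the same units, it does not change when the units change. A minimal sketch, assuming a hypothetical helper `coef_var()` and made-up distances:

```r
# Made-up daily distances in miles, for illustration only.
distance_miles <- c(2.1, 3.0, 3.8, 4.1, 4.4, 5.3, 6.7)

# Hypothetical helper: standard deviation as a percentage of the mean.
coef_var <- function(x) sd(x) / mean(x) * 100

cv_miles <- coef_var(distance_miles)

# Converting to kilometers rescales both sd and mean by the same factor,
# so the coefficient of variation is unchanged.
cv_km <- coef_var(distance_miles * 1.609344)

print(cv_miles)
print(cv_km)
```

This is why the coefficient of variation is useful for comparing the spread of variables measured on different scales, such as steps and miles.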

Problem III

(a) Boxplot of Steps per Day

Tuesday appears to have a relatively low range of steps, although its range also extends higher than that of some other days. Wednesday appears to have the lowest median (the center line of its box), so it would typically be the day with the fewest steps. Even though Tuesday can have days with fewer steps than Wednesday, Wednesday is more likely to be the lower day based on the data.
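The center line of each box in an R boxplot marks the group median rather than the mean, so a visual comparison like the one above can be checked numerically with `tapply()`. The sketch below uses a made-up data frame (`toy` and its values are hypothetical; only the column names `Day` and `Steps` follow the dataset):

```r
# Hypothetical data: three observations for each of three days.
toy <- data.frame(
  Day   = rep(c("Mon", "Tue", "Wed"), each = 3),
  Steps = c(9000, 10000, 12000,   # Mon
            5000,  8000, 13000,   # Tue: widest range, includes the lowest day
            6000,  7000,  7500)   # Wed: lowest median
)

# Median and mean steps for each day.
medians <- tapply(toy$Steps, toy$Day, median)
means   <- tapply(toy$Steps, toy$Day, mean)

# Day with the lowest typical (median) step count.
lowest_median_day <- names(which.min(medians))

print(medians)
print(means)
print(lowest_median_day)
```

In this toy example Tuesday contains the single lowest day, yet Wednesday has the lowest median, which mirrors the situation described for the real data.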

(b) Boxplot of Sleep per Day

It looks as though Sunday typically allows for the most sleep, because it appears to have the highest median along with the highest range of hours of sleep.

(c) Histogram for Daily amount of miles traveled

The data seem fairly symmetric, forming an almost perfect bell curve. Very high and very low daily distances occur less often, while the middle distances occur most often, which gives the histogram its symmetric shape.

(d) Histogram for Daily amount of sleep

While the amount of sleep has a somewhat bell-shaped curve, it is nowhere near as symmetric as the miles traveled plot. Its range is much smaller, certain amounts have much greater frequencies than the rest, and there are a few outlying values, so the data are more skewed. The mean (7.40 hours) falling below the median (8.02 hours) suggests the skew is to the left.
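A quick numerical check for skew is to compare the mean with the median: a mean noticeably below the median suggests a longer left tail. This is only a rule of thumb, sketched here on made-up sleep values (the `sleep_hours` vector is hypothetical):

```r
# Made-up hours of sleep with one very low night, for illustration only.
sleep_hours <- c(0, 6.9, 7.4, 7.8, 8.0, 8.1, 8.3, 8.5, 8.7)

avg <- mean(sleep_hours)
mid <- median(sleep_hours)

# The single low value pulls the mean down but leaves the median alone.
print(c(mean = avg, median = mid))
print(avg < mid)   # TRUE here, hinting at left skew
```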

APPENDIX OF CODE

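# Load the Fitbit data (adjust the path to wherever the CSV is saved)
fitbit <- read.csv("~/Desktop/ErinsFitbit.csv")

# Problem I (a): mean, median, and standard deviation of steps per day
mean(fitbit$Steps)
median(fitbit$Steps)
sd(fitbit$Steps)
# Problem I (b): mean, median, and standard deviation of hours of sleep
mean(fitbit$Asleep)
median(fitbit$Asleep)
sd(fitbit$Asleep)
# Problem I (c): quartiles of miles traveled per day
quantile(fitbit$Distance, probs = 0.25)
quantile(fitbit$Distance, probs = 0.50)
quantile(fitbit$Distance, probs = 0.75)
# Problem I (d): quartiles of steps per day
quantile(fitbit$Steps, probs = 0.25)
quantile(fitbit$Steps, probs = 0.50)
quantile(fitbit$Steps, probs = 0.75)

# Problem II (a): five-number summary plus mean for hours of sleep
summary(fitbit$Asleep)
# Problem II (b): quartiles used for the outlier cutoffs
quantile(fitbit$Asleep, probs = 0.25)
quantile(fitbit$Asleep, probs = 0.75)
# Problem II (c): standard deviation of miles traveled per day
sd(fitbit$Distance)
# Problem II (d): coefficient of variation (sd as a percentage of the mean)
cv <- function(m, s) (s / m) * 100
cv(m = mean(fitbit$Distance), s = sd(fitbit$Distance))

# Problem III: boxplots by day of the week and histograms
boxplot(Steps ~ Day, data = fitbit, main = "Steps per Day", xlab = "Day of the Week", ylab = "Number of Steps")
boxplot(Asleep ~ Day, data = fitbit, main = "Sleep per Day", xlab = "Day of the Week", ylab = "Hours of Sleep")
hist(fitbit$Distance, main = "Miles Traveled Daily", xlab = "Distance in Miles")
hist(fitbit$Asleep, main = "Daily Amount of Sleep", xlab = "Amount of Sleep in Hours")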

Kaitlin Campbell
