date()
## [1] "Sat Dec 08 18:49:50 2012"
Due Date: December 7, 2012, 2pm
Total Points: 30; Points are given in parentheses.
(1) Use the FLprecip.txt file on my website http://myweb.fsu.edu/jelsner/ and perform a t-test to determine if Florida is significantly drier during May compared with August. (10)
# The two-sample t-test compares the means of two groups, assuming both
# samples are random, independent, and drawn from normal distributions.
# t.test() defaults to Welch's version, which does not assume equal variances.
PurpleRain <- read.table("http://myweb.fsu.edu/jelsner/FLprecip.txt", header = TRUE)
head(PurpleRain)
## Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct
## 1 1895 3.277 3.241 2.499 4.530 4.252 4.500 7.450 6.103 4.669 3.091
## 2 1896 3.928 3.020 2.570 0.498 2.700 11.228 8.217 5.892 4.352 2.959
## 3 1897 1.839 6.000 2.125 4.390 2.279 5.221 7.212 6.831 11.144 4.101
## 4 1898 0.704 2.009 1.259 1.320 1.509 3.292 8.947 13.090 5.231 5.877
## 5 1899 4.523 5.921 1.898 3.398 1.110 5.803 9.264 6.712 5.132 5.882
## 6 1900 3.207 4.369 6.800 4.317 3.891 9.993 7.501 4.492 4.930 5.230
## Nov Dec
## 1 2.649 1.586
## 2 3.516 2.071
## 3 1.749 2.680
## 4 2.190 3.891
## 5 0.751 1.939
## 6 1.221 4.290
t.test(PurpleRain$May, PurpleRain$Aug)
##
## Welch Two Sample t-test
##
## data: PurpleRain$May and PurpleRain$Aug
## t = -12.21, df = 219.2, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.882 -2.802
## sample estimates:
## mean of x mean of y
## 3.857 7.199
# p < 0.001: very strong evidence against the null hypothesis of equal means,
# in favor of the alternative. May (mean 3.857) is much drier than August
# (mean 7.199).
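The question asks whether May is drier, which is a directional hypothesis, so a one-sided test is a natural variant of the two-sided test above. The following is a minimal sketch, assuming only the six years printed by head(); the vectors `may` and `aug` are illustrative names, not part of the original analysis. With the full data set the same call would take PurpleRain$May and PurpleRain$Aug.

```r
# Illustrative subset: May and August values for 1895-1900, copied from head() above.
may <- c(4.252, 2.700, 2.279, 1.509, 1.110, 3.891)
aug <- c(6.103, 5.892, 6.831, 13.090, 6.712, 4.492)

# One-sided Welch test. Alternative: mean(May) - mean(Aug) < 0, i.e. May is drier.
one_sided <- t.test(may, aug, alternative = "less")
one_sided$p.value

# Two-sided Welch test on the same subset, for comparison.
two_sided <- t.test(may, aug)
two_sided$p.value
```

Because the observed difference lies in the hypothesized direction, the one-sided p-value is half the two-sided one, so the conclusion reached above does not change.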
(2) Suppose 15 randomly chosen people of varying ages are tested for maximum heart rate and the following data are found.
Age = c(18, 23, 25, 35, 65, 54, 34, 56, 72, 19, 23, 42, 18, 39, 37)
HR = c(202, 186, 187, 180, 156, 169, 174, 172, 153, 199, 193, 174, 198, 183,
178)
require(ggplot2)
## Loading required package: ggplot2
data <- data.frame(Age, HR)
ggplot(data, aes(x = Age, y = HR)) + geom_point(shape = 1) + geom_smooth(method = lm)
fit <- lm(HR ~ Age)
fit
##
## Call:
## lm(formula = HR ~ Age)
##
## Coefficients:
## (Intercept) Age
## 210.048 -0.798
summary(fit)
##
## Call:
## lm(formula = HR ~ Age)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.926 -2.538 0.388 3.187 6.624
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 210.048 2.867 73.3 < 2e-16 ***
## Age -0.798 0.070 -11.4 3.8e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.58 on 13 degrees of freedom
## Multiple R-squared: 0.909, Adjusted R-squared: 0.902
## F-statistic: 130 on 1 and 13 DF, p-value: 3.85e-08
predict(fit, data.frame(Age = 50))
## 1
## 170.2
The age is in years and the heart rate is in beats per minute.
(a) Determine the intercept and slope of a regression model of heart rate on age. (10)
The intercept and slope of the regression model are 210.048 beats per minute and -0.798 beats per minute per year of age, respectively.
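The point estimates can be pulled from the fitted model directly, and their uncertainty can be summarized with confidence intervals. This is a sketch that refits the same model so it runs on its own; the 95% level is an assumption, chosen only because it is the usual default.

```r
# Data as given in the problem statement.
Age <- c(18, 23, 25, 35, 65, 54, 34, 56, 72, 19, 23, 42, 18, 39, 37)
HR <- c(202, 186, 187, 180, 156, 169, 174, 172, 153, 199, 193, 174, 198, 183, 178)
fit <- lm(HR ~ Age)

# Point estimates: intercept (bpm at age 0) and slope (bpm per year of age).
coef(fit)

# 95% confidence intervals for both coefficients.
ci <- confint(fit, level = 0.95)
ci
```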
(b) Predict the average heart rate of a 50-year-old. (10)
Based on the regression model, the predicted average heart rate of a 50-year-old is 170.2 beats per minute.
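The prediction is just the fitted line evaluated at Age = 50. The sketch below reproduces it by hand from the unrounded coefficients and then asks predict() for a confidence interval on the mean response; the name `by_hand` is illustrative and the 95% level is an assumption.

```r
# Data as given in the problem statement.
Age <- c(18, 23, 25, 35, 65, 54, 34, 56, 72, 19, 23, 42, 18, 39, 37)
HR <- c(202, 186, 187, 180, 156, 169, 174, 172, 153, 199, 193, 174, 198, 183, 178)
fit <- lm(HR ~ Age)

# Intercept + slope * 50, using the full-precision coefficients.
b <- coef(fit)
by_hand <- b[["(Intercept)"]] + b[["Age"]] * 50
by_hand

# Same point prediction, plus a 95% confidence interval for the mean
# heart rate at age 50.
predict(fit, data.frame(Age = 50), interval = "confidence", level = 0.95)
```

Plugging in the rounded coefficients instead gives 210.048 - 0.798 * 50 = 170.148, which rounds to 170.1 rather than 170.2, so the unrounded values from coef() are the safer choice.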