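# Load the data: driving distance, accuracy, and a gender code (no header row)
golf <- read.table('http://www.stat.ufl.edu/~winner/data/pgalpga2008.dat')
colnames(golf) <- c('drive_distance', 'accuracy', 'gender')
# Split by gender code: 1 = female, 2 = male
golf_female <- subset(golf, gender == 1)
golf_male <- subset(golf, gender == 2)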
We fit a linear regression model to the female golfer data.
golf_female_lm <- lm(accuracy ~ drive_distance, data = golf_female)
summary(golf_female_lm)
##
## Call:
## lm(formula = accuracy ~ drive_distance, data = golf_female)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -23.6777  -2.6583   0.9829   3.6346  10.2339
##
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)    130.89331   10.92765  11.978  < 2e-16 ***
## drive_distance  -0.25649    0.04424  -5.797 3.66e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.246 on 155 degrees of freedom
## Multiple R-squared: 0.1782, Adjusted R-squared: 0.1729
## F-statistic: 33.61 on 1 and 155 DF, p-value: 3.662e-08
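Reading the Estimate column of the coefficient table, the fitted line can be written out explicitly (coefficients rounded here for display). Assuming accuracy is recorded in percent, which the output itself does not state, each additional yard of driving distance is associated with roughly a 0.26-point lower accuracy on average.

```latex
\[
\widehat{\text{accuracy}} = 130.893 - 0.2565 \times \text{drive\_distance}
\]
```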
Using the fitted model, predict the accuracy of a female golfer
whose drive_distance is 260 yards.
The predicted accuracy is the intercept plus the slope times 260:
coef(golf_female_lm)[1] + coef(golf_female_lm)[2]*260
## (Intercept)
##    64.20573
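As a sanity check, the same number can be recovered by hand from the rounded coefficients printed in the summary table. This is only a sketch: the variable names `b0`, `b1`, `x0`, and `acc_260` are illustrative and do not appear in the original analysis.

```r
# Coefficients copied (rounded) from the summary() table
b0 <- 130.89331   # intercept
b1 <- -0.25649    # slope for drive_distance
x0 <- 260         # driving distance in yards

# Point prediction: intercept + slope * x0
acc_260 <- b0 + b1 * x0
acc_260
```

The result agrees with the coef()-based value to about three decimal places; the small gap comes from using rounded coefficients.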
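Determine the 95% posterior predictive interval for the
accuracy of a new female golfer whose average driving
distance is 260 yards. Under the standard noninformative
(reference) prior, this interval coincides with the classical
prediction interval returned by predict().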
predict(golf_female_lm, data.frame(drive_distance = 260), interval = "prediction")
##        fit      lwr      upr
## 1 64.20573 53.74528 74.66619
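The interval can also be computed by hand from the usual simple-regression formula: fitted value plus or minus t * s * sqrt(1 + 1/n + (x0 - xbar)^2 / Sxx), where s is the residual standard error and t is the 97.5% quantile of a t distribution on the residual degrees of freedom. The following is a minimal sketch, assuming a one-predictor lm fit; `manual_pred_interval` is a hypothetical helper name, and it is checked on R's built-in `cars` data rather than the golf data so that it runs without a download.

```r
# Hypothetical helper: prediction interval for a one-predictor lm fit
manual_pred_interval <- function(fit, x0, level = 0.95) {
  x <- model.frame(fit)[[2]]            # the single predictor column
  n <- length(x)
  s <- sigma(fit)                       # residual standard error
  y_hat <- unname(coef(fit)[1] + coef(fit)[2] * x0)
  # Standard error for a NEW observation at x0 (note the leading 1)
  se_pred <- s * sqrt(1 + 1 / n + (x0 - mean(x))^2 / sum((x - mean(x))^2))
  t_crit <- qt(1 - (1 - level) / 2, df = df.residual(fit))
  c(fit = y_hat, lwr = y_hat - t_crit * se_pred, upr = y_hat + t_crit * se_pred)
}

# Compare with predict() on the built-in cars data
cars_lm <- lm(dist ~ speed, data = cars)
manual_pred_interval(cars_lm, 15)
predict(cars_lm, data.frame(speed = 15), interval = "prediction")
```

Calling `manual_pred_interval(golf_female_lm, 260)` should reproduce the fit, lwr, and upr values reported by predict() for the golf model.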