Note on Assignments
Assignments are not stand-alone; they are designed to be answered in conjunction with the lecture notes and case studies. You need to follow the R code taught in the course when completing the assignment. Alternative R code (and interpretation) not taught in the course, and extraneous R output (and interpretation) included in your answers, can lead to deductions in marks. Note: AI tools frequently generate alternative code and interpretation that do not follow the course material.
A researcher was interested in the relationship between the age of drivers and the maximum distance at which a newly designed sign is legible. Data was collected from a random sample of 30 drivers. The age of the driver and the maximum distance at which they could read the newly designed road sign were recorded.
The data is stored in the file road.csv and contains the following variables for each driver:
| Variable | Description |
|---|---|
| Distance | the maximum distance at which the driver could read the sign (in metres). |
| Age | the age of the driver (in years). |
We are interested in whether the age of the driver affects the maximum distance at which a road sign is legible.
Road.df=read.csv("road.csv",header=T)
plot(Distance~Age, main="Maximum legible distance versus Age",data=Road.df)
road.fit = lm(Distance ~ Age, data=Road.df)
summary(road.fit)
##
## Call:
## lm(formula = Distance ~ Age, data = Road.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.876 -12.726 2.327 10.253 33.173
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 175.7648 7.1561 24.562 < 2e-16 ***
## Age -0.9165 0.1294 -7.084 1.05e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.17 on 28 degrees of freedom
## Multiple R-squared: 0.6419, Adjusted R-squared: 0.6291
## F-statistic: 50.18 on 1 and 28 DF, p-value: 1.046e-07
Road.lm=lm(Distance~Age,data=Road.df)
modelcheck(Road.lm)
summary(Road.lm)
##
## Call:
## lm(formula = Distance ~ Age, data = Road.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.876 -12.726 2.327 10.253 33.173
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 175.7648 7.1561 24.562 < 2e-16 ***
## Age -0.9165 0.1294 -7.084 1.05e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.17 on 28 degrees of freedom
## Multiple R-squared: 0.6419, Adjusted R-squared: 0.6291
## F-statistic: 50.18 on 1 and 28 DF, p-value: 1.046e-07
confint(Road.lm)
## 2.5 % 97.5 %
## (Intercept) 161.106186 190.4234214
## Age -1.181517 -0.6514817
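The interval for Age reported by `confint` can be cross-checked by hand as estimate ± t-multiplier × standard error. The following is only a sketch: the estimate, standard error and degrees of freedom are copied from the `summary` output, and the variable names (`est`, `se`, `df.resid`, `t.mult`, `ci`) are illustrative, not course code.

```r
# Values copied from the summary(Road.lm) output (rounded as printed there)
est      <- -0.9165   # estimated slope for Age
se       <- 0.1294    # standard error of the slope
df.resid <- 28        # residual degrees of freedom

# 97.5% quantile of the t-distribution gives the multiplier for a 95% interval
t.mult <- qt(0.975, df.resid)

# 95% confidence interval: estimate -/+ multiplier * standard error
ci <- est + c(-1, 1) * t.mult * se
round(ci, 2)
```

Because the inputs are rounded, the result agrees with the `confint` output only to about three decimal places.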
plot(Distance~Age, main="Maximum legible distance versus Age",data=Road.df)
plot(Distance~Age,data=Road.df, xlim=c(0,80), ylim=c(0,180))
abline(road.fit, lty=2, col = "green")
prediction=predict(road.fit, newdata=data.frame(Age=c(20,40,60,80)))
points(c(20,40,60,80), prediction, col = "green", pch=19)
As the initial plot of the data shows a linear relationship, we have fitted a linear regression model to our data. The data are from a random sample, so we can assume independence. The residual plot shows random scatter around zero with roughly constant variance. The residuals look like they come from a normal distribution. There are no unduly influential points. All assumptions appear to be satisfied. The effect of the driver’s age is statistically significant.
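For readers who want to see what sits behind these checks, here is a minimal sketch using base R functions only. It is not the course's `modelcheck` workflow and should not replace it in answers. So that it runs without `road.csv`, it uses simulated stand-in data; the data frame `sim.df`, the model `sim.lm` and the parameter values are assumptions made purely for illustration.

```r
# Simulated stand-in data (NOT the road.csv data): 30 drivers, linear trend plus noise
set.seed(1)
sim.df <- data.frame(Age = round(runif(30, 18, 82)))
sim.df$Distance <- 175 - 0.9 * sim.df$Age + rnorm(30, sd = 15)

sim.lm <- lm(Distance ~ Age, data = sim.df)

# Residuals versus fitted values: look for random scatter around zero
# with roughly constant spread
plot(sim.lm, which = 1)

# Normal Q-Q plot of the residuals: points close to the line suggest normality
plot(sim.lm, which = 2)

# Cook's distance for each observation: unusually large values flag
# potentially influential points
cd <- cooks.distance(sim.lm)
max(cd)
```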
Our model is:
\(Distance_i = 175.7648 - 0.9165 \times Age_i + \epsilon_i\), where \(\epsilon_i \sim iid ~ N(0,\sigma^2)\)
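To see what the fitted equation implies, the estimated mean distance can be computed by hand for the four ages used in the `predict` call earlier. This is a sketch only: the coefficients are the rounded values from the summary output, and the names `b0`, `b1`, `ages` and `fitted.by.hand` are illustrative.

```r
# Coefficient estimates copied from the summary output (rounded as printed)
b0 <- 175.7648   # intercept
b1 <- -0.9165    # slope for Age

# Ages used in the predict() call earlier
ages <- c(20, 40, 60, 80)

# Estimated mean maximum legible distance (metres) at each age
fitted.by.hand <- b0 + b1 * ages
round(fitted.by.hand, 1)
# 157.4 139.1 120.8 102.4
```

Each 20-year step in age lowers the estimated mean distance by 20 × 0.9165 = 18.33 m, which is the constant slope of the fitted line.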
Our model explains 64.19% of the variation in the response variable (Multiple R-squared = 0.6419).
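As a consistency check, the Multiple R-squared in the summary output can be recovered from the F-statistic and its degrees of freedom. For a model with a single explanatory variable, the F-statistic is also the square of the slope's t value. Both identities are standard regression results rather than course-specific material:

\[
R^2 = \frac{F}{F + df_{\text{residual}}} = \frac{50.18}{50.18 + 28} \approx 0.6419,
\qquad
t^2 = (-7.084)^2 \approx 50.18 = F .
\]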
We are interested in whether the age of the driver affects the maximum distance at which a road sign is legible.
Based on our findings, we have very strong evidence of a negative relationship (p-value = 1.05e-07) between the age of a driver and the maximum distance at which a road sign is legible. We estimate that, for each additional year of age, the mean maximum legible distance decreases by between 0.65 and 1.18 metres (point estimate 0.9165 m).
1.3 Comment on the plots
There looks to be a negative relationship between age and maximum legible distance; as age increases, the maximum legible distance decreases.