Hello, and welcome to the 6th edition of Learning Logs. I’m glad you’ve made it this far, and go you for keeping the motivation up.

In this log, I will test a null hypothesis for a multiple linear regression model that predicts a swimmer’s time from year and sex, including their interaction. The null hypothesis is that the coefficient for year is equal to 0; the alternative is that it is not.

We begin by importing our data and creating a linear model:

Swim<-read.csv("file:///C:/Users/Peter/Downloads/Swim100M.csv")
attach(Swim)
swimmod<- lm(time~year*sex)
summary(swimmod)
## 
## Call:
## lm(formula = time ~ year * sex)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.3484 -1.4409 -0.2894  0.5404 15.9783 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  697.30122   39.22143  17.779  < 2e-16 ***
## year          -0.32405    0.02010 -16.118  < 2e-16 ***
## sexM        -302.46384   56.41163  -5.362 1.49e-06 ***
## year:sexM      0.14992    0.02889   5.189 2.83e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.32 on 58 degrees of freedom
## Multiple R-squared:  0.8935, Adjusted R-squared:  0.8879 
## F-statistic: 162.1 on 3 and 58 DF,  p-value: < 2.2e-16
View(Swim)

We can see that the t-value for year is -16.118. Its p-value (< 2e-16) is far below our alpha level of .05, so we reject the null hypothesis and conclude that the coefficient for year is not 0. Because the model includes a year:sex interaction, this coefficient is the slope for the reference level of sex, which here is the female swimmers.
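As a rough check on where those numbers come from, the t-value is just the estimate divided by its standard error, and the p-value is the two-sided tail area of a t distribution with the residual degrees of freedom. The sketch below retypes the rounded values from the year row of the summary output, so the result only matches to a couple of decimals; the variable names are my own.

```r
# Values copied (rounded) from the year row of the summary output
estimate  <- -0.32405
std_error <- 0.02010
df_resid  <- 58          # residual degrees of freedom from the summary

# t statistic for H0: coefficient = 0
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

t_value          # close to the -16.118 reported by summary()
p_value < 0.05   # TRUE means we reject H0 at alpha = .05
```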

Confidence Intervals

confint(swimmod)
##                     2.5 %       97.5 %
## (Intercept)  618.79099264  775.8114386
## year          -0.36428889   -0.2838028
## sexM        -415.38399282 -189.5436848
## year:sexM      0.09208029    0.2077528

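The year row is the slope for the reference level of sex (females), so for females we are 95% confident that each additional year decreases time by between about .28 and .36 seconds. For males, the slope is the year coefficient plus the year:sexM coefficient, roughly -0.324 + 0.150 = -0.174, so male times decrease by about .17 seconds per year. The year:sexM row tells us we are 95% confident that the male slope is between about .09 and .21 seconds per year less steep than the female slope.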
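To see how the interval for year is built, and how the interaction term shifts the slope for males, here is a small sketch using the rounded estimates from the summary output (variable names are my own). The interval is the estimate plus or minus a t critical value times the standard error.

```r
# Rounded values copied from the summary output
est_year  <- -0.32405   # slope for the reference level of sex
se_year   <- 0.02010
est_inter <- 0.14992    # year:sexM, the extra slope for males
df_resid  <- 58

# 95% interval for the year coefficient: estimate +/- t* x SE
t_crit  <- qt(0.975, df = df_resid)
ci_year <- est_year + c(-1, 1) * t_crit * se_year

# Slope for males = reference slope + interaction
slope_male <- est_year + est_inter

ci_year      # should be close to the year row of confint(swimmod)
slope_male
```

An interval for the male slope itself cannot be rebuilt from the printed table alone, because it also needs the covariance between the two estimates, which would come from `vcov(swimmod)`.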

Let’s now predict what times we may see, based on our coefficients.

data1<-data.frame(year=2018, sex="M")
conf1 <- predict(swimmod, data1, interval = "confidence")
pred1 <- predict(swimmod, data1, interval = "predict")
conf1
##        fit      lwr      upr
## 1 43.44445 40.51564 46.37326
pred1
##        fit      lwr      upr
## 1 43.44445 36.18268 50.70621

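This gives us two 95% intervals for a male swimmer in the year 2018, both centered on a predicted time of 43.44 seconds. The confidence interval (40.52 to 46.37) is for the mean time, while the prediction interval (36.18 to 50.71) is for a single new time, which is why it is wider.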
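As a sanity check, the fitted value can be rebuilt by hand from the coefficients, and the width of the prediction interval can be recovered from the confidence interval plus the residual standard error. This is only a sketch using the rounded numbers printed above (variable names are my own), so expect small rounding differences.

```r
# Rounded coefficients copied from the summary output
b0      <- 697.30122
b_year  <- -0.32405
b_sexM  <- -302.46384
b_inter <- 0.14992
yr      <- 2018

# Fitted time for a male swimmer: sexM = 1, so both sex terms apply
fit_male <- b0 + b_year * yr + b_sexM + b_inter * yr

# Back out the standard error of the fit from the confidence interval
t_crit <- qt(0.975, df = 58)
se_fit <- (46.37326 - 43.44445) / t_crit

# The prediction interval adds the residual variance to the fit variance
sigma   <- 3.32   # residual standard error from the summary
pi_half <- t_crit * sqrt(se_fit^2 + sigma^2)

fit_male                       # close to the fit of 43.44445
fit_male + c(-1, 1) * pi_half  # close to the lwr/upr values of pred1
```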