Trainset13

MOOC Econometrics Training Exercise 1.3

Notes:
• This exercise uses the datafile TrainExer13 and requires a computer.
• The dataset TrainExer13 is available on the website.

Questions

Dataset TrainExer13 contains the winning times (W) of the Olympic 100-meter finals (for men) from 1948 to 2004. The calendar years 1948-2004 are transformed to games (G) 1-15 to simplify computations. A simple regression model for the trend in winning times is $W_i = \alpha + \beta G_i + \varepsilon_i$.

(a) Compute a and b, and determine the values of R^2 and s.
(b) Are you confident in the predictive ability of this model? Motivate your answer.
(c) What prediction do you get for 2008, 2012, and 2016? Compare your predictions with the actual winning times.

trainset13 <- read.csv("~/Google Drive/MOOC - courses/ecnometrics-Erasmus/TrainExer13.csv")

Compute a and b, and determine the values of R^2 and s.

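# Fit the simple regression of winning time (W) on game number (G)
test <- lm(trainset13$Winning.time.men ~ trainset13$Game)
# In a simple regression, R^2 equals the squared correlation between W and G
cor(trainset13$Winning.time.men, trainset13$Game)^2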

## [1] 0.6733729

summary(test)

## 
## Call:
## lm(formula = trainset13$Winning.time.men ~ trainset13$Game)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.208 -0.048 -0.016  0.032  0.228 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     10.38600    0.06674 155.623  < 2e-16 ***
## trainset13$Game -0.03800    0.00734  -5.177 0.000178 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1228 on 13 degrees of freedom
## Multiple R-squared:  0.6734, Adjusted R-squared:  0.6482 
## F-statistic:  26.8 on 1 and 13 DF,  p-value: 0.0001781

Ans: a = 10.386, b = -0.038, R^2 = 0.6734, and s = 0.1228 (the residual standard error, on 13 degrees of freedom).
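The same four numbers can be checked against the least-squares formulas directly. This is a minimal sketch, not the course's own solution: it assumes the 15 winning times below, which were typed in by hand from public Olympic records rather than read from TrainExer13.csv, and the vector names `W` and `G` are illustrative.

```r
# Assumption: winning times in seconds for 1948-2004, entered by hand;
# the authoritative source remains TrainExer13.csv.
W <- c(10.30, 10.40, 10.50, 10.20, 10.00, 9.95, 10.14, 10.06,
       10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85)
G <- seq_along(W)  # games 1..15
n <- length(W)

# Slope and intercept from the least-squares formulas
b <- sum((G - mean(G)) * (W - mean(W))) / sum((G - mean(G))^2)
a <- mean(W) - b * mean(G)

# Residuals, then R^2 = 1 - SSE/SST and s = sqrt(SSE / (n - 2))
e  <- W - (a + b * G)
R2 <- 1 - sum(e^2) / sum((W - mean(W))^2)
s  <- sqrt(sum(e^2) / (n - 2))

round(c(a = a, b = b, R2 = R2, s = s), 4)
```

With these inputs the rounded values agree with the `summary(test)` output: 10.386, -0.038, 0.6734, and 0.1228.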

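Are you confident in the predictive ability of this model? Motivate your answer.

Not fully. R^2 = 0.6734, so the linear trend explains about 67% of the variation in winning times and leaves about 33% unexplained. The residual standard error of 0.1228 seconds is also large relative to the estimated improvement of 0.038 seconds per game, and any prediction for later games extrapolates beyond the sample.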

What prediction do you get for 2008, 2012, and 2016? Compare your predictions with the actual winning times.

# Games 16, 17, and 18 correspond to 2008, 2012, and 2016
predict2008 <- 10.386 - 0.038 * 16
predict2012 <- 10.386 - 0.038 * 17
predict2016 <- 10.386 - 0.038 * 18
predict2008

## [1] 9.778

predict2012

## [1] 9.74

predict2016

## [1] 9.702

I don't understand how we are supposed to obtain the actual values for 2008, 2012, and 2016. The dataset ends in 2004 and the lectures do not explain this clearly. Are they meant to be looked up online?
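One way to make the comparison is sketched below. The "actual" times are not part of TrainExer13; they are assumed here from public Olympic results (9.69, 9.63, and 9.81 seconds) and should be checked against an official source. The data frame and column names are illustrative.

```r
# Coefficients and residual standard error taken from the fitted model
a <- 10.386
b <- -0.038
s <- 0.1228

# Assumption: actual winning times from public Olympic results,
# not from TrainExer13.csv
comparison <- data.frame(
  year   = c(2008, 2012, 2016),
  game   = 16:18,
  actual = c(9.69, 9.63, 9.81)
)

# Predicted time, prediction error, and whether the error is within one s
comparison$predicted <- a + b * comparison$game
comparison$error     <- comparison$actual - comparison$predicted
comparison$within_s  <- abs(comparison$error) < s
comparison
```

Under these assumed times the model predicts too slow a time for 2008 and 2012 (by about 0.09 and 0.11 seconds) and too fast a time for 2016 (by about 0.11 seconds); all three errors are smaller than one residual standard error.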