Ed wants us to generate a 50-period random walk. Here is the outline:
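random_walk_1 = function(x){
  set.seed(1234)                        # fix the seed so the series is reproducible
  vt = c()
  eps = rnorm(x, mean = 0, sd = 1)      # x standard normal shocks
  vt[1] = rnorm(1, mean = 0, sd = 1)    # random starting value
  for (i in 2:x){                       # loop to x rather than a hard-coded 50
    vt[i] = vt[i-1] + eps[i]            # each period adds a new shock
  }
  return(vt)
}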
vec <- random_walk_1(x = 50)
head(vec)
## [1] -1.8060313 -1.5286020 -0.4441608 -2.7898585 -2.3607339 -1.8546780
Now we need to generate a separate 50-period random walk. Changing the seed generates a different series.
random_walk_2 = function(x){
set.seed(456)
w=c()
eps = rnorm(x, mean=0, sd= 1)
w[1] = rnorm(1,mean = 0, sd = 1)
for (t in 2:x){
w[t] = w[t-1]+eps[t]
}
return(w)
}
df1 <- random_walk_2(50)
head(df1)
## [1] -0.66495083 -0.04317528 0.75769939 -0.63119303 -1.34554989 -1.66961094
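The two functions above differ only in their seed, and a random walk is just a starting value plus a running sum of shocks. Here is a minimal vectorized sketch of the same idea using cumsum(). The name random_walk and its seed argument are assumptions made for this illustration and are not part of the lab.

```r
# Hypothetical helper: one function for any seed, no explicit loop
random_walk <- function(n, seed) {
  set.seed(seed)
  eps   <- rnorm(n, mean = 0, sd = 1)   # shocks, drawn first as in the functions above
  start <- rnorm(1, mean = 0, sd = 1)   # random starting value
  # period 1 is the starting value; later periods add shocks 2, 3, ..., n
  start + c(0, cumsum(eps[-1]))
}

head(random_walk(50, seed = 1234))
```

Because the draws happen in the same order as in random_walk_1, this should match vec up to floating-point rounding.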
c: No, we shouldn’t! Why?
regression_data = data.frame(
wt = random_walk_2(50),
vt = random_walk_1(50)
)
regression_data
## wt vt
## 1 -0.66495083 -1.8060313
## 2 -0.04317528 -1.5286020
## 3 0.75769939 -0.4441608
## 4 -0.63119303 -2.7898585
## 5 -1.34554989 -2.3607339
## 6 -1.66961094 -1.8546780
## 7 -0.97896794 -2.4294179
## 8 -0.72842004 -2.9760498
## 9 0.27893223 -3.5405018
## 10 0.85216691 -4.4305396
## 11 -0.06364361 -4.9077323
## 12 1.24745384 -5.9061187
## 13 2.23618016 -6.6823726
## 14 3.89010884 -6.6179138
## 15 2.44930361 -5.6584198
## 16 4.39666004 -5.7687053
## 17 6.13359622 -6.2797148
## 18 6.52107955 -7.1909102
## 19 8.80111352 -8.0280819
## 20 10.33899683 -5.6122467
## 21 9.86439302 -5.4781585
## 22 8.14708422 -5.9688444
## 23 6.72025391 -6.4093922
## 24 6.92848983 -5.9498028
## 25 6.89265365 -6.6435230
## 26 8.02693822 -8.0917280
## 27 7.56408325 -7.5169722
## 28 7.23569922 -8.5406280
## 29 8.72023869 -8.5557663
## 30 7.63086079 -9.4917149
## 31 7.10206658 -8.3894173
## 32 6.50827372 -8.8650104
## 33 4.50935807 -9.5744504
## 34 4.80551120 -10.0757085
## 35 4.97613645 -11.7048020
## 36 6.79178877 -12.8724212
## 37 6.13118567 -15.0524609
## 38 5.99093372 -16.3934541
## 39 5.56695462 -16.6877479
## 40 5.52821884 -17.1536455
## 41 5.49927692 -15.7041492
## 42 5.89231430 -16.7727919
## 43 5.64270038 -17.6281565
## 44 5.72615059 -17.9087796
## 45 7.80502521 -18.9031196
## 46 7.92587701 -19.8716339
## 47 8.04402645 -20.9789521
## 48 8.81408067 -22.2309380
## 49 7.63867826 -22.7547661
## 50 8.04771682 -23.2516161
silly_model = lm(regression_data$wt~regression_data$vt, data=regression_data)
summary(silly_model)
##
## Call:
## lm(formula = regression_data$wt ~ regression_data$vt, data = regression_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.2385 -1.7885 -0.4711 2.4668 6.6124
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.99739 0.71937 2.777 0.00781 **
## regression_data$vt -0.30812 0.06247 -4.933 1.01e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.78 on 48 degrees of freedom
## Multiple R-squared: 0.3364, Adjusted R-squared: 0.3226
## F-statistic: 24.33 on 1 and 48 DF, p-value: 1.014e-05
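That’s no good! The two walks were generated independently, yet the regression reports a highly significant slope (p-value of about 1e-05) and an R-squared of about 0.34. This is a spurious regression.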
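To see that this is not a fluke of these two seeds, here is a rough simulation sketch. It draws many pairs of independent random walks, regresses one on the other, and records how often the slope looks significant at the 5% level. The seed, the number of simulations, and all variable names are arbitrary choices for illustration.

```r
set.seed(2024)        # arbitrary seed, chosen only for reproducibility
n_sims    <- 500      # number of simulated pairs (assumption)
n_periods <- 50       # same length as the walks above

p_values <- replicate(n_sims, {
  w <- cumsum(rnorm(n_periods))   # one random walk
  v <- cumsum(rnorm(n_periods))   # a second, independent random walk
  summary(lm(w ~ v))$coefficients["v", "Pr(>|t|)"]
})

# Share of regressions with a "significant" slope at the 5% level
reject_rate <- mean(p_values < 0.05)
reject_rate
```

If the usual t-test were reliable here, this share would be close to 0.05. With random walks in levels it should come out far higher.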
Now we need to see what happens with differences…
regression_data = data.frame(
wt = random_walk_2(50),
vt= random_walk_1(50)
)
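# Note: base R's lag() does not shift a plain vector; dplyr::lag() does (the first difference is NA)
regression_data_diff = data.frame(
w_diff = regression_data$wt - dplyr::lag(regression_data$wt),
v_diff = regression_data$vt - dplyr::lag(regression_data$vt)
)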
# Alternatively, use diff(), which returns the 49 first differences directly:
regression_data_diff = data.frame(
w_diff = diff(regression_data$wt),
v_diff= diff(regression_data$vt)
)
silly_model_2 = lm(w_diff ~ v_diff, data=regression_data_diff)
summary(silly_model_2)
##
## Call:
## lm(formula = w_diff ~ v_diff, data = regression_data_diff)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.16507 -0.68408 -0.03094 0.62717 2.11936
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.19658 0.16846 1.167 0.249
## v_diff 0.04289 0.17166 0.250 0.804
##
## Residual standard error: 1.055 on 47 degrees of freedom
## Multiple R-squared: 0.001326, Adjusted R-squared: -0.01992
## F-statistic: 0.06242 on 1 and 47 DF, p-value: 0.8038
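Look at that! After differencing, the slope on v_diff is statistically insignificant (p = 0.804) and the R-squared is essentially zero, so the apparent relationship disappears.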
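Differencing works because the first difference of a random walk is just the shock that was added in that period, and the shocks are independent noise. A small sketch of that fact, with an arbitrary seed and illustrative variable names:

```r
set.seed(1)                # arbitrary seed for illustration
eps  <- rnorm(50)          # independent shocks
walk <- cumsum(eps)        # random walk built from those shocks

# The first differences recover shocks 2 through 50
all.equal(diff(walk), eps[-1])
```

Since the differenced series are independent noise, there is nothing left for the regression to pick up.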
y_1 = c(25,15,11,13)
y_0 = c(17,11,3,9)
# Build a vector of all individual treatment effects
teffect = y_1-y_0
teffect
## [1] 8 4 8 4
mean(teffect)
## [1] 6
\[ Avg(y_i \mid D_i = 1) - Avg(y_i \mid D_i = 0) \] This difference in group averages is something you can compute.
I’ll give you the opportunity to figure out how to solve it…
It would require you to see the same person do different things: you would need to visit parallel universes, where everyone does everything.
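To make the parallel-universes point concrete, here is a small sketch that reuses y_1 and y_0 from above. The assignment vector D is hypothetical and made up for this illustration. In real data we see only one outcome per person, so all we can compute is the difference in group averages.

```r
y_1 <- c(25, 15, 11, 13)   # outcomes with treatment (from above)
y_0 <- c(17, 11, 3, 9)     # outcomes without treatment (from above)

D <- c(1, 0, 1, 0)         # hypothetical assignment: persons 1 and 3 are treated

# What we actually get to see: one outcome per person
y_observed <- ifelse(D == 1, y_1, y_0)

# Difference in group averages, using only observed outcomes
naive_diff <- mean(y_observed[D == 1]) - mean(y_observed[D == 0])
naive_diff          # 8

# True average treatment effect, which needs both outcomes for everyone
mean(y_1 - y_0)     # 6
```

Under this particular assignment the observed difference (8) does not equal the true average effect (6), because the treated and untreated groups differ in their untreated outcomes.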
I’ll let you guys do this one on your own.
I think you guys can solve this one. If you need help: look at the last lab, or Ed’s last lecture.