Generate dataset with the following equation:
y = 2 + 0.5x + eps where eps is normally distributed.
obs=100
df=data.frame(eps=rnorm(obs))
df$x=runif(obs)
df$y=2+(0.5*df$x)+df$eps
## eps x y
## 1 0.02191034 0.2643559 2.154088
## 2 -0.26062792 0.4444910 1.961618
## 3 -0.40189602 0.6104678 1.903338
## 4 0.98685095 0.3033426 3.138522
## 5 0.91886667 0.6279495 3.232841
Perform some regression analysis and summarise the data.
reg=lm(y~x,df)
print(summary(reg))
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3501 -0.6044 0.0167 0.5828 2.4912
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.05652 0.19197 10.71 <2e-16 ***
## x 0.08171 0.32745 0.25 0.803
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9955 on 98 degrees of freedom
## Multiple R-squared: 0.000635, Adjusted R-squared: -0.009563
## F-statistic: 0.06226 on 1 and 98 DF, p-value: 0.8035
Run the model 1000 times so the distribution can be evaluated.
for (i in 1:1000) {
df=data.frame(eps=rnorm(obs))
df$x=runif(obs)
df$y=2+0.5*df$x+df$eps
beta[i]<-lm(y~x,df)$coefficients[[2]]
}
Plot the distribution in a histogram to visualise the distribution.