Simple demo of errors on slope parameter

Generate a synthetic data set from a simple model:

genDataSet <- function(n) {
 x <- rnorm(n)
 y <- 2 * x + rnorm(n)
 data.frame(x,y)
}

genDataSet(n) creates n fake data points with a true slope of 2, a true intercept of 0, and unit-variance Gaussian noise. As an example:

set.seed(1)
test = genDataSet(100)
plot(test)

This data set can be fit easily with a linear model:

summary(lm(y~x, test))

## 
## Call:
## lm(formula = y ~ x, data = test)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8768 -0.6138 -0.1395  0.5394  2.3462 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.03769    0.09699  -0.389    0.698    
## x            1.99894    0.10773  18.556   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9628 on 98 degrees of freedom
## Multiple R-squared:  0.7784, Adjusted R-squared:  0.7762 
## F-statistic: 344.3 on 1 and 98 DF,  p-value: < 2.2e-16

The fitted slope is 2.0 ± 0.1 (estimate 1.99894, standard error 0.10773).
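The standard error quoted by summary(lm) follows the usual least-squares formula: the residual standard error divided by the square root of the summed squared deviations of x. As a minimal sketch, it can be reproduced by hand. The names fit, sigmaHat, seSlope and seSummary are introduced here for illustration and are not part of the original code.

```r
genDataSet <- function(n) {
  x <- rnorm(n)
  y <- 2 * x + rnorm(n)
  data.frame(x, y)
}

set.seed(1)
test <- genDataSet(100)
fit <- lm(y ~ x, test)

# Residual standard error: sqrt(RSS / (n - 2)) for a two-parameter fit
sigmaHat <- sqrt(sum(resid(fit)^2) / df.residual(fit))

# Standard error of the slope: sigmaHat / sqrt(sum((x - mean(x))^2))
seSlope <- sigmaHat / sqrt(sum((test$x - mean(test$x))^2))

# Value reported by summary(lm), for comparison
seSummary <- coef(summary(fit))["x", "Std. Error"]

c(byHand = seSlope, fromSummary = seSummary)
```

The two numbers should agree, which shows that the standard error is computed from a single data set alone.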

Now let's generate multiple synthetic data sets and fit the slope over and over again to get a feel for what the 0.1 standard error means. Each data set is one possible realization of the model.

genSlope <- function(n){
  # Fit a fresh synthetic data set and return only the slope estimate
  fit <- lm(y~x, genDataSet(n))
  fit$coefficients[2]
}

slopeData = sapply(rep(100, 100), genSlope)
hist(slopeData)

mean(slopeData)

## [1] 2.001821

sd(slopeData)

## [1] 0.1028822

So the standard deviation of the fitted slopes across data sets is about 0.1, in agreement with the standard error reported by summary(lm) for a single fit.
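The spread of about 0.1 can also be cross-checked against theory. Because x is drawn from a standard normal and the noise has unit variance, the variance of the slope estimate averaged over data sets works out to 1/(n - 3), so the expected standard deviation for n = 100 is about 0.1015. The sketch below repeats the simulation with more replicates to reduce the scatter. The names n, nRep and slopes, and the seed, are choices made here for illustration and do not reproduce the random state of the run above.

```r
genDataSet <- function(n) {
  x <- rnorm(n)
  y <- 2 * x + rnorm(n)
  data.frame(x, y)
}

genSlope <- function(n) {
  fit <- lm(y ~ x, genDataSet(n))
  fit$coefficients[2]
}

set.seed(1)   # assumption: any seed works, this one is only for reproducibility
n <- 100      # points per data set, as in the text
nRep <- 2000  # more replicates than the 100 used above

# One fitted slope per synthetic data set
slopes <- replicate(nRep, genSlope(n))

# Simulated spread next to the theoretical value sqrt(1 / (n - 3))
c(simulated = sd(slopes), theory = sqrt(1 / (n - 3)))
```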