What are the properties of the distributions of \(\hat{\beta}_0\) and \(\hat{\beta}_1\) over different random samples from the population?
What are the expected values and variances of OLS estimators?
We will first examine finite sample properties: unbiasedness and efficiency. These are valid for any sample size n.
Recall that unbiasedness means that the mean of the sampling distribution of an estimator is equal to the unknown parameter value.
Efficiency is related to the variance of the estimators.
An estimator is said to be efficient if its variance is the smallest among a set of unbiased estimators.
We need the following assumptions for unbiasedness:
(SLR.1) Model is linear in parameters: \(y = \beta_0 + \beta_1x + u\)
(SLR.2) Random sampling: we have a random sample from the target population.
(SLR.3) Sample variation in the explanatory variable: the variance of \(x\) must not be zero, i.e., \(\sum_{i=1}^n (x_i - \overline{x})^2 > 0\)
(SLR.4) Zero conditional mean: \(E(u|x) = 0\). Since we have a random sample we can write:
\[ E(u_i|x_i) = 0, \; \forall i = 1,2,\cdots,n \]
THEOREM:
If all SLR.1-SLR.4 assumptions hold then OLS estimators are unbiased:
\[\begin{align} E(\hat{\beta}_0) &= \beta_0 \notag \\ E(\hat{\beta}_1) &= \beta_1 \notag \end{align}\]
Proof: (see Wooldridge, pp 43-44)
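The key step in the proof is to rewrite \(\hat{\beta}_1\) as the true parameter plus a weighted sum of the errors (a sketch; see Wooldridge for the full argument):
\[\begin{align} \hat{\beta}_1 &= \frac{\sum_{i=1}^n (x_i - \overline{x})y_i}{\sum_{i=1}^n (x_i - \overline{x})^2} = \beta_1 + \frac{\sum_{i=1}^n (x_i - \overline{x})u_i}{\sum_{i=1}^n (x_i - \overline{x})^2} \notag \end{align}\]
Taking expectations conditional on the sample values of \(x\), SLR.4 implies \(E(u_i|x) = 0\) for every \(i\), so the second term has expectation zero and \(E(\hat{\beta}_1) = \beta_1\).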
Unbiasedness is a feature of the sampling distributions of \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that are obtained via repeated random sampling.
As such, it does not say anything about the estimate that we obtain for a given sample: it is always possible to obtain an estimate that is far from the true value.
Unbiasedness generally fails if any of the assumptions SLR.1-SLR.4 fails.
SLR.2 needs to be relaxed for time series data, but there are also ways it can fail to hold in cross-sectional data.
If SLR.4 fails then the OLS estimators will generally be biased. This is the most important issue in nonexperimental data: whenever \(x\) and \(u\) are correlated, the estimators are biased.
Spurious correlation: we find a relationship between \(y\) and \(x\) that is really due to other unobserved factors that affect \(y\).
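A quick way to see this bias is to simulate data in which \(u\) is correlated with \(x\). In the illustrative sketch below (names and numbers are hypothetical), the error contains a component of \(x\), so \(E(u|x) \neq 0\) and the slope estimate is pushed away from its true value of 0.5:
# illustrative sketch: bias when x and u are correlated (SLR.4 fails)
set.seed(12345)
n <- 1000
x <- 10*runif(n)
u <- 0.8*(x - mean(x)) + rnorm(n)  # error depends on x: E(u|x) != 0
y <- 1 + 0.5*x + u
coefficients( lm(y~x) )  # slope is close to 0.5 + 0.8 = 1.3, not 0.5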
Population model (DGP, Data Generating Process): \[ y = 1 + 0.5x + 2 \times N(0,1) \]
True parameter values are known: \(\beta_0 = 1\), \(\beta_1 = 0.5\), \(u = 2 \times N(0,1)\) (what is the variance of \(u\)?). Here \(N(0,1)\) represents a random draw from the standard normal distribution.
The values of \(x\) are drawn from the uniform distribution: \(x \sim 10 \times Unif(0,1)\)
Using random numbers we can generate artificial data sets. Then, for each data set we can apply the OLS method to find estimates.
After repeating these steps many times, say 1000, we would obtain 1000 slope and intercept estimates. Then we can analyze the sampling distribution of these estimates.
This is a simple example of Monte Carlo simulation experiment. These experiments may be useful in analyzing properties of estimators.
# Set the random seed
# So that we will obtain the same results
# Otherwise, simulation results will change
set.seed(1234567)
# set sample size
n <- 50
# the number of simulations
MCreps <- 10000
# set true parameters: betas and standard deviation of u
beta0 <- 1
beta1 <- 0.5
su <- 2
# initialize b0hat and b1hat to store results later:
b0hat <- numeric(MCreps)
b1hat <- numeric(MCreps)
# Draw a sample of x
# this is going to be fixed in repeated samples
x <- 10*runif(n,0,1)
# repeat MCreps times:
for(i in 1:MCreps) {
  # Draw a sample of y:
  u <- rnorm(n,0,su)
  y <- beta0 + beta1*x + u
  # estimate parameters by OLS and store them in the vectors
  bhat <- coefficients( lm(y~x) )
  b0hat[i] <- bhat["(Intercept)"]
  b1hat[i] <- bhat["x"]
}
# draw histogram and summary statistics
hist(b0hat)
summary(b0hat)
mean(b0hat)
sd(b0hat)
hist(b1hat)
summary(b1hat)
mean(b1hat)
sd(b1hat)
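Since the true parameter values are known in the simulation, the bias can be estimated directly by comparing the Monte Carlo averages with \(\beta_0 = 1\) and \(\beta_1 = 0.5\); under unbiasedness both differences should be close to zero:
# estimated bias: Monte Carlo average minus the true value
mean(b0hat) - beta0
mean(b1hat) - beta1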
# histogram of b1hat with a kernel density overlay
hist(b1hat,
     freq = FALSE,
     breaks = seq(0, 1, 0.025),
     axes = FALSE,
     main = expression("Sampling Distribution of b1hat"))
axis(1, at = seq(0, 1, 0.1), labels = TRUE, pos = 0)
axis(2, pos = 0)
lines(density(b1hat), lwd = 2, col = "blue")
# histogram of b0hat with a kernel density overlay
hist(b0hat,
     freq = FALSE,
     breaks = seq(-2, 4, 0.1),
     axes = FALSE,
     main = "Sampling Distribution of b0hat")
axis(1, at = seq(-1, 3, 1), labels = TRUE, pos = 0)
axis(2, pos = -2)
lines(density(b0hat), lwd = 2, col = "blue")
Unbiasedness of the OLS estimators \(\hat{\beta}_0\) and \(\hat{\beta}_1\) is a feature of the center of the sampling distributions.
We should also know how far we can expect \(\hat{\beta}_1\) to be away from \(\beta_1\) on average.
In other words, we should know the sampling variation in OLS estimators in order to establish efficiency and to calculate standard errors.
SLR.5: Homoscedasticity (constant variance assumption): This says that the variance of \(u\) conditional on \(x\) is constant, \(var(u|x) = var(u) = \sigma^2\)
Assumptions SLR.4 and SLR.5 can be rewritten in terms of the conditional mean and variance of \(y\):
\[\begin{align} E(y|x) &= \beta_0 + \beta_1 x \notag \\ var(y|x) &= \sigma^2 \notag \end{align}\]
(Figure: Simple regression model under homoscedasticity)
(Figure: Simple regression model under heteroscedasticity)
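The difference between the two cases can be visualized with simulated data. A minimal sketch (variable names are illustrative): in the first panel the error variance is constant, in the second it grows with \(x\):
# illustrative sketch: homoscedastic vs. heteroscedastic errors
set.seed(123)
ns <- 200
xs <- 10*runif(ns)
y_hom <- 1 + 0.5*xs + rnorm(ns, 0, 2)       # var(u|x) constant
y_het <- 1 + 0.5*xs + rnorm(ns, 0, 0.5*xs)  # sd(u|x) increases with x
par(mfrow = c(1,2))
plot(xs, y_hom, main = "Homoscedastic")
abline(1, 0.5, col = "blue")
plot(xs, y_het, main = "Heteroscedastic")
abline(1, 0.5, col = "blue")
par(mfrow = c(1,1))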
\[\begin{align} Var(\hat{\beta}_1) &= \frac{\sigma^2}{\sum_{i=1}^n (x_i - \overline{x})^2} \notag \\ \text{and} \notag \\ Var(\hat{\beta}_0) &= \frac{\sigma^2 \sum_{i=1}^n x_i^2}{n \sum_{i=1}^n (x_i - \overline{x})^2} \notag \end{align}\]
These formulas are not valid under heteroscedasticity (if SLR.5 does not hold).
Sampling variances of OLS estimators increase with the error variance and decrease with the sampling variation in \(x\)
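These formulas can be checked against the Monte Carlo experiment above. With \(x\) held fixed in repeated samples, the theoretical standard deviation of \(\hat{\beta}_1\) is \(\sigma/\sqrt{\sum_{i=1}^n (x_i - \overline{x})^2}\), which should be close to the simulated one:
# theoretical sd of b1hat, using x and su from the simulation above
su/sqrt(sum((x - mean(x))^2))
# compare with the Monte Carlo standard deviation
sd(b1hat)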
We would like to find an unbiased estimator for \(\sigma^2\).
Since by assumption we have \(E(u^2) = \sigma^2\), an unbiased estimator would be the sample average of the squared errors:
\[ \frac{1}{n}\sum_{i=1}^n u_i^2 \]
However, the errors \(u_i\) are unobservable. Replacing them with the OLS residuals \(\hat{u}_i\) gives
\[ \frac{1}{n}\sum_{i=1}^n \hat{u}_i^2 = \frac{SSE}{n} \]
This estimator is biased because the residuals must satisfy two restrictions, \(\sum_{i=1}^n \hat{u}_i = 0\) and \(\sum_{i=1}^n x_i\hat{u}_i = 0\). Correcting for these two degrees of freedom gives the unbiased estimator:
\[ \hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^n \hat{u}_i^2 = \frac{SSE}{n-2} \]
Its square root is \(\hat{\sigma} = \sqrt{SSE/(n-2)}\). The standard error of the OLS slope estimator can then be written as:
\[ se(\hat{\beta}_1) = \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^n (x_i - \overline{x})^2}} = \frac{\hat{\sigma}}{s_x} \]
where \(s_x = \sqrt{\sum_{i=1}^n (x_i - \overline{x})^2}\).
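These quantities can be computed by hand and compared with the output of summary(). A sketch using the last simulated sample (x, y) left over from the Monte Carlo code above:
# manual computation of se(b1hat) for the last simulated sample
fit <- lm(y~x)
uhat <- residuals(fit)                  # OLS residuals
sigmahat <- sqrt(sum(uhat^2)/(n - 2))   # sqrt(SSE/(n-2))
sigmahat/sqrt(sum((x - mean(x))^2))     # se(b1hat) by hand
coef(summary(fit))["x", "Std. Error"]   # should match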
In some rare cases we want \(y = 0\) whenever \(x = 0\). For example, tax revenue is zero whenever income is zero.
We can redefine the simple regression model without the constant term as follows: \(\tilde{y} = \tilde{\beta}_1x\)
Using the OLS principle, we minimize the sum of squared residuals:
\[ \min_{\tilde{\beta}_1} \sum_{i=1}^n (y_i - \tilde{\beta}_1x_i)^2 \]
The first order condition is:
\[ \sum_{i=1}^n x_i (y_i - \tilde{\beta}_1x_i) = 0 \]
Solving this we obtain the OLS estimator of the slope parameter: \[ \tilde{\beta}_1 = \frac{\sum_{i=1}^nx_iy_i}{\sum_{i=1}^n x_i^2} \]
For example,
# Regression through the origin
# load the ceosal1 data set (available in the wooldridge package)
data(ceosal1, package = "wooldridge")
res1 <- lm(salary ~ 0 + roe, data = ceosal1)
summary(res1)
##
## Call:
## lm(formula = salary ~ 0 + roe, data = ceosal1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1697.4 -309.1 -34.3 459.2 13589.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## roe 63.538 5.156 12.32 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1429 on 208 degrees of freedom
## Multiple R-squared: 0.422, Adjusted R-squared: 0.4193
## F-statistic: 151.9 on 1 and 208 DF, p-value: < 2.2e-16
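The reported coefficient can be reproduced directly from the closed-form expression \(\tilde{\beta}_1 = \sum_{i=1}^n x_iy_i / \sum_{i=1}^n x_i^2\):
# closed-form slope for the regression through the origin
with(ceosal1, sum(roe*salary)/sum(roe^2))  # matches the estimate above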
# Regression on a constant
res2 <- lm(salary ~ 1, data = ceosal1)
summary(res2)
##
## Call:
## lm(formula = salary ~ 1, data = ceosal1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1058.1 -545.1 -242.1 125.9 13540.9
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1281.12 94.93 13.5 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1372 on 208 degrees of freedom
# Full SLR
res3 <- lm(salary ~ roe, data = ceosal1)
summary(res3)
##
## Call:
## lm(formula = salary ~ roe, data = ceosal1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1160.2 -526.0 -254.0 138.8 13499.9
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 963.19 213.24 4.517 1.05e-05 ***
## roe 18.50 11.12 1.663 0.0978 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1367 on 207 degrees of freedom
## Multiple R-squared: 0.01319, Adjusted R-squared: 0.008421
## F-statistic: 2.767 on 1 and 207 DF, p-value: 0.09777
# scatter plot with the three fitted lines
plot(x = ceosal1$roe,
     y = ceosal1$salary,
     ylim = c(0, 4000),
     xlab = "Return on equity",
     ylab = "CEO salary")
abline(res1, col = "blue")   # regression through the origin
abline(res2, col = "red")    # constant only
abline(res3, col = "black")  # full SLR
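To tell the three fitted lines apart, a legend can be added (a minimal addition):
# label the three fitted lines
legend("topleft",
       legend = c("through origin", "constant only", "full SLR"),
       col = c("blue", "red", "black"), lty = 1)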