Review Exercises

EC3133

2024-12-06

Problem 1: Ordinary Least Squares (OLS)

Question

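A company wants to understand the relationship between advertising expenditure and sales. Using the monthly data below, estimate the simple linear regression of Sales on Advertising by OLS, first by hand and then with lm().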

Advertising and Sales Data
Month Advertising Sales
1 2 4
2 3 5
3 5 7
4 7 10
5 9 15
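
The R code in this problem refers to a data frame called advertising_data, which is not constructed anywhere in the text. A minimal sketch of how it could be built from the table above, assuming the column names match the table headers and the names used in the later code:

```r
# Build the advertising data set from the table
# (column names are assumed from the table headers and the later code)
advertising_data <- data.frame(
  Month       = 1:5,
  Advertising = c(2, 3, 5, 7, 9),
  Sales       = c(4, 5, 7, 10, 15)
)

# Quick look at the structure before estimating anything
str(advertising_data)
```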

Manual Solution

The OLS estimators are:

\[ \hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} \] \[ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X} \]
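
As a check on these formulas, the sums can be worked out by hand for the five observations in the table. This is a worked sketch; \(S_{XY}\) and \(S_{XX}\) are shorthand introduced here, not notation from the exercise.

\[ \bar{X} = \frac{2+3+5+7+9}{5} = 5.2, \qquad \bar{Y} = \frac{4+5+7+10+15}{5} = 8.2 \]

\[ S_{XY} = \sum (X_i - \bar{X})(Y_i - \bar{Y}) = 13.44 + 7.04 + 0.24 + 3.24 + 25.84 = 49.8 \]

\[ S_{XX} = \sum (X_i - \bar{X})^2 = 10.24 + 4.84 + 0.04 + 3.24 + 14.44 = 32.8 \]

\[ \hat{\beta}_1 = \frac{49.8}{32.8} \approx 1.5183, \qquad \hat{\beta}_0 = 8.2 - 1.5183 \times 5.2 \approx 0.305 \]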

# Calculate means
X_bar <- mean(advertising_data$Advertising)
Y_bar <- mean(advertising_data$Sales)

# Calculate beta1
numerator <- sum((advertising_data$Advertising - X_bar) * 
                 (advertising_data$Sales - Y_bar))
denominator <- sum((advertising_data$Advertising - X_bar)^2)
beta1_manual <- numerator/denominator

# Calculate beta0
beta0_manual <- Y_bar - beta1_manual * X_bar

# Print results
cat("Manual Calculation: \n")
## Manual Calculation:
cat("β₁ =", round(beta1_manual, 3), "\n")
## β₁ = 1.518
cat("β₀ =", round(beta0_manual, 3))
## β₀ = 0.305

Direct R Calculation

# Use lm() to calculate OLS coefficients
ols_model <- lm(Sales ~ Advertising, data = advertising_data)
summary(ols_model)$coefficients
##             Estimate Std. Error  t value    Pr(>|t|)
## (Intercept) 0.304878  1.0435206 0.292163 0.789201842
## Advertising 1.518293  0.1800244 8.433816 0.003498189

# Create scatter plot with regression line (requires the ggplot2 package)
library(ggplot2)
ggplot(advertising_data, aes(x = Advertising, y = Sales)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Sales vs. Advertising",
       x = "Advertising Expenditure",
       y = "Sales") +
  theme_minimal()

Problem 2: Instrumental Variables (IV)

Question

Estimate the return to education (the effect of Education on Income), using parental education as an instrument for the individual's own education.

Education and Income Data
Individual Education Income ParentalEd
1 10 30 12
2 12 35 14
3 8 25 10
4 15 50 16
5 9 28 11
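
The code below relies on a data frame called education_data that is not constructed in the text. A minimal sketch of how it could be built from the table above, assuming the column names match the table headers:

```r
# Build the education data set from the table
# (column names are assumed from the table headers and the later code)
education_data <- data.frame(
  Individual = 1:5,
  Education  = c(10, 12, 8, 15, 9),
  Income     = c(30, 35, 25, 50, 28),
  ParentalEd = c(12, 14, 10, 16, 11)
)

# Quick look at the structure before running the two stages
str(education_data)
```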

Manual Solution

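Two-Stage Least Squares (2SLS), with \(Y\) = Income, \(X\) = Education (the endogenous regressor), and \(Z\) = ParentalEd (the instrument):

Stage 1: Regress \(X\) on \(Z\) and save the fitted values \(\hat{X}\) \[ X = \pi_0 + \pi_1Z + v \]

Stage 2: Regress \(Y\) on \(\hat{X}\) \[ Y = \beta_0 + \beta_1\hat{X} + u \]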

# First stage regression
stage1_manual <- lm(Education ~ ParentalEd, data = education_data)
education_data$fitted_education_manual <- fitted(stage1_manual)

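# Second stage regression
# Note: the coefficients are the 2SLS estimates, but the standard errors that
# lm() reports here are not valid 2SLS standard errors, because the residuals
# are computed from the fitted values rather than from the actual Education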
stage2_manual <- lm(Income ~ fitted_education_manual, data = education_data)
summary(stage2_manual)$coefficients
##                          Estimate Std. Error    t value    Pr(>|t|)
## (Intercept)             -3.428571  6.4163816 -0.5343466 0.630164652
## fitted_education_manual  3.428571  0.5791589  5.9199150 0.009629824

Direct R Calculation

# First stage
stage1 <- lm(Education ~ ParentalEd, data = education_data)
education_data$fitted_education <- fitted(stage1)

# Second stage
stage2 <- lm(Income ~ fitted_education, data = education_data)
summary(stage2)$coefficients
##                   Estimate Std. Error    t value    Pr(>|t|)
## (Intercept)      -3.428571  6.4163816 -0.5343466 0.630164652
## fitted_education  3.428571  0.5791589  5.9199150 0.009629824
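
With one instrument and one endogenous regressor, the 2SLS slope reduces to the ratio cov(Z, Y) / cov(Z, X), because the second-stage regressor is a linear function of Z. The following sketch uses that ratio as an independent check on the two-step result; the object names beta1_iv, beta0_iv, and beta1_ols are introduced here for illustration.

```r
# Education data from the table above (rebuilt here so the block runs on its own)
education_data <- data.frame(
  Education  = c(10, 12, 8, 15, 9),
  Income     = c(30, 35, 25, 50, 28),
  ParentalEd = c(12, 14, 10, 16, 11)
)

# IV (ratio) estimator with a single instrument: cov(Z, Y) / cov(Z, X)
beta1_iv <- with(education_data,
                 cov(ParentalEd, Income) / cov(ParentalEd, Education))
beta0_iv <- with(education_data,
                 mean(Income) - beta1_iv * mean(Education))

# For comparison: the OLS slope, which treats Education as exogenous
beta1_ols <- with(education_data, cov(Education, Income) / var(Education))

round(c(beta0_iv = beta0_iv, beta1_iv = beta1_iv, beta1_ols = beta1_ols), 3)
```

The IV slope and intercept should match the second-stage coefficients reported above. For valid standard errors, a dedicated routine such as ivreg() from the AER package is one option.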

Problem 3: Maximum Likelihood Estimation (MLE)

Question

Estimate the rate parameter \(\lambda\) of an exponential distribution fitted to the light bulb lifetimes (in hours) below.

Light Bulb Lifetime Data
Bulb Lifetime
1 1000
2 1200
3 800
4 950
5 1100
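
The code in this problem uses a data frame called bulb_data that is not constructed in the text. A minimal sketch of how it could be built from the table above, assuming the column names match the table headers:

```r
# Build the light bulb data set from the table
# (column names are assumed from the table headers and the later code)
bulb_data <- data.frame(
  Bulb     = 1:5,
  Lifetime = c(1000, 1200, 800, 950, 1100)
)

# The sample mean lifetime is the only statistic the estimator needs
mean(bulb_data$Lifetime)
```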

Manual Solution

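For an exponential distribution with rate \(\lambda\), that is, with density \(f(y) = \lambda e^{-\lambda y}\) for \(y \ge 0\), the maximum likelihood estimator is: \[ \hat{\lambda} = \frac{1}{\bar{y}} \]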
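
The estimator follows from maximising the log-likelihood of an i.i.d. sample \(y_1, \dots, y_n\). A short derivation sketch, assuming the rate parameterisation \(f(y) = \lambda e^{-\lambda y}\):

\[ \ell(\lambda) = \sum_{i=1}^{n} \log\left(\lambda e^{-\lambda y_i}\right) = n \log \lambda - \lambda \sum_{i=1}^{n} y_i \]

\[ \frac{d\ell}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} y_i = 0 \quad \Rightarrow \quad \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} y_i} = \frac{1}{\bar{y}} \]

The second derivative, \(-n/\lambda^2\), is negative, so this stationary point is a maximum. For the five lifetimes in the table, \(\bar{y} = 5050/5 = 1010\), so \(\hat{\lambda} = 1/1010 \approx 0.00099\).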

# Manual calculation of lambda
lambda_manual <- 1/mean(bulb_data$Lifetime)
cat("Manual Calculation: \n")
## Manual Calculation:
cat("λ =", round(lambda_manual, 6))
## λ = 0.00099

Direct R Calculation

# Direct calculation using MLE
lambda_direct <- 1/mean(bulb_data$Lifetime)
cat("Direct Calculation: \n")
## Direct Calculation:
cat("λ =", round(lambda_direct, 6))
## λ = 0.00099

# Plot histogram with fitted exponential density
# after_stat(density) is the current ggplot2 spelling of the deprecated ..density..
ggplot(bulb_data, aes(x = Lifetime)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10, fill = "lightblue", color = "black") +
  stat_function(fun = dexp, args = list(rate = lambda_direct), color = "red") +
  labs(title = "Light Bulb Lifetimes with Fitted Exponential Distribution",
       x = "Lifetime (hours)",
       y = "Density") +
  theme_minimal()
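
Because the direct calculation above reuses the closed-form formula, it may be useful to confirm the estimate by maximising the log-likelihood numerically. The sketch below uses base R's optimize(); the function name loglik, the search interval, and the tolerance are illustrative choices rather than part of the exercise.

```r
# Lifetimes from the table above (rebuilt here so the block runs on its own)
lifetime <- c(1000, 1200, 800, 950, 1100)

# Exponential log-likelihood as a function of the rate parameter
loglik <- function(lambda) sum(dexp(lifetime, rate = lambda, log = TRUE))

# Maximise numerically; a tight tolerance is needed because lambda is tiny
fit <- optimize(loglik, interval = c(1e-6, 0.01), maximum = TRUE, tol = 1e-10)

lambda_numeric <- fit$maximum
lambda_closed  <- 1 / mean(lifetime)

# The two estimates should agree up to numerical tolerance
c(numeric = lambda_numeric, closed_form = lambda_closed)
```

The default tolerance of optimize() is on the order of 1e-4, which is coarse relative to a rate near 0.001, hence the explicit tol argument.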

Problem 4: Method of Moments (MoM)

Question

Estimate the mean and variance of a normal distribution fitted to the height data below, using the method of moments.

Height Data
Individual Height
1 160
2 170
3 165
4 175
5 180
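
The code in this problem uses a data frame called height_data that is not constructed in the text. A minimal sketch of how it could be built from the table above, assuming the column names match the table headers:

```r
# Build the height data set from the table
# (column names are assumed from the table headers and the later code)
height_data <- data.frame(
  Individual = 1:5,
  Height     = c(160, 170, 165, 175, 180)
)

# Sample size, needed for the 1/n divisor in the variance estimator
n <- nrow(height_data)
n
```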

Manual Solution

For a normal distribution, the method of moments estimators are:

\[ \hat{\mu} = \bar{Y} \] \[ \hat{\sigma}^2 = \frac{1}{n} \sum (Y_i - \bar{Y})^2 \]
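
These estimators come from equating the first two population moments of \(N(\mu, \sigma^2)\) to their sample counterparts. A worked sketch for the five heights in the table:

\[ E[Y] = \mu, \qquad E[Y^2] = \mu^2 + \sigma^2 \quad \Rightarrow \quad \hat{\mu} = \bar{Y}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum Y_i^2 - \bar{Y}^2 = \frac{1}{n}\sum (Y_i - \bar{Y})^2 \]

\[ \bar{Y} = \frac{850}{5} = 170, \qquad \sum (Y_i - \bar{Y})^2 = 100 + 0 + 25 + 25 + 100 = 250, \qquad \hat{\sigma}^2 = \frac{250}{5} = 50 \]

The divisor matters in a sample this small: the unbiased sample variance is \(s^2 = 250/4 = 62.5\), which is the value R's var() returns because it divides by \(n - 1\).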

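# Manual calculation of mean and variance
# Note: var() divides by n - 1, so the MoM variance is computed explicitly with 1/n
mu_manual <- mean(height_data$Height)
sigma2_manual <- mean((height_data$Height - mu_manual)^2)
## Manual Calculation:
## μ̂ = 170
## σ̂² = 50

Direct R Calculation

# Direct calculation using R functions
# var() returns the sample variance (divisor n - 1), so rescale by (n - 1)/n
n <- nrow(height_data)
mu_direct <- mean(height_data$Height)
sigma2_direct <- var(height_data$Height) * (n - 1) / n
## Direct Calculation:
## μ̂ = 170
## σ̂² = 50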

Now let’s plot the histogram with fitted normal density:
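
The plotting code does not appear in the text. The sketch below is one way to draw it, modelled on the light bulb plot in Problem 3. It assumes the ggplot2 package is installed; the names mu_hat, sigma2_hat, and p and the choice of 5 bins are illustrative.

```r
library(ggplot2)

# Height data from the table above (rebuilt here so the block runs on its own)
height_data <- data.frame(Height = c(160, 170, 165, 175, 180))

# Method of moments estimates (the variance uses the 1/n divisor)
mu_hat     <- mean(height_data$Height)
sigma2_hat <- mean((height_data$Height - mu_hat)^2)

# Histogram on the density scale with the fitted normal curve overlaid
p <- ggplot(height_data, aes(x = Height)) +
  geom_histogram(aes(y = after_stat(density)), bins = 5,
                 fill = "lightblue", color = "black") +
  stat_function(fun = dnorm,
                args = list(mean = mu_hat, sd = sqrt(sigma2_hat)),
                color = "red") +
  labs(title = "Heights with Fitted Normal Distribution",
       x = "Height",
       y = "Density") +
  theme_minimal()

print(p)
```

With only five observations the histogram is very coarse, so the overlay is a rough visual check rather than a test of normality.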

Things to Remember

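  1. OLS is appropriate when the relationship is linear in the parameters and the regressors are exogenous (uncorrelated with the error term)
  2. IV addresses endogeneity, provided the instrument is relevant (correlated with the endogenous regressor) and exogenous (uncorrelated with the error term)
  3. MLE is asymptotically efficient when the assumed distribution is correct, but it requires specifying that distribution
  4. MoM is simple and needs only moment conditions, but it may be less efficient than MLE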

Remember: