Example

ECOM30001/ECOM90001
Basic Econometrics
Semester 1, 2025

Week 3, Lecture 1

Basic Linear Model: Model Specification

Reading: Hill et al., Chapter 6

  1. Hypothesis Testing about more than one parameter: The F-test
    • Testing the significance of a Model
  2. Model Specification
    • Some General Considerations
    • Interpretation in Log Models
    • A Reciprocal Model
    • Simulation Example

Lecture Objectives

Basic Linear Model: Model Specification I

  1. Testing the significance of a Model using an F-test
  2. Model Specification: Functional Form
    • What are the general issues guiding choice of appropriate functional form?
    • How to interpret parameters in a linear, log-linear, linear-log and log-log model?
    • How to interpret the parameters in a reciprocal model
    • Some general principles for choosing functional form

The F-test

  • consider the (general) econometric model \[\color {blue}{y_i=\beta_0+\beta_1\,X_{1i} + \dots+\beta_K\, X_{Ki} + \varepsilon_i}\]

  • if \(\color{green}{\varepsilon_i}\) are normally distributed (or N is sufficiently large), we can test one or more hypotheses about the unknown population parameters using the F-test

  • how: compare the sum of squared residuals (RSS) from an unrestricted model to the RSS from a restricted model in which H0 is assumed to be true


Unrestricted and Restricted Model

  • recall the OLS principle:

    • the OLS estimators are the solution to a minimization problem (minimize the sum of squared errors)
  • the sum of squared residuals (RSS) is the minimized value of the objective function, that is, the sum of squared errors evaluated at the solution to the minimization problem \(\color {blue}{b_0,b_1,b_2, \dots}\)

  • in our example, the restricted model imposes \(\color {red}{\beta_2= 0}\)

  • the minimized value of the objective function for the restricted model can never be smaller than that achieved for the unrestricted model. Why?


The F test continued

  • By the property of the minimum, the \(\color{green}{RSS}\) associated with the restrictions \(\color{red}{(RSS_R)}\) cannot be lower than the \(\color{green}{RSS}\) associated with no restrictions \(\color{red}{(RSS_{UR})}\).

  • if the null hypothesis is true, we expect that the difference in the RSS associated with restrictions compared to the RSS without the restrictions should be small

  • if the null hypothesis is not true, we expect that the RSS with the restrictions imposed is considerably larger than the RSS without the restrictions.

    • intuitively, the (false) null hypothesis has significantly reduced the ability of the model to fit the data (and considerably raised the RSS).
  • By the property of the minimum \(\color{red}{\left(RSS_R-RSS_{UR} \right) \geq 0}\)

  • Let \(\color {green}{M}\) denote the number of restrictions on the unknown parameters and define the random variable \[\color{blue}{F=\dfrac{\left(RSS_R-RSS_{UR} \right)/M}{RSS_{UR}/(N-K-1)}} \thicksim F(M,N-K-1)\] if \(H_0\) is true, this random variable \(F\) follows an F-distribution with \(M\) numerator and \((N-K-1)\) denominator degrees of freedom

  • where \[ \begin{align} \color{green}{RSS_R} &= {\color{green}{\sum \hat{e}_i^2}} \text{ under } H_0 \text{ imposing the restrictions}\\ \color{green}{RSS_{UR}} &= {\color{green}{\sum \hat{e}_i^2}} \text{ in the unrestricted model } \end{align} \]

  • if the null hypothesis is not true \(\color{red}{\left(RSS_R-RSS_{UR} \right)}\) should be relatively large

    • the restrictions considerably reduce the ability of the model to fit the data, and the sample value of the F-statistic becomes relatively large
  • when the sample F-statistic becomes ‘sufficiently large’ we will reject \(H_0\)

  • the judgement about what value is ‘too large’ is evaluated by comparing the sample value to some \(F_c\) such that: \[ {\color{blue}{Pr[F \geq F_c]=\alpha}} \quad \text{ where } \alpha \text{ is the level of significance}\]
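The formula for the F-statistic can be sketched in a few lines of Python. The function name `f_statistic` and all numbers below are hypothetical and chosen only for illustration; the critical value \(F_c\) would still come from F tables or statistical software.

```python
def f_statistic(rss_r, rss_ur, m, n, k):
    """F = [(RSS_R - RSS_UR) / M] / [RSS_UR / (N - K - 1)]."""
    if rss_r < rss_ur:
        # By the property of the minimum this should never happen
        raise ValueError("RSS_R cannot be smaller than RSS_UR")
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k - 1))


# Hypothetical values: M = 2 restrictions, N = 53 observations, K = 2 regressors
f_value = f_statistic(rss_r=120.0, rss_ur=100.0, m=2, n=53, k=2)
print(f_value)
```

A large value of `f_value` relative to \(F_c\) would lead to rejection of \(H_0\).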

F-tests and t-tests

  • the null hypothesis \(\color{red}{\beta_2=0}\) is a test regarding a single restriction. In this case, we can use a t-test to test the hypothesis.

  • you will notice that the p-value associated with the F-test statistic and p-value associated with the t-test statistic are identical

  • Result: for a two-tailed test about a single coefficient (i.e. a single restriction), we have \[\color{blue}{\{t(N-K-1)\}}^2=F(1,N-K-1)\] the square of a t random variable with \(\color{green}{N-K-1}\) df is an F random variable with distribution \(\color{green}{F(1,N-K-1)}\)

  • this equivalence does not hold for a one-tailed test
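The equivalence \(t^2=F\) can be checked numerically. The sketch below is only an illustration: it uses a simple regression with a single regressor (so \(K=1\) and the restriction is that the slope is zero) and a small made-up data set, rather than the lecture's own example.

```python
import math

# Hypothetical data, used only for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS estimates for y = b0 + b1 * x
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Unrestricted RSS, and restricted RSS (intercept only, so RSS_R = TSS)
rss_ur = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
rss_r = sum((yi - y_bar) ** 2 for yi in y)

sigma2_hat = rss_ur / (n - 2)                 # N - K - 1 with K = 1
t_stat = b1 / math.sqrt(sigma2_hat / s_xx)    # t-statistic for H0: slope = 0
f_stat = ((rss_r - rss_ur) / 1) / sigma2_hat  # F-statistic with M = 1

print(t_stat ** 2, f_stat)
```

The two printed numbers agree up to floating-point rounding.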

Testing the Significance of a model

  • in the MLRM, testing the overall significance of the model amounts to testing whether there is a significant relationship between \(y\) and all of the included \(X\)’s: \[\color{blue}{y_i=\beta_0+\beta_1\,X_{1i}+ \dots+\beta_K\, X_{Ki}+\varepsilon_i}\]
    \[ \begin{align} \color{red}{H_0}:& \color{red}{\beta_1 =\beta_2=\dots=\beta_K=0}\\ \color{red}{H_A}: & \color{red}{\text{ at least one } \beta_j \neq 0 \text{ for }j=1,2,\dots,K} \end{align}\]

  • if the null hypothesis is true then none of our explanatory variables influence \(y\) and our model is of little value

  • note however \(\color{red}{H_0}\) involves \(\color{green}{K}\) restrictions

  • rejection of \(\color{red}{H_0}\) does not tell us which of the included \(\color{green}{X}\) variables are important in determining \(y\)

    • it only tells us that at least one of the included \(\color{green}{X}\)’s is important (statistically)
  • the unrestricted model includes all of the \(X\)’s: \[\color{blue}{y_i=\beta_0+\beta_1\,X_{1i}+ \dots+\beta_K\, X_{Ki}+\varepsilon_i}\]

  • the restricted model (imposing K restrictions) \[\color{blue}{y_i=\beta_0+\varepsilon_i}\]

Note

The RSS from a model with only a constant is equal to the total sum of squares \(\color{green}{\sum(y_i-\bar{y})^2}\). This means \(\color{blue}{RSS_R=TSS}\)

  • we do not need to estimate the restricted model to get \(\color{green} {RSS_R}\) since \(\color{blue}{TSS}\) will be the same in both the restricted and the unrestricted model. Why?

  • the F-statistic becomes \[\color{blue}{F=\dfrac{(RSS_R-RSS_{UR})/M}{RSS_{UR}/(N-K-1)}=\dfrac{(TSS-RSS)/K}{RSS/(N-K-1)}}\]

  • this F-statistic has \(\color{green}{K}\) numerator df and \(\color{green}{N-K-1}\) denominator df

  • R computes the sample F-statistic for this test of overall significance and reports it in the regression output as F-statistic

  • recall \[\color{blue}{R^2=\dfrac{\sum(\hat{y}_i-\bar{y})^2}{\sum(y_i-\bar{y})^2}= 1- \dfrac{RSS}{TSS}}\]
    so the sample F-statistic for testing the significance of the model becomes: \[\color{green}{F=\dfrac{(TSS-RSS)/K}{RSS/(N-K-1)}= \dfrac{R^2/K}{(1-R^2)/(N-K-1)}}\]
    since \(\color{blue}{(TSS-RSS)=TSS*R^2}\) and \(\color{blue}{RSS=TSS(1-R^2)}\)
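The two expressions for the overall-significance F-statistic can be compared directly. The helper names and the values of TSS, RSS, N and K below are hypothetical, picked only so the arithmetic is easy to follow.

```python
def f_from_rss(tss, rss, n, k):
    """F = [(TSS - RSS) / K] / [RSS / (N - K - 1)]."""
    return ((tss - rss) / k) / (rss / (n - k - 1))


def f_from_r2(r2, n, k):
    """F = [R^2 / K] / [(1 - R^2) / (N - K - 1)]."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Hypothetical values
tss, rss, n, k = 200.0, 50.0, 104, 3
r2 = 1 - rss / tss

f1 = f_from_rss(tss, rss, n, k)
f2 = f_from_r2(r2, n, k)
print(r2, f1, f2)
```

Both routes give the same value because TSS cancels from the numerator and the denominator.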

Individual or Joint Tests

  • consider \(H_0:\beta_1=\beta_2=0\)

    • why not just perform a t-test for each of the null hypotheses \(H_0:\beta_1=0\) and \(H_0: \beta_2=0\) ?
  • the reason is \(\color{green}{\text{corr}(b_1,b_2)}\) is not necessarily zero so that the F-testing procedure makes allowance for correlations between the OLS estimators

  • the F test is a joint test for whether the pair of values \(\beta_1=0\) and \(\beta_2=0\) are consistent with the data

  • testing \(\beta_1=0\) using a t-test does not take into account the possibility that \(\beta_2=0\) and no allowance is made for \(\color{green}{\text{corr}(b_1,b_2)}\)

Worked Examples

Worked Example 1

  • Using the car package in R
  • Hypothesis testing for a single parameter: t-test and F-test

Worked Example 2

  • Using the car package in R
  • Hypothesis testing the overall significance of a model: F-test

Worked Example 3

  • Using the car package in R
  • Testing joint linear hypotheses: F-test

Additional Example

  • Using the car package in R
  • Testing the overall significance of a model: F-test
  • Testing hypotheses in a quadratic model: t-test and F-test

Model Specification

  • in any econometric analysis, specification of the model is one of the first steps in the econometric methodology

  • three essential features of model specification are

    1. choice of functional form
    2. choice of explanatory variables
      • omission of relevant explanatory variables
      • inclusion of irrelevant variables
    3. examining whether the assumptions of the MLRM hold, and if not which assumptions are violated
  • for items 1 and 2, economic principles and logical reasoning play a prominent role

Functional Form

  • The MLRM does not necessarily restrict the relationship between \(X\) and \(y\) to be linear.

    • Often economic theory implies a non-linear relationship between the variables \(X\) and \(y\).
  • However, it does restrict the way the parameters \(\beta_j\) enter the econometric model

    • The econometric model must be linear in parameters.

    • The parameters \(\beta_j\) cannot be multiplied together, divided, squared, etc.

  • The variables \(X\) and \(y\) can be transformed in any way, as long as the resulting model satisfies the assumptions of the regression model.

General principle

Choose a functional form that is sufficiently flexible to fit the data while preserving the assumptions about the random error term.

Summary

Linear Model
\[\color{blue}{y=\beta_0+\beta_1\,X + \varepsilon}\]
where
\[\color{red}{\beta_1= \dfrac{\Delta E[y|X]}{\Delta X}}\] so \(\color{green}{\beta_1}\) represents the slope of the conditional mean function.

Log-Linear Model \[\color{blue}{\text{ln}\,y=\beta_0+\beta_1\,X + \varepsilon}\] so
\[\color{red}{(100*\beta_1) \approx \left( \dfrac{\% \Delta E[y|X]}{\Delta X} \right)}\]

so \(\color{blue}{(100*\beta_1)}\) represents the (approximate) percentage change in \(E[y|X]\) associated with a one-unit change in the level of \(X\), for a ‘small’ change in \(X\) (semi-elasticity)

Log-Linear Model

Model (2): \(\color{red}{b_1=0.07676}\) so an additional year of education raises average wages by approximately 7.68%
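The reading \(100*b_1\) is an approximation. A quick check compares it with the exact percentage change in the predicted value implied by the log-linear form, \(100*(e^{b_1}-1)\). This is a sketch that ignores the error term; the variable names are illustrative.

```python
import math

b1 = 0.07676                           # estimate reported for Model (2)

approx_pct = 100 * b1                  # approximate percentage change
exact_pct = 100 * (math.exp(b1) - 1)   # exact change for a one-unit increase in X

print(round(approx_pct, 2), round(exact_pct, 2))
```

For an estimate of this size the two readings differ by about 0.3 percentage points; the gap widens as the coefficient grows.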

Linear-Log Model

Linear-Log Model \[\color{blue}{y= \beta_0+\beta_1\, \text{ln}\,X + \varepsilon}\]
so \[\color{red}{\dfrac{\beta_1}{100}= \dfrac{1}{100}* \left( \dfrac{\Delta E[y|X]}{\Delta X/X} \right) = \left(\dfrac{\Delta E[y|X]}{\% \Delta X} \right)}\]
so \(\color{red}{\beta_1/100}\) represents the level change in \(E[y|X]\) associated with a one percent change in the level of \(X\), for a small change in \(X\).

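Alternatively, extrapolating this approximation, \(\color{blue}{\beta_1}\) is roughly the change in \(E[y|X]\) associated with a 100% change in \(X\). The approximation is only accurate for small changes: an exact doubling of \(X\) changes \(E[y|X]\) by \(\beta_1\,\text{ln}\,2 \approx 0.69\,\beta_1\).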
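A sketch of where the approximation comes from, ignoring the error term; \(\Delta X\) denotes a change from a starting level \(X\):
\[ \Delta E[y|X] = \beta_1 \left[ \text{ln}(X+\Delta X) - \text{ln}\,X \right] = \beta_1\, \text{ln}\left(1+\dfrac{\Delta X}{X}\right) \approx \beta_1\, \dfrac{\Delta X}{X} \]
For a 1% increase in \(X\), \(\text{ln}(1.01) \approx 0.00995\), so the change in \(E[y|X]\) is very close to \(\beta_1/100\). The approximation \(\text{ln}(1+a) \approx a\) deteriorates as \(a\) grows, so it is best reserved for small percentage changes.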

Log-Log Model

Log-Log Model \[\color{blue}{\text{ln}\,y= \beta_0+\beta_1\, \text{ln}\,X + \varepsilon}\]
so \[\color{red}{\beta_1= \dfrac{\Delta E[\text{ln}\,y|X]}{\Delta \text{ln}\,X} \approx \dfrac{\Delta E[y|X]/E[y|X]}{\Delta X/X}}\]
so
\[\color{red}{\beta_1 \approx \dfrac{100}{100}* \left( \dfrac{\Delta E[y|X]/E[y|X]}{\Delta X/X} \right) = \dfrac{\% \Delta E[y|X]}{\% \Delta X}}\]
so \(\color{blue}{\beta_1}\) represents the (approximate) percentage change in \(E[y|X]\) associated with a one percent change in the level of \(X\).
Note that the parameter \(\beta_1\) can be interpreted as an elasticity.

Example: GDP and Child Labour

Question: What is the relationship between child labour and GDP per-capita?

  • we expect a negative relationship between GDP per-capita and the share of child labour
    • countries with larger GDP per capita will tend to have, on average, a lower share of child labour
  • but it might be a non-linear relationship - the child labour share might decline sharply with GDP per-capita
    • the reduction in child labour share might depend upon the level of GDP per-capita
    • increases in GDP per-capita at low levels of GDP per-capita might have a greater effect upon the child labour share

  • theory suggests a non-linear relationship between GDP per-capita and child labour but does not provide the functional form of the relationship - as GDP per-capita increases, the slope of the conditional mean becomes less negative

  • plot of the data suggests negative but non-linear relationship between GDP per-capita and child share

  • plot of data suggests that the following reciprocal econometric model might be appropriate: \[\color{blue}{\text{cshare}_i = \beta_0 + \beta_1 \, \dfrac{1}{\text{gdp}_i}+ \varepsilon_i}\]

  • as gdp (per-capita) \(\rightarrow \infty\), \(E[\text{cshare|gdp}] \rightarrow \beta_0\)

  • slope becomes flatter as GDP per-capita increases:
    \[\color{red} {\dfrac {\Delta E[\text{cshare|gdp}]}{\Delta \text{gdp}} = - \beta_1 \dfrac{1}{\text{gdp}^2}}\]

  • slope depends upon the level of GDP per-capita

  • when \(\beta_1>0\) , slope is negative for all values of GDP per-capita

  • in the linear model, \(\color{blue}{b_1= -0.2696}\) - negative relationship between child share and GDP per-capita
    • additional $100 of income reduces child share by 0.27 percentage points
  • look at the fitted values - linear model predicts negative child share for some ‘wealthy’ countries (red line)

Reciprocal Model

GDP and Child Labour (OLS Residuals)

GDP and Child Labour Reciprocal Model (OLS Residuals)

GDP and Child Labour Reciprocal Model (Fitted Values)

  • \(\color{blue}{b_1>0}\) so the slope of the fitted regression line is negative for all values of GDP per-capita

  • Policy Implication: raising GDP per-capita of the ‘poorest’ countries will have the largest effect upon the child labour share

  • look at the fitted values

  • evaluating at mean GDP per-capita of $15,156, average slope is approximately -0.056. Compare to estimated linear effect of -0.2696

\[\color{red}{\dfrac {\Delta E[\text{cshare|gdp}]}{\Delta \text{gdp}} = -b_1\, \dfrac{1}{\overline{\text{gdp}}^2}} = - \dfrac{12.8673}{(15.156)^2}=-0.056\]

  • as GDP per-capita \(\rightarrow \infty\), child share \(\rightarrow \approx 3 \%\) - and statistically significantly different from zero.
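The slope calculation for the reciprocal model can be reproduced with a short sketch. It assumes GDP per-capita is measured in thousands of dollars (inferred from the value 15.156 used for the mean of $15,156); the function name and the low and high income levels are hypothetical.

```python
def reciprocal_slope(b1, gdp):
    """Slope of cshare = b0 + b1 * (1 / gdp) with respect to gdp: -b1 / gdp^2."""
    return -b1 / gdp ** 2


b1 = 12.8673        # reciprocal-model estimate quoted in the lecture
mean_gdp = 15.156   # mean GDP per-capita, assumed to be in thousands of dollars

slope_at_mean = reciprocal_slope(b1, mean_gdp)
slope_poor = reciprocal_slope(b1, 2.0)    # hypothetical low-income country
slope_rich = reciprocal_slope(b1, 40.0)   # hypothetical high-income country

print(round(slope_at_mean, 3), round(slope_poor, 3), round(slope_rich, 3))
```

The slope is negative everywhere but much steeper at low income levels, which is the basis of the policy implication.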

Choosing a Functional Form

  • economic theory may not provide enough information to identify which functional form is appropriate

  • several alternative functional forms may be consistent with the restrictions suggested by economic theory

  • we need to choose a functional form that is:

    1. sufficiently flexible to fit the data
    2. while at the same time preserving the assumptions about the error term

What to do?

  • plot the data - check whether for larger values of \(X\), \(y\) tends to increase (or decrease) at an increasing, constant or decreasing rate. This might give us some indication of the appropriate functional form

  • pick a functional form and plot the residuals - check whether the residuals for the chosen functional form are consistent with zero mean and constant variance random errors.

  • ideally there should be no (systematic) pattern of any sort in the residuals

  • if there does appear to be a systematic pattern, then maybe an alternative functional form is appropriate

  • next lecture: testing for ‘incorrect’ functional form

Simulation Example

Consider the following econometric model
\[\color{blue}{y_i = \beta_0+\beta_1\, X_{1i} + \beta_2\, X_{2i}+ \beta_3\, X_{2i}^2+ \varepsilon_i \qquad \varepsilon_i|X_i \thicksim \mathcal{N}(0,1)}\]

The true values of the parameters are given by: \[ \color{green}{ \begin{align} \beta_0 & = 1 \\ \beta_1 & = 2 \\ \beta_2 & = 3 \\ \beta_3 & = 4 \end{align} } \]

\(X\) is bivariate normally distributed \[ \color{red}{ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim \mathcal{N} \left( \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \right) } \]

with \(\color{blue}{\text{E}(X_1)=1,\text{E}(X_2)=2,\text{VAR}(X_1)=1,\text{VAR}(X_2)=2}\) and \(\color{red}{\text{COV}(X_1,X_2)=0}\). Conditional on \(X\), \(y\) is normally distributed \[ \color{green}{y_i|X_i \thicksim \mathcal{N}\left(\beta_0+\beta_1\, X_{1i}+\beta_2\, X_{2i}+\beta_3\, X_{2i}^2,1 \right)} \]

Suppose instead that we estimate the following ‘incorrect’ model, ignoring the quadratic relationship in \(X_2\): \[\color{green}{y_i=\beta_0+\beta_1\, X_{1i}+ \beta_2\, X_{2i} + \varepsilon_i}\]

We are estimating the wrong functional form, imposing restriction \(\color{green}{\beta_3=0}\).

omitted variable bias: omitted variable \(X_{2i}^2\)

  • \(\color{green}{\beta_3>0}\) and \(\color{green}{\text{COV}(X_2,X_2^2)>0}\)

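  • relative to the true model, the estimated coefficients in the ‘incorrect’ model will generally be biased (in this design the intercept and the coefficient on \(X_2\) are affected; \(X_1\) is independent of \(X_2\), so its coefficient is not)

  • the estimator of \(\color{green}{\beta_2}\) will be upward biased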

  • note that the bias is a property of the estimator. We cannot determine the sign and magnitude of the bias from a single estimate.
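A sketch of the size of the bias in this design, using the standard omitted-variable result. The auxiliary coefficients \(\delta_j\) are notation introduced here: they are the population coefficients from regressing the omitted variable on the included ones, \[ X_{2i}^2 = \delta_0 + \delta_1\, X_{1i} + \delta_2\, X_{2i} + v_i \] so that, in large samples, the OLS estimators from the ‘incorrect’ model tend to \(\beta_j + \beta_3\, \delta_j\). Because \(X_1\) is independent of \(X_2\), \(\delta_1 = 0\). For a normal variable with mean \(\mu\) and variance \(\sigma^2\), \(\text{COV}(X,X^2)=2\mu\sigma^2\), so \[ \delta_2 = \dfrac{\text{COV}(X_2,X_2^2)}{\text{VAR}(X_2)} = \dfrac{2 \cdot 2 \cdot 2}{2} = 4 \qquad \delta_0 = \text{E}(X_2^2) - \delta_2\, \text{E}(X_2) = 6 - 8 = -2 \] Under these assumptions \[ b_0 \rightarrow 1 + 4(-2) = -7 \qquad b_1 \rightarrow 2 \qquad b_2 \rightarrow 3 + 4(4) = 19 \] so the estimator of \(\beta_2\) is centred far above the true value of 3.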

Why is \(\text{COV}(X_1,X_2) \neq 0\)

The ‘incorrect’ model can be written as: \[\color{blue}{y_i = \beta_0 + \beta_1\, X_{1i}+\beta_2\, X_{2i} + \left \{\beta_3 \, X_{2i}^2 + \varepsilon_i \right\}}\]

  • the role of the residuals is to capture everything that is not in the model
  • the pattern that would otherwise be explained by the true model is revealed in the residuals

  • Although we could look for the missing pattern by trying different transformations of the regressors one at a time, a more efficient way to check for misspecification is to compare what has been explained by the model \(\left( \hat{y}_i \right)\) against what is not explained by the model \(\left( \hat{e}_i \right)\).

  • our misspecified model has omitted a quadratic term. The scatter plot of the predicted values \(\left( \hat{y}_i \right)\) against the residuals \(\left( \hat{e}_i \right)\) shows a parabolic relationship.
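The simulation can be sketched with the Python standard library alone. Everything below is an illustration under stated assumptions: the sample size, the seed and the helper `ols` are choices made here, and the auxiliary regression of the residuals on squared fitted values is only a crude numerical stand-in for the scatter plot, not a formal test.

```python
import random

random.seed(12345)                 # arbitrary seed, for reproducibility
N = 2000                           # assumed sample size
beta = (1.0, 2.0, 3.0, 4.0)        # true parameter values from the lecture

x1 = [random.gauss(1.0, 1.0) for _ in range(N)]
x2 = [random.gauss(2.0, 2.0 ** 0.5) for _ in range(N)]   # VAR(X2) = 2
y = [beta[0] + beta[1] * u + beta[2] * v + beta[3] * v * v + random.gauss(0.0, 1.0)
     for u, v in zip(x1, x2)]


def ols(columns, y):
    """OLS via the normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination."""
    k = len(columns)
    # Augmented matrix [X'X | X'y]
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [p - factor * q for p, q in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


ones = [1.0] * N

# 'Incorrect' model: omits the quadratic term in X2
b = ols([ones, x1, x2], y)
fitted = [b[0] + b[1] * u + b[2] * v for u, v in zip(x1, x2)]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Regress residuals on centred fitted values and their squares;
# a clearly non-zero coefficient on the square signals a parabolic pattern
f_bar = sum(fitted) / N
fc = [f - f_bar for f in fitted]
g = ols([ones, fc, [f * f for f in fc]], resid)

print("misspecified estimates:", [round(v, 2) for v in b])
print("coefficient on squared fitted values:", g[2])
```

In a run of this sketch the estimate on \(X_2\) lies far above the true value of 3, and the positive coefficient on the squared fitted values reflects the parabolic shape seen in the residual plot.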