Longley Economic Regression Data Visualized

aiooo
20.09.2014

Longley Economic Regression Data

A data frame with 7 economic variables, observed yearly from 1947 to 1962 (n=16):

  • Gross National Product
  • GNP deflator
  • number of unemployed
  • number of people in the armed forces
  • noninstitutionalized population ≥ 14 years of age
  • number of employed
  • year of observation.
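As a quick check of the description above, the data frame can be inspected directly. This is a minimal sketch that assumes the `longley` data set from base R's `datasets` package, which the plotting code further down also uses; the names `n_obs`, `n_vars` and `years` are introduced here for illustration only.

```r
# longley ships with base R (datasets package); no extra packages needed
data(longley)

# Structure: column names and types of the seven variables
str(longley)

# Number of yearly observations and number of variables
n_obs  <- nrow(longley)
n_vars <- ncol(longley)

# First and last year of observation
years <- range(longley$Year)
```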

This macroeconomic data set provides a well-known example of a highly collinear regression.

J. W. Longley (1967). An appraisal of least squares programs for the electronic computer from the point of view of the user. Journal of the American Statistical Association 62, 819–841.
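One way to see why the data set is called highly collinear is to look at the pairwise correlations between its columns. The sketch below uses only base R; the cutoff of 0.95 for flagging a pair is an arbitrary choice made for this illustration, not a value taken from the references.

```r
data(longley)

# Pairwise Pearson correlations among all seven variables
cor_mat <- cor(longley)
round(cor_mat, 2)

# Flag variable pairs with |r| > 0.95 (illustrative threshold, an assumption);
# upper.tri() keeps each pair once and drops the diagonal
high <- which(abs(cor_mat) > 0.95 & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = round(cor_mat[high], 3)
)
high_pairs
```

Several pairs, such as GNP and Year, are almost perfectly correlated, which is what makes their separate effects hard to estimate.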

Collinearity

Collinearity, also called multicollinearity, is a condition where the model's predictor variables are highly intercorrelated. A consequence of this situation is the inability to estimate the model's regression coefficients with acceptable precision. Therefore, models with this problem are not considered useful. (Heiberger and Holland, Statistical Analysis and Data Display)
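The loss of precision can be quantified with the variance inflation factor (VIF), a standard diagnostic that is not part of the quoted text and is added here only as an illustration. For predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the others. The sketch assumes `Employed` is the response, as in the usual Longley regression, and computes the VIFs in base R from the inverse of the predictors' correlation matrix.

```r
data(longley)

# Predictors only; treating Employed as the response is an assumption
X <- longley[, setdiff(names(longley), "Employed")]

# The VIFs are the diagonal of the inverse correlation matrix of the predictors
vif <- setNames(diag(solve(cor(X))), names(X))
round(vif, 1)

# Cross-check one value against the definition VIF = 1 / (1 - R^2),
# regressing GNP on the remaining predictors
r2_gnp  <- summary(lm(GNP ~ ., data = X))$r.squared
vif_gnp <- 1 / (1 - r2_gnp)
```

A VIF of 1 means a predictor is uncorrelated with the others; values in the hundreds or thousands mean the variance of its coefficient estimate is inflated by that factor.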

Collinearity symptoms:

  • high R-squared
  • most of the predictors have regression coefficients that are not significantly different from zero
  • in scatterplots of one predictor against another, the data points congregate close to a straight line
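The symptoms listed above can be checked on the Longley data with a few lines of base R. This is a sketch under assumptions: `Employed` is taken as the response and all other columns as predictors, and the four variables shown in the scatterplot matrix are picked only for illustration.

```r
data(longley)

# Regress employment on all remaining variables
fit <- lm(Employed ~ ., data = longley)
s   <- summary(fit)

# Symptom 1: R-squared of the full model
s$r.squared

# Symptom 2: estimates with their standard errors and p-values
round(coef(s), 4)

# Symptom 3: scatterplot matrix of a few predictors; strongly collinear
# pairs show points lying close to a straight line
pairs(longley[, c("GNP.deflator", "GNP", "Population", "Year")],
      col = "#377EB8")
```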

Example of Longley collinear data

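 # Bar plot of Gross National Product by year
 # (longley is available in base R's datasets package)
 barplot(longley$GNP,
         col = "#377EB8",
         main = "GNP",
         ylab = "GNP",
         xlab = "Year",
         names.arg = longley$Year)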

