Load data from the file Griliches.csv (see the full description here) and save it as wages.
Choose columns lw (natural logarithm of wage), expr (experience in years), age (age in years) and iq, and save them as small.
Plot a matrix of scatterplots for the pairs of variables chosen before. Which variables are correlated positively? Negatively?
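A minimal sketch of the loading, selection and plotting steps above, using base R only. It assumes Griliches.csv sits in the working directory and is comma-separated with a header row. If the file is not found, a made-up stand-in data frame is generated so the code still runs; its values are random and say nothing about the real data.

```r
if (file.exists("Griliches.csv")) {
  # Assumption: comma-separated file with a header row
  wages <- read.csv("Griliches.csv")
} else {
  # Hypothetical stand-in with random values, only so the sketch can run
  set.seed(1)
  n <- 200
  wages <- data.frame(
    lw    = rnorm(n, mean = 5.7, sd = 0.4),
    expr  = runif(n, min = 0, max = 10),
    age   = sample(16:30, n, replace = TRUE),
    iq    = round(rnorm(n, mean = 100, sd = 15)),
    other = rnorm(n)  # extra column, to make the selection visible
  )
}

# Keep only the four columns of interest
small <- wages[, c("lw", "expr", "age", "iq")]

# Matrix of scatterplots for every pair of variables
pairs(small)

# Pairwise correlations: the sign shows the direction of each relationship
round(cor(small, use = "complete.obs"), 2)
```

The correlation matrix is not required by the task, but reading its signs next to the scatterplots makes it easier to decide which pairs are related positively and which negatively.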
Run a multiple linear regression model that will show how the wage is affected by people’s age, experience and IQ level.
4.1. Run the model in R and provide the code you used.
4.2. Write the equation of the model using coefficients you obtained in R.
4.3. Which of the variables in the model affect wages significantly if we consider 5% level of significance?
4.4. Interpret the coefficient of expr.
4.5. How does wage change when age increases by one year (on average, all else equal)?
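One possible way to fit the model for items 4.1 to 4.5 is sketched below with lm(). To keep the block runnable on its own, it builds a made-up stand-in for small (the coefficients used to generate it are arbitrary assumptions); with the real data, skip the stand-in and use the small data frame created earlier.

```r
# Hypothetical stand-in for `small`; the numbers are invented for illustration
set.seed(1)
n <- 200
small <- data.frame(
  expr = runif(n, min = 0, max = 10),
  age  = sample(16:30, n, replace = TRUE),
  iq   = round(rnorm(n, mean = 100, sd = 15))
)
small$lw <- 4 + 0.03 * small$expr + 0.02 * small$age +
  0.005 * small$iq + rnorm(n, sd = 0.3)

# Multiple linear regression of log wage on age, experience and IQ
model <- lm(lw ~ age + expr + iq, data = small)
summary(model)

# Estimated coefficients, for writing out the equation
# lw = b0 + b1 * age + b2 * expr + b3 * iq
b <- coef(model)
round(b, 4)

# p-values of the t-tests; terms below 0.05 are significant at the 5% level
p <- summary(model)$coefficients[, "Pr(>|t|)"]
names(p)[p < 0.05]
```

Because the dependent variable is the logarithm of wage, a coefficient b on a predictor is read in relative terms: a one-unit increase in that predictor is associated with a change in wage of approximately 100 * b percent, on average and all else equal (exactly 100 * (exp(b) - 1) percent).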
Report the \(R^2\) of this model. In your opinion, is the quality of this model acceptable?
Plot any graphs for the residuals of this model. Can you conclude that there are non-linear patterns in the residuals (and, hence, in the relationships between the variables)?
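The last two tasks can be approached along the following lines. This is a sketch under the same assumption as before: the data frame here is a made-up stand-in so the block runs by itself, and with the real data the model would be fitted on small instead.

```r
# Hypothetical stand-in data; replace with the real `small` data frame
set.seed(1)
n <- 200
small <- data.frame(
  expr = runif(n, min = 0, max = 10),
  age  = sample(16:30, n, replace = TRUE),
  iq   = round(rnorm(n, mean = 100, sd = 15))
)
small$lw <- 4 + 0.03 * small$expr + 0.02 * small$age +
  0.005 * small$iq + rnorm(n, sd = 0.3)

model <- lm(lw ~ age + expr + iq, data = small)

# Share of the variance of lw explained by the model
r2     <- summary(model)$r.squared
adj_r2 <- summary(model)$adj.r.squared
c(R2 = r2, adjusted_R2 = adj_r2)

# Residuals against fitted values, with a smoothed trend line
plot(fitted(model), resid(model),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
lines(lowess(fitted(model), resid(model)), col = "red")

# Built-in diagnostic plot of the same kind
plot(model, which = 1)
```

If the smoothed line stays close to the horizontal zero line, there is no obvious non-linear pattern; a clear curve suggests that some relationship is not linear, for example that a squared term might be missing.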