Questions: 1) Compare results reported on page 4.14 and 4.18 of the lecture slides, write a few paragraphs to describe the differences. 2) In both models, if you introduce a second independent variable age, what do you think the relationship between age and income might be?
On both slides, the columns are the same: Estimate, Standard Error, t value and Pr. On slide 4.14, the parameters are sigma, beta1 and beta2. Sigma is the standard deviation of income around the regression line, but the numbers for sigma are not the focus of this analysis. The value of -0.65207 for beta1 is the Y-intercept. The Y-intercept is the predicted income when education equals 0, so the regression line crosses the Y-axis at (0, -0.65207). The important part of this analysis is beta2, which is the coefficient on education. Beta2 is the slope of the regression line. The slope shows that for every 1 unit increase in education there is a 0.37613 increase in average income. The standard error shows how precisely each parameter is estimated: the smaller the standard error is compared to the estimate, the less the estimate would be expected to change from sample to sample. It is not the same as the residuals, which are the differences between the regression line and the actual values. Another important part of this analysis is Pr, the p-value, which is used to judge statistical significance. It is the probability of seeing a t value at least this extreme if the true parameter were zero. The lower the Pr value, the stronger the evidence that the parameter is not zero. So the value of 2e-16 for Pr for beta2 means that education is a statistically significant predictor of income. This slide was modeling mu, the average income.
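To make the slope and intercept concrete, here is a minimal sketch that plugs the two estimates from slide 4.14 into a straight line. It assumes the mean model has the form mu = beta1 + beta2 * education, with education and income in whatever units the slides use; the function name is only for illustration.

```python
# Estimates quoted above for slide 4.14.
BETA1 = -0.65207  # Y-intercept: predicted income when education is 0
BETA2 = 0.37613   # slope: change in average income per 1 unit of education


def predicted_mean_income(education):
    """Average income on the fitted line (assumed form: beta1 + beta2 * education)."""
    return BETA1 + BETA2 * education


# Moving from 10 to 11 units of education changes the prediction by the slope.
for education in (0, 10, 11):
    print(education, round(predicted_mean_income(education), 5))
```

Whatever two neighboring education values are compared, the gap between their predictions is always beta2, which is what "for every 1 unit increase" means.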
On slide 4.18, the model was for the standard deviation instead of the average. The parameters are mu, theta1 and theta2. Mu represents the known average, which is used to find the standard deviation. The value of theta1 (1.461011) is the Y-intercept, which is the standard deviation of income when education equals 0. The value of theta2 (0.109081) is the slope, which means that for every 1 unit increase in education the standard deviation of income increases by 0.109081. This increase in standard deviation means that as education increases, incomes are spread further from the average, so the gap between the lowest and highest incomes gets wider. As a result, the highest levels of education have the highest variability in income. A high level of variability means that people with many years of education have a wider range of possible incomes, both high and low, compared to those with lower levels of education.
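The same kind of arithmetic shows how the spread grows with education. This is a sketch that assumes the slide models the standard deviation directly as sigma = theta1 + theta2 * education; if the slides use a different form (for example, a model for the log of sigma), the numbers would need to be read differently. The function name is hypothetical.

```python
# Estimates quoted above for slide 4.18.
THETA1 = 1.461011  # intercept: standard deviation of income when education is 0
THETA2 = 0.109081  # slope: change in standard deviation per 1 unit of education


def predicted_income_sd(education):
    """Standard deviation of income (assumed form: theta1 + theta2 * education)."""
    return THETA1 + THETA2 * education


# Compare the spread at a lower and a higher level of education.
for education in (8, 16):
    print(education, round(predicted_income_sd(education), 6))
```

Under this assumption the standard deviation at 16 units of education is larger than at 8 units by 8 * theta2, which is the widening spread described above.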
If age were added to the two models, the estimates for mu and sigma would change. The two models only look at the relationship between income and education, so when age is introduced, the estimated effect of education could get smaller, because some of the difference in income that was credited to education may really be related to age. The two models were only considering education and its effect on income, but once age is added they will consider the three variables together. The relationship between age and income is probably not the same at every age. It could be positive for younger workers and negative for older ones, because people reach a maximum amount of money they will make and then their income starts to decline as they get older. So there might be an increase in income when people's ages range from 30-50, but after that they might see a decline in their income.
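One simple way to picture an income that rises and then falls with age is an upside-down parabola. The sketch below is only an illustration of that shape: the peak age of 50 follows the guess in the paragraph above, while the peak income and curvature values are made-up numbers, not estimates from the slides or from any data.

```python
# Hypothetical values chosen only to draw the shape; none come from the slides.
PEAK_AGE = 50        # age at which income is assumed to be highest
PEAK_INCOME = 10.0   # made-up income at the peak, in arbitrary units
CURVATURE = 0.01     # made-up rate at which income falls away from the peak


def illustrative_income(age):
    """Upside-down parabola: income rises until PEAK_AGE, then declines."""
    return PEAK_INCOME - CURVATURE * (age - PEAK_AGE) ** 2


# Income goes up from 30 to 50 and comes back down by 70.
for age in (30, 40, 50, 60, 70):
    print(age, round(illustrative_income(age), 2))
```

In a regression, a shape like this is usually captured by including both age and age squared as independent variables, so that the slope for age can change sign.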