The dataset was computed from a sample of 67,248 New Hampshire residents aged 25 to 65. The sample data were obtained from the U.S. Census Bureau's 2012-2016 ACS PUMS data.
Grafton and Coos Counties are on the 3rd line of the dataset. The average years of schooling for residents of these counties is 18.52291 years, and their median income is $30,000. The region variable is 0 because these counties are not in the southeastern region.
Hint: Make sure to interpret the direction and the magnitude of the relationship. In addition, keep in mind that correlation (or regression) coefficients do not show causation but only association.
## [1] 0.8622811
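The correlation between the two variables is 0.86. This is a strong positive relationship: the sign is positive, and the absolute value is greater than 0.6. Counties with more average years of schooling tend to have higher median income, but this shows association only, not causation.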
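As a quick consistency check, in a regression with a single predictor the squared correlation equals the multiple R-squared. The sketch below uses only the correlation printed above (0.8622811); the variable names are illustrative, and the 0.6 cutoff is the rule of thumb used in this report.

```r
# Correlation copied from the output above
r <- 0.8622811

# Direction comes from the sign, magnitude from the absolute value
direction <- if (r > 0) "positive" else "negative"
is_strong <- abs(r) > 0.6   # rule-of-thumb threshold used in this report

# With one predictor, r^2 is the share of variance in income explained
r_squared <- r^2

print(direction)
print(is_strong)
print(round(r_squared, 4))
```

The squared value rounds to 0.7435, which agrees with the Multiple R-squared reported for the one-predictor model.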
##
## Call:
## lm(formula = income_median ~ ed_avg, data = residents_25to65)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3643.9 -2548.6   655.8  1730.7  4150.6
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -201503      49675  -4.056  0.00365 **
## ed_avg         12695       2636   4.816  0.00133 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2891 on 8 degrees of freedom
## Multiple R-squared: 0.7435, Adjusted R-squared: 0.7115
## F-statistic: 23.19 on 1 and 8 DF, p-value: 0.001328
Hint: Discuss your answer in terms of the number of stars in the summary result. Refer to the interpretation section in quiz4_a.
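Yes, it is statistically significant at the 5% level. The ed_avg coefficient has 2 stars, which means its p-value (0.00133) is between 0.001 and 0.01, so it is significant at the 1% level as well. We can reject the null hypothesis that the coefficient is zero. The stars do not mean that we are 99.9% confident the estimate is true.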
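The link between the t value, the p-value, and the stars can be sketched as follows. The numbers are copied from the ed_avg row and the residual degrees of freedom of the first model; the variable names are illustrative, and the star cutoffs follow the "Signif. codes" legend in the output.

```r
# Values copied from the ed_avg row of the first model's summary
estimate  <- 12695
std_error <- 2636
df_resid  <- 8   # residual degrees of freedom

# t statistic and two-sided p-value from the t distribution
t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# Map the p-value to stars using the cutoffs in the "Signif. codes" legend
stars <- as.character(cut(
  p_value,
  breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
  labels = c("***", "**", "*", ".", " ")
))

print(round(t_value, 3))
print(signif(p_value, 3))
print(stars)
```

Because the inputs are the rounded values from the printed table, the result may differ from the summary in the last digit.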
##
## Call:
## lm(formula = income_median ~ ed_avg + region, data = residents_25to65)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2016.2  -778.4  -373.5   353.4  2780.7
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -166192      30700  -5.413 0.000994 ***
## ed_avg         10701       1638   6.532 0.000324 ***
## region          4524       1136   3.981 0.005314 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1711 on 7 degrees of freedom
## Multiple R-squared: 0.9214, Adjusted R-squared: 0.899
## F-statistic: 41.05 on 2 and 7 DF, p-value: 0.0001359
Hint: Discuss your answer by comparing the residual standard error and the adjusted R squared between the two models.
Model 2 is the better fit for the data. Its residual standard error is smaller (1,711 versus 2,891 for model 1), and its adjusted R-squared is larger (0.899 versus 0.7115 for model 1).
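The comparison can be reproduced from the printed summaries alone. This is a minimal sketch: the sample size n = 10 is inferred from the degrees of freedom (8 residual df plus 2 coefficients in model 1), and the data frame and column names are made up for illustration.

```r
# Values copied from the two summary() outputs
n <- 10   # inferred: 8 residual df + 2 estimated coefficients in model 1

models <- data.frame(
  model      = c("model 1", "model 2"),
  predictors = c(1, 2),
  r_squared  = c(0.7435, 0.9214),
  rse        = c(2891, 1711)   # residual standard error
)

# Adjusted R-squared penalizes each additional predictor
models$adj_r_squared <- 1 - (1 - models$r_squared) * (n - 1) /
  (n - models$predictors - 1)

# Prefer the lower residual standard error and the higher adjusted R-squared
best_by_rse <- models$model[which.min(models$rse)]
best_by_adj <- models$model[which.max(models$adj_r_squared)]

print(models)
print(best_by_rse)
print(best_by_adj)
```

Both criteria point to the same model here, so adding region improves the fit even after the penalty for the extra predictor.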
Hint: Note that the second model has two predictors. Use both predictors to compute the predicted income.
-166,192 + 10,701(18.52291) + 4,524(0) = $32,021.66
The second model predicts a median income of about $32,021.66 for Grafton and Coos Counties. The region term drops out because region is 0 for these counties.
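The same calculation can be written out in R with both predictors included. This is a sketch using the rounded coefficients printed in the second model's summary, so the last cent may differ from what `predict()` would return on the fitted model object; the variable names are illustrative.

```r
# Coefficients copied from the second model's summary
b0       <- -166192   # intercept
b_ed     <- 10701     # ed_avg
b_region <- 4524      # region (1 = southeastern, 0 = otherwise)

# Grafton and Coos Counties, values stated earlier in this report
ed_avg <- 18.52291
region <- 0

# Predicted median income uses both predictors
predicted_income <- b0 + b_ed * ed_avg + b_region * region
print(round(predicted_income, 2))
```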
Hint: Discuss your answer based on the coefficient of region. You may refer to the interpretation section in quiz4_a.
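Southeastern counties are predicted to have a higher median income. The coefficient of region is 4,524 and positive, so a county with region = 1 is predicted to have a median income about $4,524 higher than a county with region = 0 (such as Grafton and Coos Counties) that has the same average years of schooling. The coefficient has 2 stars, so it is statistically significant at the 1% level. This is an association, not proof of causation.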
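To illustrate what the region coefficient means, the sketch below compares two otherwise identical counties that differ only in region, and adds a rough 95% confidence interval for the coefficient. It relies on the rounded values printed in the second model's summary; `predict_income` is a hypothetical helper, not part of the original analysis.

```r
# Hypothetical helper built from the printed coefficients of model 2
predict_income <- function(ed_avg, region) {
  -166192 + 10701 * ed_avg + 4524 * region
}

# Same schooling, different region: the gap equals the region coefficient
ed  <- 18.52291
gap <- predict_income(ed, 1) - predict_income(ed, 0)

# Rough 95% confidence interval for the region coefficient
b_region  <- 4524
se_region <- 1136
df_resid  <- 7
ci <- b_region + c(-1, 1) * qt(0.975, df_resid) * se_region

print(gap)
print(round(ci))
```

The interval stays above zero, which is consistent with the small p-value reported for region.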