NLSdata <- read.csv("/Users/dgkamper/Library/Mobile Documents/com~apple~CloudDocs/Axis - HQ/PhD Terms/Classes/Spring 2024/Psych 250C/Problem Sets/Psych 250C Problems Sets/NLSdata.csv")

Question 3



model_lab4 <- lm(Income2008 ~ YearBorn + ASVAB, data = NLSdata)

summary(model_lab4)
## 
## Call:
## lm(formula = Income2008 ~ YearBorn + ASVAB, data = NLSdata)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -44998 -13702  -2832   9390 104029 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.689e+06  4.582e+05   10.23   <2e-16 ***
## YearBorn    -2.355e+03  2.312e+02  -10.19   <2e-16 ***
## ASVAB        1.997e-01  1.120e-02   17.83   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20860 on 4221 degrees of freedom
## Multiple R-squared:  0.09049,    Adjusted R-squared:  0.09006 
## F-statistic:   210 on 2 and 4221 DF,  p-value: < 2.2e-16
library(car)
## Loading required package: carData
# Histogram

hist(residuals(model_lab4), breaks = "Sturges", main = "Histogram of Residuals", xlab = "Residuals")

# QQ plot

qqPlot(model_lab4, main="QQ Plot of Residuals")

## [1]  952 3152

In this QQ plot, the residuals largely follow the reference line in the center of the plot but deviate from it in the tails: the lower tail drops below the line and the upper tail rises above it. This pattern indicates that the residuals have heavier tails than a normal distribution, meaning the data contain more extreme values than would be expected if the residuals were normally distributed. The qqPlot output above labels observations 952 and 3152 as the two most extreme points.
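The heavy-tail reading of a QQ plot can also be checked numerically by comparing tail quantiles of the standardized residuals with the matching normal quantiles. The sketch below is only an illustration: `tail_check` is a hypothetical helper (not part of base R or car), and it runs on simulated t-distributed values rather than the NLS data, so its numbers say nothing about `model_lab4`.

```r
# Hypothetical helper: compare empirical tail quantiles of standardized
# residuals with the quantiles a normal distribution would give.
tail_check <- function(res, probs = c(0.001, 0.01, 0.99, 0.999)) {
  z <- (res - mean(res)) / sd(res)  # standardize to mean 0, sd 1
  data.frame(
    prob      = probs,
    empirical = as.numeric(quantile(z, probs)),
    normal    = qnorm(probs)
  )
}

# Assumption: simulated heavy-tailed values stand in for residuals here.
set.seed(1)
heavy <- rt(5000, df = 3)
out <- tail_check(heavy)
print(out)

# With the fitted model from this lab, the same check would be:
# tail_check(residuals(model_lab4))
```

If the empirical quantiles are more extreme than the normal ones at both ends (more negative in the lower tail, more positive in the upper tail), that matches the pattern described above, where the points fall below the reference line on the left and above it on the right.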