LEVEL-LEVEL
perf1 <- lm(math10 ~ lnchprg, data = meap93)
summary(perf1)
##
## Call:
## lm(formula = math10 ~ lnchprg, data = meap93)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.386 -5.979 -1.207 4.865 45.845
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 32.14271 0.99758 32.221 <2e-16 ***
## lnchprg -0.31886 0.03484 -9.152 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.566 on 406 degrees of freedom
## Multiple R-squared: 0.171, Adjusted R-squared: 0.169
## F-statistic: 83.77 on 1 and 406 DF, p-value: < 2.2e-16
The estimated regression equation is \[
\widehat{math10}= 32.143-0.319\,lnchprg \]
This means that a one percentage point increase in the percentage of
students who are eligible for the lunch program is associated with a
0.319 percentage point decrease in the percentage of 10th graders
receiving a passing score in a standardized math exam, holding other
factors equal.
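As a small worked example of this interpretation, the sketch below plugs the reported estimates into the fitted equation. The coefficients are copied from the `summary(perf1)` output; the 10-point increase, the 25% eligibility rate, and the names `b0`, `b1`, `delta`, `change_math10`, and `pred_25` are illustrative assumptions, not part of the original analysis.

```r
# Estimates copied from the summary(perf1) output
b0 <- 32.14271   # intercept
b1 <- -0.31886   # slope on lnchprg

# Hypothetical change: eligibility rises by 10 percentage points
delta <- 10
change_math10 <- b1 * delta   # predicted change in math10, in percentage points

# Predicted pass rate at a hypothetical eligibility rate of 25%
pred_25 <- b0 + b1 * 25

print(change_math10)
print(pred_25)
```

Because the model is linear in levels, the predicted change is the same 0.319 points per percentage point at every starting value of lnchprg.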
\(R^2=0.171\): the percentage of students who are eligible for the
lunch program explains about 17.1% of the variation in the percentage
of 10th graders receiving a passing score in a standardized math exam.
\(RMSE=9.566\) (the residual standard error): the residuals are
typically about 9.6 percentage points in size. This is not a desirable
value since it is far from 0.
Based on the \(R^2\) and RMSE, the
model does not provide a good fit to the data.
LOG-LEVEL
perf2 <- lm(log(math10)~lnchprg, data = meap93)
summary(perf2)
##
## Call:
## lm(formula = log(math10) ~ lnchprg, data = meap93)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.34067 -0.22219 0.03436 0.27521 1.29532
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.497277 0.046338 75.47 <2e-16 ***
## lnchprg -0.016734 0.001618 -10.34 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4443 on 406 degrees of freedom
## Multiple R-squared: 0.2085, Adjusted R-squared: 0.2065
## F-statistic: 106.9 on 1 and 406 DF, p-value: < 2.2e-16
The estimated regression equation is \[\widehat{\log(math10)}=
3.497-0.017\,lnchprg\]
This means that a one percentage point increase in the percentage of
students who are eligible for the lunch program is predicted to decrease
the percentage of 10th graders receiving a passing score in a
standardized math exam by about 1.7% (a relative change, not percentage
points), holding other factors equal.
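The 1.7% figure comes from the usual approximation that 100 times the slope gives the percent change in a log-level model. A minimal sketch, using the slope copied from the `summary(perf2)` output, compares that approximation with the exact percent change implied by the log specification; the variable names are illustrative.

```r
# Slope copied from the summary(perf2) output
b1 <- -0.016734

# Approximate percent change in math10 for a one-point rise in lnchprg
approx_pct <- 100 * b1

# Exact percent change implied by the log-level form: 100 * (exp(b1) - 1)
exact_pct <- 100 * (exp(b1) - 1)

print(approx_pct)
print(exact_pct)
```

For a slope this small the two values differ only in the second decimal place, so reporting 1.7% is reasonable.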
\(R^2=0.2085\): Thus, the percentage of
students who are eligible for the lunch program explains about 20.85% of
the variation in the log of the percentage of 10th graders receiving a
passing score in a standardized math exam.
\(RMSE=0.4443\): this value is measured in log points, so it cannot be
compared directly with the RMSE of a model for math10 in levels. It is
still not really a desirable value since it is far from 0.
Based on the \(R^2\) and RMSE, the
model does not provide a good fit to the data.
LEVEL-LOG
perf3 <- lm(math10~log(lnchprg), data = meap93)
summary(perf3)
##
## Call:
## lm(formula = math10 ~ log(lnchprg), data = meap93)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.336 -6.253 -1.417 4.724 46.218
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 45.6269 2.2732 20.072 <2e-16 ***
## log(lnchprg) -7.0500 0.7287 -9.675 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.471 on 406 degrees of freedom
## Multiple R-squared: 0.1874, Adjusted R-squared: 0.1854
## F-statistic: 93.6 on 1 and 406 DF, p-value: < 2.2e-16
The estimated regression equation is: \[
\widehat{math10}= 45.627-7.050\log(lnchprg)\]
This means that a 1% increase in the percentage of students who are
eligible for the lunch program is associated with a decrease of about
0.0705 percentage points (7.050/100) in the percentage of 10th graders
receiving a passing score in a standardized math exam, holding other
factors equal.
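The division by 100 is an approximation that works well for small relative changes. The sketch below, using the slope copied from the `summary(perf3)` output, compares it with the exact change implied by the level-log form; the 10% scenario and the variable names are illustrative assumptions.

```r
# Slope copied from the summary(perf3) output
b1 <- -7.0500

# Approximate change in math10 (percentage points) for a 1% rise in lnchprg
approx_change <- b1 / 100

# Exact change: b1 * log(1.01), since log(1.01 * x) - log(x) = log(1.01)
exact_change <- b1 * log(1.01)

# Hypothetical larger change: a 10% rise in lnchprg
change_10pct <- b1 * log(1.10)

print(approx_change)
print(exact_change)
print(change_10pct)
```

For larger relative changes the exact form `b1 * log(1 + p)` is the safer one to use, because the approximation drifts as `p` grows.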
\(R^2=0.1874\): Thus, the log of the
percentage of students who are eligible for the lunch program
explains about 18.74% of the variation in the percentage of 10th graders
receiving a passing score in a standardized math exam.
\(RMSE=9.471\): the residuals are typically about 9.5 percentage points
in size. This is not a desirable value since it is far from 0.
Based on the \(R^2\) and RMSE, the
model does not provide a good fit to the data.
LOG-LOG
perf4 <- lm(log(math10)~log(lnchprg), data = meap93)
summary(perf4)
##
## Call:
## lm(formula = log(math10) ~ log(lnchprg), data = meap93)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.27639 -0.22457 0.03033 0.25315 1.29443
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.08338 0.10842 37.661 <2e-16 ***
## log(lnchprg) -0.33017 0.03476 -9.499 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4518 on 406 degrees of freedom
## Multiple R-squared: 0.1818, Adjusted R-squared: 0.1798
## F-statistic: 90.24 on 1 and 406 DF, p-value: < 2.2e-16
The estimated regression equation is: \[\widehat{\log(math10)}=
4.083-0.330\log(lnchprg)\]
This means that a 1% increase in the percentage of students who are
eligible for the lunch program is associated with a decrease of about
0.33% in the percentage of 10th graders receiving a passing score in a
standardized math exam, holding other factors equal. The slope is an
elasticity.
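In a log-log model the slope is read directly as a percent-for-percent effect. A minimal sketch, using the slope copied from the `summary(perf4)` output, checks that reading against the exact percent change for a 1% increase; the variable names are illustrative.

```r
# Slope copied from the summary(perf4) output
b1 <- -0.33017

# Approximate percent change in math10 for a 1% rise in lnchprg
approx_pct <- b1 * 1

# Exact percent change implied by the log-log form: 100 * (1.01^b1 - 1)
exact_pct <- 100 * (1.01^b1 - 1)

print(approx_pct)
print(exact_pct)
```

The two values agree to about two decimal places, which supports quoting the effect as roughly 0.33%.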
\(R^2=0.1818\): Thus, the log of the
percentage of students who are eligible for the lunch program
explains about 18.18% of the variation in the log of the percentage of
10th graders receiving a passing score in a standardized math exam.
\(RMSE=0.4518\): this value is measured in log points, so it cannot be
compared directly with the RMSE of a model for math10 in levels. It is
still not really a desirable value since it is far from 0.
Based on the \(R^2\) and RMSE, the
model does not provide a good fit to the data.
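# Scatter plot in levels; only the level-level fit is a straight line on these axes
plot(meap93$lnchprg, meap93$math10,
     col = "skyblue",
     pch = 20,
     xlab = "students eligible for lunch program (%)",
     ylab = "10th graders receiving a passing score (%)",
     main = "")
abline(perf1, col = "red", lwd = 2)
# abline() would misdraw the log models here, so draw their back-transformed fitted curves
# (exp() of a log prediction is used as is, without a retransformation correction)
x0 <- min(meap93$lnchprg); x1 <- max(meap93$lnchprg)
b2 <- coef(perf2); b3 <- coef(perf3); b4 <- coef(perf4)
curve(exp(b2[1] + b2[2] * x), from = x0, to = x1, add = TRUE, col = "yellow", lwd = 2)
curve(b3[1] + b3[2] * log(x), from = x0, to = x1, add = TRUE, col = "green", lwd = 2)
curve(exp(b4[1] + b4[2] * log(x)), from = x0, to = x1, add = TRUE, col = "purple", lwd = 2)
# A legend replaces the hand-placed text() labels, whose positions no longer match the curves
legend("topright", legend = c("level-level", "log-level", "level-log", "log-log"),
       col = c("red", "yellow", "green", "purple"), lwd = 2, bty = "n")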