Using the data in GPA2, the following equation was estimated:

sat = 1,028.10 + 19.30 hsize − 2.19 hsize² − 45.09 female − 169.81 black + 62.31 female·black

n = 4,137, R² = .0858.

The variable sat is the combined SAT score; hsize is the size of the student's high school graduating class, in hundreds; female is a gender dummy variable; and black is a race dummy variable equal to one for blacks and zero otherwise.
The standard errors are not reported with this equation, so we cannot assess the statistical significance of the individual coefficients. The R-squared is also quite small (.0858), so the equation explains only a small share of the variation in SAT scores.
To find the optimal class size we take the derivative of 19.30 hsize − 2.19 hsize² with respect to hsize and set it to zero: 19.30 − 2(2.19) hsize = 0, so hsize* = 19.30/4.38 ≈ 4.41, i.e., a graduating class of about 441 students.
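A quick arithmetic check of this optimum in R, using only the coefficients reported above (the object names here are chosen for illustration, no estimation is redone):

b_hsize  <- 19.30
b_hsize2 <- -2.19
# The quadratic b_hsize*hsize + b_hsize2*hsize^2 peaks where its derivative
# b_hsize + 2*b_hsize2*hsize equals zero
hsize_opt <- -b_hsize / (2 * b_hsize2)
hsize_opt          # about 4.41 (hundreds of students)
hsize_opt * 100    # about 441 students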
To compare nonblack females and nonblack males, set female = 1 and black = 0; the interaction term drops out, so the estimated difference is simply the coefficient on female: nonblack females score about 45.09 points less than nonblack males, holding hsize fixed.
To compare black males and nonblack males, set female = 0; the difference is the coefficient on black: black males are estimated to score 169.81 points less than nonblack males, holding hsize fixed.
To compare black females and nonblack females, set female = 1 and take the difference in black: −169.81 + 62.31 = −107.50, so black females are estimated to score about 107.5 points less than nonblack females.
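The same group differences can be read off the reported coefficients directly; a minimal arithmetic sketch (again, only the reported numbers are used):

b_female <- -45.09
b_black  <- -169.81
b_femblk <- 62.31
b_female             # nonblack female vs nonblack male: -45.09
b_black              # black male vs nonblack male: -169.81
b_black + b_femblk   # black female vs nonblack female: -107.50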
(page 232) The model is colGPA = β0 + δ0·PC + β1·hsGPA + β2·ACT + u, and the estimated equation is

colGPA = 1.26 + .157 PC + .447 hsGPA + .0087 ACT
         (.33)  (.057)    (.094)      (.0105)

n = 141, R² = .219.
(Hint: Write PC = 1 − noPC and plug this into the equation colGPA = β̂0 + δ̂0·PC + β̂1·hsGPA + β̂2·ACT.)
Substituting PC = 1 − noPC into the estimated equation gives a new intercept of 1.26 + 0.157 = 1.417 and a coefficient on noPC of −0.157:

colGPA = 1.417 − 0.157 noPC + 0.447 hsGPA + 0.0087 ACT.
Everything else in the estimated equation (the coefficients on hsGPA and ACT, their standard errors, and the R-squared) should remain the same.
No, because PC + noPC = 1 for every observation, so including both along with an intercept would create perfect multicollinearity (the dummy variable trap).
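A sketch of how both points could be checked in R, assuming the gpa1 data from the wooldridge package (which contains colGPA, PC, hsGPA, and ACT):

library(wooldridge)
gpa <- data.frame(gpa1)
gpa$noPC <- 1 - gpa$PC
# Original specification and the noPC reparameterization:
# only the intercept and the sign of the dummy coefficient change
lm(colGPA ~ PC + hsGPA + ACT, data = gpa)
lm(colGPA ~ noPC + hsGPA + ACT, data = gpa)
# Including both PC and noPC: lm() reports an NA coefficient for one of them
# because PC + noPC = 1 is perfectly collinear with the intercept
lm(colGPA ~ PC + noPC + hsGPA + ACT, data = gpa)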
The model is f(z) = β0 + δ0·d + β1·z + δ1·(d·z), which gives two lines: f0(z) = β0 + β1·z when d = 0, and f1(z) = (β0 + δ0) + (β1 + δ1)·z when d = 1.
Setting the two lines equal, β0 + β1·z = (β0 + δ0) + (β1 + δ1)·z, gives the intersection point z* = −δ0/δ1. For this to occur at a positive value of z, δ0 and δ1 must have opposite signs, so that the line that starts lower increases faster than the other.
In the wage equation, women earn as much as men where the female differential is zero: −.357 + .030·totcoll = 0, which gives totcoll = .357/.030 = 11.9 years of total college education.
We can check this by plugging totcoll = 11.9 back into the estimated equation log(wage) = 2.289 − .357 female + .50 totcoll + .030 female·totcoll:
for a woman: 2.289 − 0.357 + (0.50 × 11.9) + (0.030 × 11.9) = 8.239;
for a man: 2.289 + (0.50 × 11.9) = 8.239.
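A minimal numeric check of this intersection point in R, using only the reported coefficients (the object names are chosen here for illustration; no data are needed):

b0        <- 2.289
b_female  <- -0.357
b_totcoll <- 0.50
b_inter   <- 0.030
totcoll_star <- -b_female / b_inter                    # 11.9 years of college
b0 + b_totcoll * totcoll_star                          # predicted log(wage) for men
b0 + b_female + (b_totcoll + b_inter) * totcoll_star   # predicted log(wage) for women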
No: 11.9 years of total college is far above what is practical, and very few people have that much college education. Age also very likely influences wages, and a woman would be quite old by the time she had accumulated that many years of college.
library(wooldridge)   # provides the loanapp data
loan <- data.frame(loanapp)
The sign would be positive: if there is discrimination against nonwhites, white applicants should be more likely to have their loans approved.
fit.1 <- lm(approve ~ white, data = loan )
summary(fit.1)
##
## Call:
## lm(formula = approve ~ white, data = loan)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.90839 0.09161 0.09161 0.09161 0.29221
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.70779 0.01824 38.81 <2e-16 ***
## white 0.20060 0.01984 10.11 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3201 on 1987 degrees of freedom
## Multiple R-squared: 0.04893, Adjusted R-squared: 0.04845
## F-statistic: 102.2 on 1 and 1987 DF, p-value: < 2.2e-16
According to this result, a white applicant is about 20 percentage points more likely to have a loan approved than a nonwhite applicant (a predicted approval probability of roughly .91 versus .71), and the difference is highly statistically significant.
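The two fitted approval probabilities implied by this simple regression can be recovered directly from the model estimated above; a small sketch:

# Predicted approval probability for nonwhite (white = 0) and white (white = 1) applicants
predict(fit.1, newdata = data.frame(white = c(0, 1)))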
fit.2 <- lm(approve ~ white+ hrat+ obrat+ loanprc+ unem+ male+ married+ dep+ sch+ cosign+ chist+ pubrec+ mortlat1+ mortlat2+vr, data = loan )
summary(fit.2)
##
## Call:
## lm(formula = approve ~ white + hrat + obrat + loanprc + unem +
## male + married + dep + sch + cosign + chist + pubrec + mortlat1 +
## mortlat2 + vr, data = loan)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.06482 0.00781 0.06387 0.13673 0.71105
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.936731 0.052735 17.763 < 2e-16 ***
## white 0.128820 0.019732 6.529 8.44e-11 ***
## hrat 0.001833 0.001263 1.451 0.1469
## obrat -0.005432 0.001102 -4.930 8.92e-07 ***
## loanprc -0.147300 0.037516 -3.926 8.92e-05 ***
## unem -0.007299 0.003198 -2.282 0.0226 *
## male -0.004144 0.018864 -0.220 0.8261
## married 0.045824 0.016308 2.810 0.0050 **
## dep -0.006827 0.006701 -1.019 0.3084
## sch 0.001753 0.016650 0.105 0.9162
## cosign 0.009772 0.041139 0.238 0.8123
## chist 0.133027 0.019263 6.906 6.72e-12 ***
## pubrec -0.241927 0.028227 -8.571 < 2e-16 ***
## mortlat1 -0.057251 0.050012 -1.145 0.2525
## mortlat2 -0.113723 0.066984 -1.698 0.0897 .
## vr -0.031441 0.014031 -2.241 0.0252 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3021 on 1955 degrees of freedom
## (18 observations deleted due to missingness)
## Multiple R-squared: 0.1656, Adjusted R-squared: 0.1592
## F-statistic: 25.86 on 15 and 1955 DF, p-value: < 2.2e-16
The coefficient on white becomes smaller (about .13 instead of .20) once the other applicant characteristics are controlled for, but it is still positive and highly statistically significant.
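A quick way to put a confidence interval around this effect, using the multiple regression just estimated:

# 95% confidence interval for the coefficient on white in the full model
confint(fit.2, "white", level = 0.95)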
nba <- data.frame(nbasal)
?nbasal
fit.3 <- lm(points~ exper+expersq+ guard+ forward, data=nba)
summary(fit.3)
##
## Call:
## lm(formula = points ~ exper + expersq + guard + forward, data = nba)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.220 -4.268 -1.003 3.444 22.265
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.76076 1.17862 4.039 7.03e-05 ***
## exper 1.28067 0.32853 3.898 0.000123 ***
## expersq -0.07184 0.02407 -2.985 0.003106 **
## guard 2.31469 1.00036 2.314 0.021444 *
## forward 1.54457 1.00226 1.541 0.124492
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.668 on 264 degrees of freedom
## Multiple R-squared: 0.09098, Adjusted R-squared: 0.07721
## F-statistic: 6.606 on 4 and 264 DF, p-value: 4.426e-05
No, because guard + forward + center = 1 for every player, so including a center dummy along with guard, forward, and an intercept would create perfect multicollinearity; center is the omitted base category.
Holding experience fixed, a guard is estimated to score about 2.31 points more per game than a center (the base group).
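A sketch of the collinearity point, assuming nbasal also contains a center dummy: lm() reports an NA coefficient for whichever position dummy is redundant.

# Adding center on top of guard and forward: one dummy is dropped (NA coefficient)
# because guard + forward + center = 1 is collinear with the intercept
fit.trap <- lm(points ~ exper + expersq + guard + forward + center, data = nba)
summary(fit.trap)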
fit.4 <- lm(points~ exper+expersq+ guard+ forward+marr, data=nba)
summary(fit.4)
##
## Call:
## lm(formula = points ~ exper + expersq + guard + forward + marr,
## data = nba)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.874 -4.227 -1.251 3.631 22.412
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.70294 1.18174 3.980 8.93e-05 ***
## exper 1.23326 0.33421 3.690 0.000273 ***
## expersq -0.07037 0.02416 -2.913 0.003892 **
## guard 2.28632 1.00172 2.282 0.023265 *
## forward 1.54091 1.00298 1.536 0.125660
## marr 0.58427 0.74040 0.789 0.430751
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.672 on 263 degrees of freedom
## Multiple R-squared: 0.09313, Adjusted R-squared: 0.07588
## F-statistic: 5.401 on 5 and 263 DF, p-value: 9.526e-05
Not really: the coefficient on marr is small (about 0.58 points) and not statistically significant (t ≈ 0.79), so there is no evidence that marital status affects points scored once position and experience are controlled for.
fit.4 <- lm(points~ exper+expersq+ guard+ forward+marr+I(marr*exper)+ I(marr*expersq), data=nba)
summary(fit.4)
##
## Call:
## lm(formula = points ~ exper + expersq + guard + forward + marr +
## I(marr * exper) + I(marr * expersq), data = nba)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.239 -4.328 -1.067 3.742 22.197
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.81615 1.34878 4.312 2.29e-05 ***
## exper 0.70255 0.43405 1.619 0.1067
## expersq -0.02950 0.03267 -0.903 0.3674
## guard 2.25079 1.00002 2.251 0.0252 *
## forward 1.62915 1.00199 1.626 0.1052
## marr -2.53750 2.03822 -1.245 0.2143
## I(marr * exper) 1.27965 0.68229 1.876 0.0618 .
## I(marr * expersq) -0.09359 0.04887 -1.915 0.0566 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.654 on 261 degrees of freedom
## Multiple R-squared: 0.1058, Adjusted R-squared: 0.08184
## F-statistic: 4.413 on 7 and 261 DF, p-value: 0.0001188
The interaction terms marr·exper and marr·expersq are not significant at the 5% level (p-values of about .06), so there is at best weak evidence that marital status changes the effect of experience on scoring.
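The marriage terms can also be tested jointly; a sketch using the nested models estimated above (fit.3 without any marriage variables, fit.4 with marr and its interactions):

# Joint F test of marr, marr*exper and marr*expersq
anova(fit.3, fit.4)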
fit.5 <- lm(assists~ exper+expersq+ guard+ forward+marr+I(marr*exper)+ I(marr*expersq), data=nba)
summary(fit.5)
##
## Call:
## lm(formula = assists ~ exper + expersq + guard + forward + marr +
## I(marr * exper) + I(marr * expersq), data = nba)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.2472 -1.1361 -0.2986 0.7388 8.3042
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.129347 0.406562 -0.318 0.75063
## exper 0.368883 0.130834 2.819 0.00518 **
## expersq -0.018658 0.009848 -1.895 0.05925 .
## guard 2.499510 0.301436 8.292 5.95e-15 ***
## forward 0.448880 0.302028 1.486 0.13843
## marr 0.081164 0.614377 0.132 0.89500
## I(marr * exper) 0.163866 0.205662 0.797 0.42631
## I(marr * expersq) -0.016172 0.014731 -1.098 0.27328
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.704 on 261 degrees of freedom
## Multiple R-squared: 0.3543, Adjusted R-squared: 0.3369
## F-statistic: 20.45 on 7 and 261 DF, p-value: < 2.2e-16
No: in the assists equation, marr and its interactions with experience are all individually insignificant, so there is no evidence of a marriage effect on assists either.