library(alr4)
## Loading required package: car
## Loading required package: carData
## Loading required package: effects
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
library(smss)
y = -10536 + (53.8*1240) + (2.84*18000)
y
## [1] 107296
145000 - y
## [1] 37704
B. Holding x2 fixed, each one-unit increase in x1 raises the predicted value by 53.8, the coefficient of x1.
53.8/2.84
## [1] 18.94366
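The arithmetic for this problem can be collected into one small function. This is only a sketch: the name `predict_y` and the variable names are hypothetical, and the coefficients are copied from the prediction equation used above.

```r
# Hypothetical helper; coefficients copied from the equation used above.
predict_y <- function(x1, x2) {
  -10536 + 53.8 * x1 + 2.84 * x2
}

y_hat    <- predict_y(1240, 18000)  # predicted value
residual <- 145000 - y_hat          # observed minus predicted

# Effect of one more unit of x1 with x2 held fixed: equals the x1 coefficient
effect_x1 <- predict_y(1241, 18000) - predict_y(1240, 18000)

# Units of x2 needed to match a one-unit increase in x1
x2_equiv <- 53.8 / 2.84
```

Because the model is linear with no interaction, `effect_x1` is the same at any fixed value of x2.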
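2A. The Welch two-sample t-test shows a difference in sample mean salary between men and women (24696.79 vs. 21357.14), but the p-value (0.09009) is larger than .05, so we fail to reject the null hypothesis of no difference in mean salary at the 5% level. The 95% confidence interval for the difference includes zero.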
t.test(salary ~ sex, data = salary)
##
## Welch Two Sample t-test
##
## data: salary by sex
## t = 1.7744, df = 21.591, p-value = 0.09009
## alternative hypothesis: true difference in means between group Male and group Female is not equal to 0
## 95 percent confidence interval:
## -567.8539 7247.1471
## sample estimates:
## mean in group Male mean in group Female
## 24696.79 21357.14
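As a check on the reading of this output, the two-sided p-value and the confidence interval can be rebuilt from the printed t statistic, degrees of freedom, and group means. This is a sketch using rounded values copied from the output, so the results only match approximately; the variable names are hypothetical.

```r
# Values copied (rounded) from the t.test output above
t_stat    <- 1.7744
df_welch  <- 21.591
mean_diff <- 24696.79 - 21357.14   # Male minus Female

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df_welch)
reject_at_05 <- p_value < 0.05

# Standard error implied by t = mean_diff / se, then the 95% interval
se_diff <- mean_diff / t_stat
ci <- mean_diff + c(-1, 1) * qt(0.975, df_welch) * se_diff
```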
2B & C. Sex has a large p-value (0.214), so it is not statistically significant; the 95% confidence interval for sexFemale (-697.82 to 3030.56) includes zero. The same is true for degree (p = 0.180). rankAssoc and rankProf have p-values small enough to reject the null: holding the other variables fixed, Assoc is associated with a salary 5292.36 higher than the reference rank (Asst), and Prof with a salary 11118.76 higher. year has a small p-value, so we can reject the null; each additional year is associated with an expected increase of 476.31. ysdeg has a p-value (0.115) that is too large to reject the null.
reg<-lm(salary ~ sex + degree + rank + year + ysdeg, data = salary)
confint(reg)
## 2.5 % 97.5 %
## (Intercept) 14134.4059 17357.68946
## sexFemale -697.8183 3030.56452
## degreePhD -663.2482 3440.47485
## rankAssoc 2985.4107 7599.31080
## rankProf 8396.1546 13841.37340
## year 285.1433 667.47476
## ysdeg -280.6397 31.49105
summary(reg)
##
## Call:
## lm(formula = salary ~ sex + degree + rank + year + ysdeg, data = salary)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4045.2 -1094.7 -361.5 813.2 9193.1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15746.05 800.18 19.678 < 2e-16 ***
## sexFemale 1166.37 925.57 1.260 0.214
## degreePhD 1388.61 1018.75 1.363 0.180
## rankAssoc 5292.36 1145.40 4.621 3.22e-05 ***
## rankProf 11118.76 1351.77 8.225 1.62e-10 ***
## year 476.31 94.91 5.018 8.65e-06 ***
## ysdeg -124.57 77.49 -1.608 0.115
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2398 on 45 degrees of freedom
## Multiple R-squared: 0.855, Adjusted R-squared: 0.8357
## F-statistic: 44.24 on 6 and 45 DF, p-value: < 2.2e-16
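The confint() interval for sexFemale can be reproduced by hand from the estimate and standard error in the summary table. A minimal sketch, assuming the rounded values printed above (so the match is approximate); the variable names are hypothetical.

```r
# Values copied from the sexFemale row of the summary output
est      <- 1166.37
se       <- 925.57
df_resid <- 45                      # residual degrees of freedom

# 95% interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df_resid)
ci_sex <- est + c(-1, 1) * t_crit * se

# An interval that contains zero agrees with the non-significant p-value
contains_zero <- ci_sex[1] < 0 && ci_sex[2] > 0
```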
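2D. Releveling changes the reference category for rank to Assoc, so the rank coefficients are now measured relative to Assoc rather than Asst. rankAsst is -5292.36 (Asst is predicted to earn 5292.36 less than Assoc) and rankProf is 5826.40 (Prof is predicted to earn 5826.40 more than Assoc). The other coefficients, the residuals, and the R-squared are unchanged.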
salary$rank <- relevel(salary$rank, ref ='Assoc')
reg2<-lm(salary ~ sex + degree + rank + year + ysdeg, data = salary)
summary(reg2)
##
## Call:
## lm(formula = salary ~ sex + degree + rank + year + ysdeg, data = salary)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4045.2 -1094.7 -361.5 813.2 9193.1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 21038.41 1109.12 18.969 < 2e-16 ***
## sexFemale 1166.37 925.57 1.260 0.214
## degreePhD 1388.61 1018.75 1.363 0.180
## rankAsst -5292.36 1145.40 -4.621 3.22e-05 ***
## rankProf 5826.40 1012.93 5.752 7.28e-07 ***
## year 476.31 94.91 5.018 8.65e-06 ***
## ysdeg -124.57 77.49 -1.608 0.115
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2398 on 45 degrees of freedom
## Multiple R-squared: 0.855, Adjusted R-squared: 0.8357
## F-statistic: 44.24 on 6 and 45 DF, p-value: < 2.2e-16
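The releveled coefficients follow directly from the first model, which shows that only the baseline changed and not the fit. A short sketch with values copied from the two summaries; the variable names are hypothetical.

```r
# Coefficients from the first model (reference rank: Asst)
intercept_asst  <- 15746.05
b_assoc_vs_asst <- 5292.36
b_prof_vs_asst  <- 11118.76

# Implied coefficients when Assoc is the reference rank
intercept_assoc <- intercept_asst + b_assoc_vs_asst   # new intercept
b_asst_vs_assoc <- -b_assoc_vs_asst                   # sign flips
b_prof_vs_assoc <- b_prof_vs_asst - b_assoc_vs_asst   # Prof minus Assoc
```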
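2E. Dropping rank makes year, degree, and ysdeg significant at the .05 level; sex remains not significant (p = 0.332). The signs also change: sexFemale (-1286.54) and degreePhD (-3299.35) are now negative, and ysdeg (339.40) is now positive. The fit is worse, with multiple R-squared falling from 0.855 to 0.6312.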
norank<-lm(salary ~ sex + year + degree + ysdeg, data = salary)
summary(norank)
##
## Call:
## lm(formula = salary ~ sex + year + degree + ysdeg, data = salary)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8146.9 -2186.9 -491.5 2279.1 11186.6
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17183.57 1147.94 14.969 < 2e-16 ***
## sexFemale -1286.54 1313.09 -0.980 0.332209
## year 351.97 142.48 2.470 0.017185 *
## degreePhD -3299.35 1302.52 -2.533 0.014704 *
## ysdeg 339.40 80.62 4.210 0.000114 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3744 on 47 degrees of freedom
## Multiple R-squared: 0.6312, Adjusted R-squared: 0.5998
## F-statistic: 20.11 on 4 and 47 DF, p-value: 1.048e-09
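To compare the fit with and without rank, the adjusted R-squared values can be recomputed from the multiple R-squared, the sample size, and the number of predictors. This is a sketch; `adj_r2` is a hypothetical helper, and n = 52 is inferred from the residual degrees of freedom plus the number of estimated coefficients (45 + 7).

```r
# Adjusted R-squared for a model with p predictors (excluding the intercept)
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n <- 52                                # 45 residual df + 7 coefficients

adj_full   <- adj_r2(0.855,  n, 6)     # model with rank
adj_norank <- adj_r2(0.6312, n, 4)     # model without rank
```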
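2F. The coefficient for hire is negative (-2898.0): holding rank, sex, and degree fixed, faculty with hire = 1 (year <= 15) are predicted to earn about 2898 less. The p-value (0.0542) is just above .05, so the effect is not significant at the 5% level.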
salary$hire<-ifelse(salary$year <= 15, 1, 0)
hired<-lm(salary ~ hire + rank + sex + degree, data = salary)
summary(hired)
##
## Call:
## lm(formula = salary ~ hire + rank + sex + degree, data = salary)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5337.6 -1742.8 -509.1 1412.2 9552.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25617.3 1803.3 14.205 < 2e-16 ***
## hire -2898.0 1467.0 -1.975 0.0542 .
## rankAsst -5080.4 1155.5 -4.397 6.44e-05 ***
## rankProf 6251.6 1130.4 5.530 1.46e-06 ***
## sexFemale -478.6 969.2 -0.494 0.6238
## degreePhD 816.7 921.9 0.886 0.3803
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2905 on 46 degrees of freedom
## Multiple R-squared: 0.7827, Adjusted R-squared: 0.759
## F-statistic: 33.13 on 5 and 46 DF, p-value: 3.589e-14
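3A. Both variables have p-values low enough to reject the null. Holding New fixed, each additional unit of Size is associated with an increase of about 116.13 in predicted price; holding Size fixed, a new house is predicted to sell for about 57736 more than one that is not new.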
data("house.selling.price")
house<-lm(Price ~ Size + New, data = house.selling.price)
summary(house)
##
## Call:
## lm(formula = Price ~ Size + New, data = house.selling.price)
##
## Residuals:
## Min 1Q Median 3Q Max
## -205102 -34374 -5778 18929 163866
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -40230.867 14696.140 -2.738 0.00737 **
## Size 116.132 8.795 13.204 < 2e-16 ***
## New 57736.283 18653.041 3.095 0.00257 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 53880 on 97 degrees of freedom
## Multiple R-squared: 0.7226, Adjusted R-squared: 0.7169
## F-statistic: 126.3 on 2 and 97 DF, p-value: < 2.2e-16
ŷ = -40230.867 + 116.132x1 + 57736.283x2 is the prediction equation based on the output from the lm model above, where x1 is the size of the house and x2 is an indicator equal to 1 if the house is new and 0 otherwise.
new <- -40230.867 + (116.132 * 3000) + (57736.283)
new
## [1] 365901.4
old <- -40230.867 + (116.132 * 3000)
old
## [1] 308165.1
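The two predictions above can also be written as one function of size and the new/old indicator. A minimal sketch with coefficients copied from the lm output; the name `predict_price` is hypothetical.

```r
# Hypothetical helper for the additive model Price ~ Size + New
predict_price <- function(size, new) {
  -40230.867 + 116.132 * size + 57736.283 * new
}

new_3000 <- predict_price(3000, 1)   # new house, size 3000
old_3000 <- predict_price(3000, 0)   # house that is not new, size 3000

# Without an interaction term the new/old gap is the New coefficient
# at every size
gap_3000 <- new_3000 - old_3000
gap_1500 <- predict_price(1500, 1) - predict_price(1500, 0)
```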
interaction<-lm(Price ~ Size + New + Size*New, data = house.selling.price)
summary(interaction)
##
## Call:
## lm(formula = Price ~ Size + New + Size * New, data = house.selling.price)
##
## Residuals:
## Min 1Q Median 3Q Max
## -175748 -28979 -6260 14693 192519
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -22227.808 15521.110 -1.432 0.15536
## Size 104.438 9.424 11.082 < 2e-16 ***
## New -78527.502 51007.642 -1.540 0.12697
## Size:New 61.916 21.686 2.855 0.00527 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 52000 on 96 degrees of freedom
## Multiple R-squared: 0.7443, Adjusted R-squared: 0.7363
## F-statistic: 93.15 on 3 and 96 DF, p-value: < 2.2e-16
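With x1 = Size and x2 = New, the fitted model is ŷ = -22227.808 + 104.438x1 - 78527.502x2 + 61.916x1x2. For new homes (x2 = 1) this is ŷ = -100755.31 + 166.354x1; for homes that are not new (x2 = 0) it is ŷ = -22227.808 + 104.438x1.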
new1 <- -22227.808 + (104.438 * 3000) - (78527.502 * 1) + (61.916 * 3000 * 1)
new1
## [1] 398306.7
old1 <- -22227.808 + (104.438 * 3000)
old1
## [1] 291086.2
new2 <- -22227.808 + (104.438 * 1500) - (78527.502 * 1) + (61.916 * 1500 * 1)
new2
## [1] 148775.7
old2 <- -22227.808 + (104.438 * 1500)
old2
## [1] 134429.2
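With the interaction term, the gap between new and not-new homes depends on size. The sketch below writes the fitted interaction model as a function and computes that gap directly; `predict_price_int` and `gap_at` are hypothetical names, and the coefficients are copied from the summary output.

```r
# Hypothetical helper for the model Price ~ Size + New + Size:New
predict_price_int <- function(size, new) {
  -22227.808 + 104.438 * size - 78527.502 * new + 61.916 * size * new
}

# Predicted difference between a new and a not-new home of the same size;
# algebraically this is -78527.502 + 61.916 * size
gap_at <- function(size) {
  predict_price_int(size, 1) - predict_price_int(size, 0)
}

gap_3000 <- gap_at(3000)
gap_1500 <- gap_at(1500)
```

The gap grows by 61.916 for each additional unit of size, which is what the significant Size:New coefficient describes.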