library(alr4)
## Loading required package: car
## Loading required package: carData
## Loading required package: effects
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
library(smss)
y = -10536 + (53.8*1240) + (2.84*18000)
y
## [1] 107296
  1. I plugged the values into the given equation to get the predicted value of 107,296. Subtracting this prediction from the actual selling price of 145,000 gives a residual of 37,704.
145000 - y
## [1] 37704

B. The predicted value would increase by the coefficient of x1 (53.8), since x1 is the changing variable while the other variable is held fixed.

53.8/2.84
## [1] 18.94366
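The ratio above says that, under the given equation, x2 would have to increase by roughly 18.9 units to raise the predicted price as much as a one-unit increase in x1. A minimal sketch of that logic, using a small helper function (pred is just an illustrative name, not part of the assignment):

pred <- function(x1, x2) -10536 + (53.8 * x1) + (2.84 * x2)
# raising x1 by one unit while holding x2 fixed moves the prediction by 53.8
pred(1241, 18000) - pred(1240, 18000)
# raising x2 by 53.8/2.84 (about 18.94) units has the same effect
pred(1240, 18000 + 53.8 / 2.84) - pred(1240, 18000)

Both differences should come out to about 53.8.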

2A. Looking at the result of a t-test, the sample mean salaries do differ (24,696.79 for men versus 21,357.14 for women), but the p-value of 0.09 is larger than .05, so there is not enough evidence to reject the null hypothesis of no difference in mean salary.

t.test(salary ~ sex, data = salary)
## 
##  Welch Two Sample t-test
## 
## data:  salary by sex
## t = 1.7744, df = 21.591, p-value = 0.09009
## alternative hypothesis: true difference in means between group Male and group Female is not equal to 0
## 95 percent confidence interval:
##  -567.8539 7247.1471
## sample estimates:
##   mean in group Male mean in group Female 
##             24696.79             21357.14
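As a supplementary sketch, the same conclusion can be read off the stored test object: the p-value sits above .05 and the 95 percent confidence interval for the difference in means spans zero.

tt <- t.test(salary ~ sex, data = salary)
tt$p.value   # about 0.09, above the .05 cutoff
tt$conf.int  # roughly -567.85 to 7247.15, which includes zero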

2B & C. Sex has a large p-value and is therefore not statistically significant; its 95 percent confidence interval (-697.82 to 3,030.56) includes zero, which is consistent with that. The same is true for degree. RankAssoc and rankProf have p-values small enough to reject the null: being an associate professor corresponds to a salary increase of 5,292.36 and being a full professor to an increase of 11,118.76 dollars, relative to assistant professors. Year has a small p-value, so we can reject the null, with an expected increase of 476.31 per year. Ysdeg has a p-value that is too large to reject the null.

reg<-lm(salary ~ sex + degree + rank + year + ysdeg, data = salary)
confint(reg)
##                  2.5 %      97.5 %
## (Intercept) 14134.4059 17357.68946
## sexFemale    -697.8183  3030.56452
## degreePhD    -663.2482  3440.47485
## rankAssoc    2985.4107  7599.31080
## rankProf     8396.1546 13841.37340
## year          285.1433   667.47476
## ysdeg        -280.6397    31.49105
summary(reg)
## 
## Call:
## lm(formula = salary ~ sex + degree + rank + year + ysdeg, data = salary)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4045.2 -1094.7  -361.5   813.2  9193.1 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 15746.05     800.18  19.678  < 2e-16 ***
## sexFemale    1166.37     925.57   1.260    0.214    
## degreePhD    1388.61    1018.75   1.363    0.180    
## rankAssoc    5292.36    1145.40   4.621 3.22e-05 ***
## rankProf    11118.76    1351.77   8.225 1.62e-10 ***
## year          476.31      94.91   5.018 8.65e-06 ***
## ysdeg        -124.57      77.49  -1.608    0.115    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2398 on 45 degrees of freedom
## Multiple R-squared:  0.855,  Adjusted R-squared:  0.8357 
## F-statistic: 44.24 on 6 and 45 DF,  p-value: < 2.2e-16
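As a convenience, the terms called significant above can be pulled straight from the coefficient table that summary() returns; this is only a sketch filtering on the reported p-value column.

coefs <- summary(reg)$coefficients
# keep only the rows with p-values below .05 (intercept, rank, year)
coefs[coefs[, "Pr(>|t|)"] < 0.05, ]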

2D. Releveling so that associate professor is the reference category changes the rank coefficients: rankAsst now shows a decrease of 5,292.36 relative to associates, and rankProf an increase of 5,826.40, while the remaining coefficients are unchanged.

salary$rank <- relevel(salary$rank, ref ='Assoc')
reg2<-lm(salary ~ sex + degree + rank + year + ysdeg, data = salary)
summary(reg2)
## 
## Call:
## lm(formula = salary ~ sex + degree + rank + year + ysdeg, data = salary)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4045.2 -1094.7  -361.5   813.2  9193.1 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 21038.41    1109.12  18.969  < 2e-16 ***
## sexFemale    1166.37     925.57   1.260    0.214    
## degreePhD    1388.61    1018.75   1.363    0.180    
## rankAsst    -5292.36    1145.40  -4.621 3.22e-05 ***
## rankProf     5826.40    1012.93   5.752 7.28e-07 ***
## year          476.31      94.91   5.018 8.65e-06 ***
## ysdeg        -124.57      77.49  -1.608    0.115    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2398 on 45 degrees of freedom
## Multiple R-squared:  0.855,  Adjusted R-squared:  0.8357 
## F-statistic: 44.24 on 6 and 45 DF,  p-value: < 2.2e-16
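Releveling only swaps which rank serves as the baseline; it does not change the fitted model. A quick sanity sketch is to confirm that the two fits give identical fitted values.

# same model under a different parameterization: fitted values should agree
all.equal(fitted(reg), fitted(reg2))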

2E. Excluding rank makes every remaining variable statistically significant except sex. The coefficient estimates shift noticeably relative to the previous models, with both degree and sex now negative.

norank<-lm(salary ~ sex + year + degree + ysdeg, data = salary)
summary(norank)
## 
## Call:
## lm(formula = salary ~ sex + year + degree + ysdeg, data = salary)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8146.9 -2186.9  -491.5  2279.1 11186.6 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 17183.57    1147.94  14.969  < 2e-16 ***
## sexFemale   -1286.54    1313.09  -0.980 0.332209    
## year          351.97     142.48   2.470 0.017185 *  
## degreePhD   -3299.35    1302.52  -2.533 0.014704 *  
## ysdeg         339.40      80.62   4.210 0.000114 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3744 on 47 degrees of freedom
## Multiple R-squared:  0.6312, Adjusted R-squared:  0.5998 
## F-statistic: 20.11 on 4 and 47 DF,  p-value: 1.048e-09
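Because the model without rank is nested inside the full model, an optional follow-up (not part of the original output) is an F-test comparing the two fits, which shows how much explanatory power rank carries.

# nested model comparison: does adding rank significantly improve the fit?
anova(norank, reg2)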

2F. The hire dummy has a negative coefficient (-2,898.0), with a p-value of 0.054 that falls just short of significance at the .05 level.

salary$hire<-ifelse(salary$year <= 15, 1, 0)
hired<-lm(salary ~ hire + rank + sex + degree, data = salary)
summary(hired)
## 
## Call:
## lm(formula = salary ~ hire + rank + sex + degree, data = salary)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5337.6 -1742.8  -509.1  1412.2  9552.7 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  25617.3     1803.3  14.205  < 2e-16 ***
## hire         -2898.0     1467.0  -1.975   0.0542 .  
## rankAsst     -5080.4     1155.5  -4.397 6.44e-05 ***
## rankProf      6251.6     1130.4   5.530 1.46e-06 ***
## sexFemale     -478.6      969.2  -0.494   0.6238    
## degreePhD      816.7      921.9   0.886   0.3803    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2905 on 46 degrees of freedom
## Multiple R-squared:  0.7827, Adjusted R-squared:  0.759 
## F-statistic: 33.13 on 5 and 46 DF,  p-value: 3.589e-14

3A. Both variables have p-values low enough to reject the null. Size increases the predicted price by about 116 dollars per square foot, and New adds an expected premium of about 57,736 dollars.

data("house.selling.price")
house<-lm(Price ~ Size + New, data = house.selling.price)
summary(house)
## 
## Call:
## lm(formula = Price ~ Size + New, data = house.selling.price)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -205102  -34374   -5778   18929  163866 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -40230.867  14696.140  -2.738  0.00737 ** 
## Size           116.132      8.795  13.204  < 2e-16 ***
## New          57736.283  18653.041   3.095  0.00257 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 53880 on 97 degrees of freedom
## Multiple R-squared:  0.7226, Adjusted R-squared:  0.7169 
## F-statistic: 126.3 on 2 and 97 DF,  p-value: < 2.2e-16
  1. ŷ = -40230.867 + 116.132x1 + 57736.283x2 is the prediction equation based on the output from the lm model above, where x1 is the size of the house and x2 indicates whether the house is new (1) or not (0).

new<--40230.867 + (116.132 * 3000) + (57736.283)
new
## [1] 365901.4
old<- -40230.867 + (116.132 * 3000)
old
## [1] 308165.1
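The hand calculations above can also be reproduced with predict(); since this model has no interaction term, the gap between a new and an old house equals the New coefficient (about 57,736 dollars) at any size. A small sketch, assuming New is coded 0/1 as in the calculations above:

# predicted prices for a 3000 square foot house, new (1) versus not new (0)
predict(house, newdata = data.frame(Size = 3000, New = c(1, 0)))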
  1. Size and the Size:New interaction have p-values low enough to reject the null, while the main effect of New on its own does not (p = 0.127). In this model, size adds about 104 dollars per square foot, and the interaction adds roughly another 61.9 dollars per square foot for new homes.
interaction<-lm(Price ~ Size + New + Size*New, data = house.selling.price)
summary(interaction)
## 
## Call:
## lm(formula = Price ~ Size + New + Size * New, data = house.selling.price)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -175748  -28979   -6260   14693  192519 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -22227.808  15521.110  -1.432  0.15536    
## Size           104.438      9.424  11.082  < 2e-16 ***
## New         -78527.502  51007.642  -1.540  0.12697    
## Size:New        61.916     21.686   2.855  0.00527 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 52000 on 96 degrees of freedom
## Multiple R-squared:  0.7443, Adjusted R-squared:  0.7363 
## F-statistic: 93.15 on 3 and 96 DF,  p-value: < 2.2e-16
  1. For new homes (New = 1), the prediction equation is ŷ = -22227.8 + 104.438x - 78527.5 + 61.916x = -100755.3 + 166.354x; for old homes (New = 0) it is ŷ = -22227.8 + 104.438x.

new1<- -22227.808 + (104.438 * 3000) - (78527.502 * 1) + (61.916 * 3000 * 1)
new1
## [1] 398306.7
old1<--22227.808 + (104.438 * 3000)
old1
## [1] 291086.2
  1. For a 1,500 square foot house, the predicted price changes by roughly 14,346 dollars between new and old homes.
new2<--22227.808 + (104.438 * 1500) - (78527.502 * 1) + (61.916 * 1500 * 1)
new2
## [1] 148775.7
old2<--22227.808 + (104.438 * 1500)
old2
## [1] 134429.2
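The same four predictions can be obtained directly from the fitted interaction model, which avoids the hand arithmetic above; the grid below simply pairs the two sizes with the new/old indicator.

grid <- expand.grid(Size = c(3000, 1500), New = c(1, 0))
cbind(grid, predicted = predict(interaction, newdata = grid))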
  1. The interaction model shows a significant p-value for the Size:New term and a higher R-squared (0.744 versus 0.723), which gives more confidence that these variables have an impact on price, with the effect of size depending on whether the house is new.