Problem 25

  1. Much stronger, since the human life expectancy point has high leverage and is influential.

  2. Yes, humans do not fit the rest of the data, because they have developed ways to artificially extend their lives compared with other animals, which skews the data.
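
As a rough illustration of how one high-leverage point can strengthen an association, the sketch below uses made-up toy values, not the textbook data. The names `x`, `y`, `x_all`, and `y_all` are hypothetical.

```r
# Hypothetical toy data (NOT the textbook data): ten points with a weak association
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(5, 3, 6, 4, 7, 3, 6, 5, 4, 6)
r_without <- cor(x, y)

# Add one point far from the others in x that lies along an upward trend
x_all <- c(x, 40)
y_all <- c(y, 20)
r_with <- cor(x_all, y_all)

# Leverage (hat values) for each point; the added point is observation 11
fit_all <- lm(y_all ~ x_all)
leverage <- hatvalues(fit_all)

print(c(without = r_without, with = r_with))
print(which.max(leverage))
```

In this toy example the correlation is much larger once the extra point is included, and that point has the largest hat value, which is the pattern described for humans in the answer above.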

Problem 27

  1. Hippos, since they are a relative outlier. Elephants increase the association since they confirm the apparent trend.

  2. The slope would increase.

  3. No, it is important to keep the model honest. If we remove too many data points just because we want a stronger model, we are manipulating the data.

  4. No, they were not influential, since they confirmed the existing trend.

Problem 33

## enter the data: two-variable quantitative data
YearSince1900 = c(14,18,22,26,30,34,38,42,46,50,54,58,62,66,70,74,78,82,86,90,94,98,102,106)
CPI = c(10.0,15.1,16.8,17.7,16.7,13.4,14.1,16.3,19.5,24.1,26.9,28.9,30.2,32.4,38.8,49.3,65.2,96.5,109.6,130.7,148.2,163.0,179.9,201.6)

## make the scatterplot
plot(YearSince1900, CPI, col = "purple", type ='p', pch = 16)

## calculate the linear regression model
lm.r = lm(CPI~YearSince1900)

## add the regression line to the scatterplot
abline(lm.r, col = "dark green")

## state the correlation coefficient
cor(YearSince1900, CPI)
## [1] 0.8832971
## summary provides lots of information
summary(lm.r)
## 
## Call:
## lm(formula = CPI ~ YearSince1900)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -41.228 -24.142  -2.986  23.928  53.208 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -52.9032    14.1999  -3.726  0.00117 ** 
## YearSince1900   1.8990     0.2149   8.837 1.09e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 29.15 on 22 degrees of freedom
## Multiple R-squared:  0.7802, Adjusted R-squared:  0.7702 
## F-statistic:  78.1 on 1 and 22 DF,  p-value: 1.088e-08
## look at the residuals
resid(lm.r)
##           1           2           3           4           5           6 
##  36.3170000  33.8209565  27.9249130  21.2288696  12.6328261   1.7367826 
##           7           8           9          10          11          12 
##  -5.1592609 -10.5553043 -14.9513478 -17.9473913 -22.7434348 -28.3394783 
##          13          14          15          16          17          18 
## -34.6355217 -40.0315652 -41.2276087 -38.3236522 -30.0196957  -6.3157391 
##          19          20          21          22          23          24 
##  -0.8117826  12.6921739  22.5961304  29.8000870  39.1040435  53.2080000
## plot the residuals against the predictor rather than the response
plot(YearSince1900, resid(lm.r), col = "red", type = 'p', pch = 16, main = "Residual Plot")

## enter the data again, restricted to 1970 onward: two-variable quantitative data
YearSince1900 = c(70,74,78,82,86,90,94,98,102,106)
CPI = c(38.8,49.3,65.2,96.5,109.6,130.7,148.2,163.0,179.9,201.6)

## make the scatterplot
plot(YearSince1900, CPI, col = "purple", type ='p', pch = 16)

## calculate the linear regression model
lm.r = lm(CPI~YearSince1900)

## add the regression line to the scatterplot
abline(lm.r, col = "dark green")

## state the correlation coefficient
cor(YearSince1900, CPI)
## [1] 0.997492
## summary provides lots of information
summary(lm.r)
## 
## Call:
## lm(formula = CPI ~ YearSince1900)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9497 -2.5744  0.4158  2.9559  5.8982 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -287.6667    10.2706  -28.01 2.85e-09 ***
## YearSince1900    4.6130     0.1157   39.86 1.73e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.205 on 8 degrees of freedom
## Multiple R-squared:  0.995,  Adjusted R-squared:  0.9944 
## F-statistic:  1589 on 1 and 8 DF,  p-value: 1.726e-10
## look at the residuals
resid(lm.r)
##          1          2          3          4          5          6 
##  3.5545455 -4.3975758 -6.9496970  5.8981818  0.5460606  3.1939394 
##          7          8          9         10 
##  2.2418182 -1.4103030 -2.9624242  0.2854545
## plot the residuals against the predictor rather than the response
plot(YearSince1900, resid(lm.r), col = "red", type = 'p', pch = 16, main = "Residual Plot")

## predict the CPI for 2016 (YearSince1900 = 116); this extrapolates beyond the data
lm.r$coefficients[1]+lm.r$coefficients[2]*116
## (Intercept) 
##    247.4448
-287.6667+4.61303*116
## [1] 247.4448
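
The same prediction can also be obtained with `predict()`, which avoids retyping rounded coefficients. The sketch below refits the 1970-onward model from a data frame. The names `cpi`, `fit`, `new_year`, `pred_2016`, and `pred_int` are introduced here for illustration only. Because 116 lies outside the observed range of years, the result is an extrapolation and should be read with caution.

```r
# Data from 1970 onward, as entered above
cpi <- data.frame(
  YearSince1900 = c(70, 74, 78, 82, 86, 90, 94, 98, 102, 106),
  CPI = c(38.8, 49.3, 65.2, 96.5, 109.6, 130.7, 148.2, 163.0, 179.9, 201.6)
)

# Refit the linear model using the data frame
fit <- lm(CPI ~ YearSince1900, data = cpi)

# Point prediction for 2016 (YearSince1900 = 116)
new_year <- data.frame(YearSince1900 = 116)
pred_2016 <- unname(predict(fit, newdata = new_year))
print(pred_2016)

# A 95% prediction interval gives a sense of the uncertainty around that value
pred_int <- predict(fit, newdata = new_year, interval = "prediction")
print(pred_int)
```

The point prediction should agree with the manual calculation above (about 247.44).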