plot(x = df$TotExp, y = df$LifeExp)

1

model <- lm(LifeExp ~ TotExp, data=df)
summary(model)
## 
## Call:
## lm(formula = LifeExp ~ TotExp, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.764  -4.778   3.154   7.116  13.292 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 6.475e+01  7.535e-01  85.933  < 2e-16 ***
## TotExp      6.297e-05  7.795e-06   8.079 7.71e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.371 on 188 degrees of freedom
## Multiple R-squared:  0.2577, Adjusted R-squared:  0.2537 
## F-statistic: 65.26 on 1 and 188 DF,  p-value: 7.714e-14
# The slope's p-value (7.71e-14) is far below 0.05, so we can reject the null hypothesis that TotExp has no linear effect on LifeExp. The F-statistic (65.26 on 1 and 188 DF) leads to the same conclusion. However, the R-squared of 0.2577 means the model explains only about 26% of the variance in LifeExp, so the fit is weak even though the relationship is statistically significant.
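
With a single predictor, the overall F-test and the t-test on the slope are the same test, which is why both report a p-value of about 7.7e-14. The sketch below checks this using only numbers copied from the summary output above; the variable names are illustrative, and small differences come from rounding in the printed values.

```r
# Values copied from the summary output of LifeExp ~ TotExp.
t_slope  <- 8.079    # t value for TotExp
r2       <- 0.2577   # Multiple R-squared
df_resid <- 188      # residual degrees of freedom

# In simple linear regression, F equals the squared slope t-statistic.
f_from_t <- t_slope^2

# F can also be recovered from R-squared: F = R^2 / (1 - R^2) * residual df.
f_from_r2 <- (r2 / (1 - r2)) * df_resid

cat("F from t^2:", f_from_t, " F from R^2:", f_from_r2, "\n")
```

Both values land close to the reported F-statistic of 65.26.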

2

# Transform both variables in place; every model below uses these transformed columns.
df$LifeExp <- df$LifeExp^4.6
df$TotExp <- df$TotExp^0.06

plot(x = df$TotExp, y = df$LifeExp)

model <- lm(LifeExp ~ TotExp, data=df)
summary(model)
## 
## Call:
## lm(formula = LifeExp ~ TotExp, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -308616089  -53978977   13697187   59139231  211951764 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -736527910   46817945  -15.73   <2e-16 ***
## TotExp       620060216   27518940   22.53   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 90490000 on 188 degrees of freedom
## Multiple R-squared:  0.7298, Adjusted R-squared:  0.7283 
## F-statistic: 507.7 on 1 and 188 DF,  p-value: < 2.2e-16
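# The F-statistic is much larger in this model (507.7 vs. 65.26) and the p-value is smaller (< 2.2e-16), which suggests a stronger linear relationship after the transformation. R-squared rises from 0.2577 to 0.7298, so this model fits better than the previous one. The residual standard error (90490000) is on the LifeExp^4.6 scale and cannot be compared directly with the earlier 9.371.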

3

pred1 <- predict(model, newdata = data.frame(TotExp = 1.5))
pred2 <- predict(model, newdata = data.frame(TotExp = 2.5))
# Print the predictions
cat("Life expectancy for TotExp^.06 = 1.5: ", pred1^(1/4.6), "\n")
## Life expectancy for TotExp^.06 = 1.5:  63.31153
cat("Life expectancy for TotExp^.06 = 2.5: ", pred2^(1/4.6), "\n")
## Life expectancy for TotExp^.06 = 2.5:  86.50645
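
The same predictions can be reproduced by hand, which makes the back-transformation explicit: the model predicts LifeExp^4.6, so the result is raised to the power 1/4.6 to return to the original scale. This is a minimal sketch using the coefficients printed in the summary above; `predict_life_exp` is a hypothetical helper, not part of the original analysis.

```r
# Coefficients copied from the summary of the transformed model above.
b0 <- -736527910   # (Intercept)
b1 <-  620060216   # TotExp (already raised to the 0.06 power)

# Hypothetical helper: predict LifeExp^4.6, then undo the power transform.
predict_life_exp <- function(tot_exp_t) {
  (b0 + b1 * tot_exp_t)^(1 / 4.6)
}

# Transformed TotExp values of 1.5 and 2.5, as in the predictions above.
predict_life_exp(c(1.5, 2.5))
```

The results should agree with the `predict()` output above to within the rounding of the printed coefficients.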

4

model <- lm(LifeExp ~ PropMD + TotExp + PropMD*TotExp, data = df)
summary(model)
## 
## Call:
## lm(formula = LifeExp ~ PropMD + TotExp + PropMD * TotExp, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -296470018  -47729263   12183210   60285515  212311883 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -7.244e+08  5.083e+07 -14.253   <2e-16 ***
## PropMD         4.727e+10  2.258e+10   2.094   0.0376 *  
## TotExp         6.048e+08  3.023e+07  20.005   <2e-16 ***
## PropMD:TotExp -2.121e+10  1.131e+10  -1.876   0.0622 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 88520000 on 186 degrees of freedom
## Multiple R-squared:  0.7441, Adjusted R-squared:   0.74 
## F-statistic: 180.3 on 3 and 186 DF,  p-value: < 2.2e-16
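# This model has the highest R-squared so far (0.7441, adjusted 0.74, vs. 0.7298 for the previous model), but its F-statistic is lower (180.3 on 3 and 186 DF vs. 507.7 on 1 and 188 DF). The PropMD:TotExp interaction is not significant at the 0.05 level (p = 0.0622).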
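
Because the two models have different numbers of predictors, their overall F-statistics are not directly comparable. A partial F-test on the nested models is one way to ask whether PropMD and the interaction add explanatory power. The sketch below approximates that test from the rounded residual standard errors printed in the two summaries; the variable names are illustrative, and with both fitted models still in memory `anova()` would give the exact version.

```r
# Residual standard errors and residual degrees of freedom, copied from the
# two summary outputs above (rounded, so the result is approximate).
rse_small <- 90490000; df_small <- 188   # LifeExp ~ TotExp
rse_full  <- 88520000; df_full  <- 186   # LifeExp ~ PropMD * TotExp

# Residual sum of squares = RSE^2 * residual degrees of freedom.
rss_small <- rse_small^2 * df_small
rss_full  <- rse_full^2 * df_full

# Partial F-statistic for the two extra terms (PropMD and the interaction).
extra_terms <- df_small - df_full
f_partial <- ((rss_small - rss_full) / extra_terms) / (rss_full / df_full)
p_partial <- pf(f_partial, extra_terms, df_full, lower.tail = FALSE)

cat("Partial F:", f_partial, " p-value:", p_partial, "\n")
```

A small p-value here would indicate that the two extra terms jointly improve the fit, even though the overall F-statistic is lower than in the single-predictor model.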

5

pred1 <- predict(model, newdata = data.frame(TotExp = 14, PropMD = 0.03))
cat("Life expectancy for TotExp^.06 = 14: ", pred1^(1/4.6), "\n")
## Life expectancy for TotExp^.06 = 14:  66.97703

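The prediction does not seem realistic. A transformed TotExp of 14 is far larger than the values of 1.5 and 2.5 used in the earlier predictions, so the model is most likely extrapolating well beyond the data it was fit on, and the result should not be trusted.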
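
Breaking the prediction into its four terms shows how fragile it is. This is a sketch using the rounded coefficients from the summary above, so the total differs slightly from the `predict()` result; the variable names are illustrative.

```r
# Rounded coefficients copied from the interaction model summary above.
b0      <- -7.244e+08   # (Intercept)
b_md    <-  4.727e+10   # PropMD
b_exp   <-  6.048e+08   # TotExp (transformed)
b_inter <- -2.121e+10   # PropMD:TotExp

prop_md <- 0.03
tot_exp <- 14

# Contribution of each term on the LifeExp^4.6 scale.
terms <- c(
  intercept   = b0,
  prop_md     = b_md * prop_md,
  tot_exp     = b_exp * tot_exp,
  interaction = b_inter * prop_md * tot_exp
)
print(terms)

# Sum the terms and undo the power transform.
life_exp <- sum(terms)^(1 / 4.6)
cat("Approximate life expectancy:", life_exp, "\n")
```

The TotExp and interaction terms are each several billion with opposite signs and nearly cancel, so the final value of about 67 depends on a small difference between two very large numbers, which is typical of an unreliable extrapolation.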