plot(x = df$TotExp, y = df$LifeExp)
model <- lm(LifeExp ~ TotExp, data = df)
summary(model)
##
## Call:
## lm(formula = LifeExp ~ TotExp, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.764 -4.778 3.154 7.116 13.292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.475e+01 7.535e-01 85.933 < 2e-16 ***
## TotExp 6.297e-05 7.795e-06 8.079 7.71e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.371 on 188 degrees of freedom
## Multiple R-squared: 0.2577, Adjusted R-squared: 0.2537
## F-statistic: 65.26 on 1 and 188 DF, p-value: 7.714e-14
# The small p-value for TotExp (7.71e-14) lets us reject the null hypothesis that its coefficient is zero and conclude that the predictor has an effect on the response variable. The F-statistic (65.26 on 1 and 188 DF) supports this, but an R-squared of 0.2577 means the model explains only about 26% of the variance in LifeExp, so the linear fit itself is weak.
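The quantities discussed above can also be pulled out of the fitted object instead of being read off the printout. The following is a minimal sketch on simulated data; the data frame `toy` and its columns `x` and `y` are made up for illustration and are not the data in `df`. It also checks a property visible in the output above: with a single predictor, the F-statistic is the square of the slope's t value (8.079^2 is about 65.27, matching the reported 65.26 up to rounding).

```r
# Simulated stand-in for df (hypothetical values, for illustration only)
set.seed(42)
toy <- data.frame(x = runif(100, 0, 10))
toy$y <- 2 + 0.5 * toy$x + rnorm(100, sd = 1)

fit <- lm(y ~ x, data = toy)
s <- summary(fit)

# Coefficient table with columns Estimate, Std. Error, t value, Pr(>|t|)
coefs <- coef(s)
t_slope <- coefs["x", "t value"]
p_slope <- coefs["x", "Pr(>|t|)"]

# Overall F-statistic; its p-value is printed by summary() but not stored,
# so it is recomputed here from the F distribution
f <- s$fstatistic                      # named vector: value, numdf, dendf
p_model <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Share of variance explained by the model
r2 <- s$r.squared

# With one predictor, F equals t^2
all.equal(f[["value"]], t_slope^2)
```

Reading `r2` alongside the p-value is the point of the sketch: a tiny p-value says the slope is unlikely to be zero, while R-squared says how much of the variation the line actually accounts for.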
# Transform both variables in place; from here on df holds LifeExp^4.6 and TotExp^0.06
df$LifeExp <- df$LifeExp^4.6
df$TotExp <- df$TotExp^0.06
plot(x = df$TotExp, y = df$LifeExp)
model <- lm(LifeExp ~ TotExp, data = df)
summary(model)
##
## Call:
## lm(formula = LifeExp ~ TotExp, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -308616089 -53978977 13697187 59139231 211951764
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -736527910 46817945 -15.73 <2e-16 ***
## TotExp 620060216 27518940 22.53 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 90490000 on 188 degrees of freedom
## Multiple R-squared: 0.7298, Adjusted R-squared: 0.7283
## F-statistic: 507.7 on 1 and 188 DF, p-value: < 2.2e-16
# The F-statistic is much larger in this model (507.7 vs. 65.26) and the p-value is smaller, suggesting a much stronger linear relationship after the transformation. R-squared also rises from 0.2577 to 0.7298, so this model fits better than the previous one.
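# Predict on the transformed scale (newdata holds TotExp^0.06 values)
pred1 <- predict(model, newdata = data.frame(TotExp = 1.5))
pred2 <- predict(model, newdata = data.frame(TotExp = 2.5))
# Print the predictions, back-transformed to years with the 1/4.6 power
cat("Life expectancy for TotExp^.06 = 1.5: ", pred1^(1/4.6), "\n")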
## Life expectancy for TotExp^.06 = 1.5: 63.31153
cat("Life expectancy for TotExp^.06 = 2.5: ", pred2^(1/4.6), "\n")
## Life expectancy for TotExp^.06 = 2.5: 86.50645
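As a check on the two numbers above, the predictions can be reproduced by hand from the printed coefficients: evaluate the fitted line on the transformed scale, then undo the 4.6 power. The sketch below hard-codes the intercept and slope as they appear in the `summary(model)` output; the variable names are illustrative only.

```r
# Coefficients copied from the summary output of the transformed model
intercept <- -736527910
slope     <-  620060216

# The fitted value lives on the transformed scale, i.e. it estimates LifeExp^4.6
x_new <- c(1.5, 2.5)                  # values of TotExp^0.06
fit_transformed <- intercept + slope * x_new

# Back-transform to years by taking the 1/4.6 power
life_exp <- fit_transformed^(1 / 4.6)
round(life_exp, 2)
# approximately 63.31 and 86.51, in line with the predict() output above
```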
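# LifeExp and TotExp are still the transformed columns; PropMD * TotExp alone would expand to the same three terms
model <- lm(LifeExp ~ PropMD + TotExp + PropMD*TotExp, data = df)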
summary(model)
##
## Call:
## lm(formula = LifeExp ~ PropMD + TotExp + PropMD * TotExp, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -296470018 -47729263 12183210 60285515 212311883
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.244e+08 5.083e+07 -14.253 <2e-16 ***
## PropMD 4.727e+10 2.258e+10 2.094 0.0376 *
## TotExp 6.048e+08 3.023e+07 20.005 <2e-16 ***
## PropMD:TotExp -2.121e+10 1.131e+10 -1.876 0.0622 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 88520000 on 186 degrees of freedom
## Multiple R-squared: 0.7441, Adjusted R-squared: 0.74
## F-statistic: 180.3 on 3 and 186 DF, p-value: < 2.2e-16
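# This model has the highest R-squared of the three (0.7441, adjusted 0.74), but its F-statistic (180.3 on 3 and 186 DF) is not as strong as that of the previous model (507.7 on 1 and 188 DF). The interaction term PropMD:TotExp is also not significant at the 0.05 level (p = 0.0622).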
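The overall F-statistic tests all predictors jointly against an intercept-only model, so adding terms that contribute little can lower it even while R-squared rises. One common alternative for comparing nested models is a partial F-test with `anova()`, read together with adjusted R-squared. The sketch below uses simulated data; `toy`, `x1`, and `x2` are hypothetical names and not columns of `df`.

```r
# Simulated data in which only x1 truly affects y (illustration only)
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 + rnorm(n)

small <- lm(y ~ x1, data = toy)
big   <- lm(y ~ x1 * x2, data = toy)   # adds x2 and the x1:x2 interaction

# Partial F-test: do the extra terms reduce residual variance significantly?
cmp <- anova(small, big)
p_extra <- cmp[["Pr(>F)"]][2]

# Adjusted R-squared penalises the extra parameters
adj <- c(small = summary(small)$adj.r.squared,
         big   = summary(big)$adj.r.squared)

# Overall F-statistics of each model, the figure printed by summary()
f_overall <- c(small = summary(small)$fstatistic[["value"]],
               big   = summary(big)$fstatistic[["value"]])

p_extra
adj
f_overall
```

A large `p_extra` would indicate that the added terms do not improve the fit beyond what chance would give, which is the question the overall F-statistic does not answer directly.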
pred1 <- predict(model, newdata = data.frame(TotExp = 14, PropMD = 0.03))
cat("Life expectancy for TotExp^.06 = 14: ", pred1^(1/4.6), "\n")
## Life expectancy for TotExp^.06 = 14: 66.97703
The prediction does not seem realistic, because a TotExp^.06 value of 14 is far larger than anything in the data the model was fit on. The model is extrapolating to an input it has never seen, so the estimate should not be trusted.
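To see how far this input pushes the model, the prediction can be decomposed term by term from the printed coefficients. Because `summary()` shows only four significant digits and the two largest terms nearly cancel, this hand calculation only approximately reproduces the `predict()` result. It is a rough sketch, and the variable names are illustrative.

```r
# Coefficients as printed by summary(model), rounded to four significant digits
b0  <- -7.244e8    # (Intercept)
b_p <-  4.727e10   # PropMD
b_t <-  6.048e8    # TotExp (on the TotExp^0.06 scale)
b_i <- -2.121e10   # PropMD:TotExp

prop_md <- 0.03
tot_exp <- 14

# Contribution of each term on the LifeExp^4.6 scale
terms <- c(intercept   = b0,
           prop_md     = b_p * prop_md,
           tot_exp     = b_t * tot_exp,
           interaction = b_i * prop_md * tot_exp)
terms

# Sum the contributions, then back-transform to years
sum(terms)^(1 / 4.6)
```

The TotExp term (about 8.47e9) and the interaction term (about -8.91e9) are each more than ten times the size of the intercept and almost offset one another. Ignoring correlations between the estimates, the interaction coefficient's standard error (1.131e10) corresponds to roughly 4.8e9 at this input, far more than the total of about 2.5e8, which is one way to see how fragile the prediction is.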