Import libraries
library(tidyverse)
library(readr)
library(ggplot2)
Question 1
Scatterplot of average life expectancy for the country in years vs
sum of personal and government expenditures.
# Get the data
who_data <- read_csv("https://raw.githubusercontent.com/Kossi-Akplaka/Data605_Computational_mathematics/main/data605/Week%2012/who.csv")
# Create a scatterplot of LifeExp against TotExp
ggplot(who_data, aes(x = TotExp, y = LifeExp)) +
geom_point() +
labs(x = "Total Expenditure (USD)", y = "Life Expectancy (years)",
title = "Scatterplot of Life Expectancy vs. Total Expenditure")

Run simple linear regression
lm_model <- lm(LifeExp ~ TotExp, data = who_data)
summary(lm_model)
##
## Call:
## lm(formula = LifeExp ~ TotExp, data = who_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.764 -4.778 3.154 7.116 13.292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.475e+01 7.535e-01 85.933 < 2e-16 ***
## TotExp 6.297e-05 7.795e-06 8.079 7.71e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.371 on 188 degrees of freedom
## Multiple R-squared: 0.2577, Adjusted R-squared: 0.2537
## F-statistic: 65.26 on 1 and 188 DF, p-value: 7.714e-14
Provide explanation of F statistics, R^2, standard error,and
p-values only, and Discuss whether the assumptions of simple linear
regression met.
F-statistic: Overall significance of the regression model. In this
case, the F-statistic is 65.26 with a very low p-value (7.714e-14),
indicating that the regression model is statistically significant.
R-squared : Measures the proportion of variance in the dependent
variable (LifeExp) that is explained by the independent variable
(TotExp). Here, R-squared is 0.2577, which means that approximately
25.77% of the variability in life expectancy is explained by total
expenditure on healthcare.
Residual standard error: This represents the standard deviation of
the residuals, which are the differences between the observed and
predicted values. In this case, it’s approximately 9.371.
P-values: The p-value associated with the coefficient of TotExp is
7.71e-14, indicating that total expenditure on healthcare is
significantly associated with life expectancy.
The relationship doesn’t appear linear.
Question 2
# Transform variables
who_data_transformed <- who_data %>%
mutate(LifeExp_transformed = LifeExp^4.6,
TotExp_transformed = TotExp^0.06)
# Plot transformed variables
ggplot(who_data_transformed, aes(x = TotExp_transformed, y = LifeExp_transformed)) +
geom_point() +
labs(x = "Transformed Total Expenditure", y = "Transformed Life Expectancy",
title = "Scatterplot of Transformed Life Expectancy vs. Transformed Total Expenditure")

# Run simple linear regression with transformed variables
lm_model_transformed <- lm(LifeExp_transformed ~ TotExp_transformed, data = who_data_transformed)
summary(lm_model_transformed)
##
## Call:
## lm(formula = LifeExp_transformed ~ TotExp_transformed, data = who_data_transformed)
##
## Residuals:
## Min 1Q Median 3Q Max
## -308616089 -53978977 13697187 59139231 211951764
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -736527910 46817945 -15.73 <2e-16 ***
## TotExp_transformed 620060216 27518940 22.53 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 90490000 on 188 degrees of freedom
## Multiple R-squared: 0.7298, Adjusted R-squared: 0.7283
## F-statistic: 507.7 on 1 and 188 DF, p-value: < 2.2e-16
The R-squared value for the model using the transformed variables
(0.7298) is higher than that of the original model (0.2577). This
suggests that the model using the transformed variables explains more
variability in the data.
Question 3
# Coefficients
intercept <- -736527910
coef_totexp <- 620060216
# Values of TotExp_transformed
totexp_1 <- 1.5
totexp_2 <- 2.5
# Forecast life expectancy
lifeexp_1 <- intercept + coef_totexp * totexp_1
lifeexp_2 <- intercept + coef_totexp * totexp_2
# Print forecasts
print(paste("Forecasted life expectancy when TotExp^.06 = 1.5:", round(lifeexp_1, digits = 2)))
## [1] "Forecasted life expectancy when TotExp^.06 = 1.5: 193562414"
print(paste("Forecasted life expectancy when TotExp^.06 = 2.5:", round(lifeexp_2, digits = 2)))
## [1] "Forecasted life expectancy when TotExp^.06 = 2.5: 813622630"
Question 4
Build the following multiple regression model and interpret the F
Statistics, R^2, standard error, and p-values. How good is the
model?
# Run multiple linear regression
multiple_lm_model <- lm(LifeExp ~ PropMD + TotExp + PropMD * TotExp, data = who_data)
# Summary of the regression model
summary(multiple_lm_model)
##
## Call:
## lm(formula = LifeExp ~ PropMD + TotExp + PropMD * TotExp, data = who_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.320 -4.132 2.098 6.540 13.074
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.277e+01 7.956e-01 78.899 < 2e-16 ***
## PropMD 1.497e+03 2.788e+02 5.371 2.32e-07 ***
## TotExp 7.233e-05 8.982e-06 8.053 9.39e-14 ***
## PropMD:TotExp -6.026e-03 1.472e-03 -4.093 6.35e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.765 on 186 degrees of freedom
## Multiple R-squared: 0.3574, Adjusted R-squared: 0.3471
## F-statistic: 34.49 on 3 and 186 DF, p-value: < 2.2e-16
Based on these results, the model appears to be statistically
significant, as indicated by the low p-value of the F-statistic.
However, the R-squared value of 0.3574 suggests that the model explains
only about 35.74% of the variability in life expectancy, leaving a
considerable amount of variability unexplained. Therefore, while the
model is statistically significant, it may not be considered
particularly good in terms of explaining life expectancy
variability.
Question 5
# Coefficients
intercept <- 62.77
coef_propmd <- 1497
coef_totexp <- 0.00007233
coef_propmd_totexp <- -0.006026
# Values
propmd <- 0.03
totexp <- 14
# Forecast life expectancy
lifeexp_forecast <- intercept + coef_propmd * propmd + coef_totexp * totexp + coef_propmd_totexp * propmd * totexp
# Print forecast
print(paste("Forecasted life expectancy:", round(lifeexp_forecast, digits = 2)))
## [1] "Forecasted life expectancy: 107.68"
The forecasted life expectancy of 107.68 years seems extremely high
and unrealistic. …
LS0tDQp0aXRsZTogIldlZWsgMTIiDQphdXRob3I6ICJLb3NzaSBBa3BsYWthIg0KZGF0ZTogImByIFN5cy5EYXRlKClgIg0Kb3V0cHV0OiBvcGVuaW50cm86OmxhYl9yZXBvcnQNCi0tLQ0KDQojIyBJbXBvcnQgbGlicmFyaWVzDQoNCmBgYHtyIG1lc3NhZ2U9RkFMU0UsIHdhcm5pbmc9RkFMU0V9DQpsaWJyYXJ5KHRpZHl2ZXJzZSkNCmxpYnJhcnkocmVhZHIpDQpsaWJyYXJ5KGdncGxvdDIpDQpgYGANCg0KDQojIyBRdWVzdGlvbiAxDQoNCiMjIyBTY2F0dGVycGxvdCBvZiBhdmVyYWdlIGxpZmUgZXhwZWN0YW5jeSBmb3IgdGhlIGNvdW50cnkgaW4geWVhcnMgdnMgc3VtIG9mIHBlcnNvbmFsIGFuZCBnb3Zlcm5tZW50IGV4cGVuZGl0dXJlcy4NCg0KYGBge3IgbWVzc2FnZT1GQUxTRSwgd2FybmluZz1GQUxTRX0NCiMgR2V0IHRoZSBkYXRhDQp3aG9fZGF0YSA8LSByZWFkX2NzdigiaHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL0tvc3NpLUFrcGxha2EvRGF0YTYwNV9Db21wdXRhdGlvbmFsX21hdGhlbWF0aWNzL21haW4vZGF0YTYwNS9XZWVrJTIwMTIvd2hvLmNzdiIpDQoNCiMgQ3JlYXRlIGEgc2NhdHRlcnBsb3Qgb2YgTGlmZUV4cCBhZ2FpbnN0IFRvdEV4cA0KZ2dwbG90KHdob19kYXRhLCBhZXMoeCA9IFRvdEV4cCwgeSA9IExpZmVFeHApKSArDQogIGdlb21fcG9pbnQoKSArDQogIGxhYnMoeCA9ICJUb3RhbCBFeHBlbmRpdHVyZSAoVVNEKSIsIHkgPSAiTGlmZSBFeHBlY3RhbmN5ICh5ZWFycykiLA0KICAgICAgIHRpdGxlID0gIlNjYXR0ZXJwbG90IG9mIExpZmUgRXhwZWN0YW5jeSB2cy4gVG90YWwgRXhwZW5kaXR1cmUiKQ0KYGBgDQoNCiMjIyBSdW4gc2ltcGxlIGxpbmVhciByZWdyZXNzaW9uDQoNCmBgYHtyfQ0KbG1fbW9kZWwgPC0gbG0oTGlmZUV4cCB+IFRvdEV4cCwgZGF0YSA9IHdob19kYXRhKQ0Kc3VtbWFyeShsbV9tb2RlbCkNCmBgYA0KIyMjIFByb3ZpZGUgZXhwbGFuYXRpb24gb2YgRiBzdGF0aXN0aWNzLCBSXjIsIHN0YW5kYXJkIGVycm9yLGFuZCBwLXZhbHVlcyBvbmx5LCBhbmQgRGlzY3VzcyB3aGV0aGVyIHRoZSBhc3N1bXB0aW9ucyBvZiBzaW1wbGUgbGluZWFyIHJlZ3Jlc3Npb24gbWV0Lg0KDQpGLXN0YXRpc3RpYzogT3ZlcmFsbCBzaWduaWZpY2FuY2Ugb2YgdGhlIHJlZ3Jlc3Npb24gbW9kZWwuIEluIHRoaXMgY2FzZSwgdGhlIEYtc3RhdGlzdGljIGlzIDY1LjI2IHdpdGggYSB2ZXJ5IGxvdyBwLXZhbHVlICg3LjcxNGUtMTQpLCBpbmRpY2F0aW5nIHRoYXQgdGhlIHJlZ3Jlc3Npb24gbW9kZWwgaXMgc3RhdGlzdGljYWxseSBzaWduaWZpY2FudC4NCg0KUi1zcXVhcmVkIDogTWVhc3VyZXMgdGhlIHByb3BvcnRpb24gb2YgdmFyaWFuY2UgaW4gdGhlIGRlcGVuZGVudCB2YXJpYWJsZSAoTGlmZUV4cCkgdGhhdCBpcyBleHBsYWluZWQgYnkgdGhlIGluZGVwZW5kZW50IHZhcmlhYmxlIChUb3RFeHApLiBIZXJlLCBSLXNxdWFyZWQgaXMgMC4yNTc3LCB3aGljaCBtZWFucyB0aGF0IGFwcHJveGltYXRlbHkgMjUuNzclIG9mIHRoZSB2YXJpYWJpbGl0eSBpbiBsaWZlIGV4cGVjdGFuY3kgaXMgZXhwbGFpbmVkIGJ5IHRvdGFsIGV4cGVuZGl0dXJlIG9uIGhlYWx0aGNhcmUuDQoNClJlc2lkdWFsIHN0YW5kYXJkIGVycm9yOiBUaGlzIHJlcHJlc2VudHMgdGhlIHN0YW5kYXJkIGRldmlhdGlvbiBvZiB0aGUgcmVzaWR1YWxzLCB3aGljaCBhcmUgdGhlIGRpZmZlcmVuY2VzIGJldHdlZW4gdGhlIG9ic2VydmVkIGFuZCBwcmVkaWN0ZWQgdmFsdWVzLiBJbiB0aGlzIGNhc2UsIGl0J3MgYXBwcm94aW1hdGVseSA5LjM3MS4NCg0KUC12YWx1ZXM6IFRoZSBwLXZhbHVlIGFzc29jaWF0ZWQgd2l0aCB0aGUgY29lZmZpY2llbnQgb2YgVG90RXhwIGlzIDcuNzFlLTE0LCBpbmRpY2F0aW5nIHRoYXQgdG90YWwgZXhwZW5kaXR1cmUgb24gaGVhbHRoY2FyZSBpcyBzaWduaWZpY2FudGx5IGFzc29jaWF0ZWQgd2l0aCBsaWZlIGV4cGVjdGFuY3kuDQoNClRoZSByZWxhdGlvbnNoaXAgZG9lc24ndCBhcHBlYXIgbGluZWFyLg0KDQojIyBRdWVzdGlvbiAyDQoNCmBgYHtyfQ0KIyBUcmFuc2Zvcm0gdmFyaWFibGVzDQp3aG9fZGF0YV90cmFuc2Zvcm1lZCA8LSB3aG9fZGF0YSAlPiUNCiAgbXV0YXRlKExpZmVFeHBfdHJhbnNmb3JtZWQgPSBMaWZlRXhwXjQuNiwNCiAgICAgICAgIFRvdEV4cF90cmFuc2Zvcm1lZCA9IFRvdEV4cF4wLjA2KQ0KDQojIFBsb3QgdHJhbnNmb3JtZWQgdmFyaWFibGVzDQpnZ3Bsb3Qod2hvX2RhdGFfdHJhbnNmb3JtZWQsIGFlcyh4ID0gVG90RXhwX3RyYW5zZm9ybWVkLCB5ID0gTGlmZUV4cF90cmFuc2Zvcm1lZCkpICsNCiAgZ2VvbV9wb2ludCgpICsNCiAgbGFicyh4ID0gIlRyYW5zZm9ybWVkIFRvdGFsIEV4cGVuZGl0dXJlIiwgeSA9ICJUcmFuc2Zvcm1lZCBMaWZlIEV4cGVjdGFuY3kiLA0KICAgICAgIHRpdGxlID0gIlNjYXR0ZXJwbG90IG9mIFRyYW5zZm9ybWVkIExpZmUgRXhwZWN0YW5jeSB2cy4gVHJhbnNmb3JtZWQgVG90YWwgRXhwZW5kaXR1cmUiKQ0KDQpgYGANCg0KYGBge3J9DQojIFJ1biBzaW1wbGUgbGluZWFyIHJlZ3Jlc3Npb24gd2l0aCB0cmFuc2Zvcm1lZCB2YXJpYWJsZXMNCmxtX21vZGVsX3RyYW5zZm9ybWVkIDwtIGxtKExpZmVFeHBfdHJhbnNmb3JtZWQgfiBUb3RFeHBfdHJhbnNmb3JtZWQsIGRhdGEgPSB3aG9fZGF0YV90cmFuc2Zvcm1lZCkNCnN1bW1hcnkobG1fbW9kZWxfdHJhbnNmb3JtZWQpDQpgYGANClRoZSBSLXNxdWFyZWQgdmFsdWUgZm9yIHRoZSBtb2RlbCB1c2luZyB0aGUgdHJhbnNmb3JtZWQgdmFyaWFibGVzICgwLjcyOTgpIGlzIGhpZ2hlciB0aGFuIHRoYXQgb2YgdGhlIG9yaWdpbmFsIG1vZGVsICgwLjI1NzcpLiBUaGlzIHN1Z2dlc3RzIHRoYXQgdGhlIG1vZGVsIHVzaW5nIHRoZSB0cmFuc2Zvcm1lZCB2YXJpYWJsZXMgZXhwbGFpbnMgbW9yZSB2YXJpYWJpbGl0eSBpbiB0aGUgZGF0YS4NCg0KIyMgUXVlc3Rpb24gMw0KDQpgYGB7cn0NCiMgQ29lZmZpY2llbnRzDQppbnRlcmNlcHQgPC0gLTczNjUyNzkxMA0KY29lZl90b3RleHAgPC0gNjIwMDYwMjE2DQoNCiMgVmFsdWVzIG9mIFRvdEV4cF90cmFuc2Zvcm1lZA0KdG90ZXhwXzEgPC0gMS41DQp0b3RleHBfMiA8LSAyLjUNCg0KIyBGb3JlY2FzdCBsaWZlIGV4cGVjdGFuY3kNCmxpZmVleHBfMSA8LSBpbnRlcmNlcHQgKyBjb2VmX3RvdGV4cCAqIHRvdGV4cF8xDQpsaWZlZXhwXzIgPC0gaW50ZXJjZXB0ICsgY29lZl90b3RleHAgKiB0b3RleHBfMg0KDQojIFByaW50IGZvcmVjYXN0cw0KcHJpbnQocGFzdGUoIkZvcmVjYXN0ZWQgbGlmZSBleHBlY3RhbmN5IHdoZW4gVG90RXhwXi4wNiA9IDEuNToiLCByb3VuZChsaWZlZXhwXzEsIGRpZ2l0cyA9IDIpKSkNCnByaW50KHBhc3RlKCJGb3JlY2FzdGVkIGxpZmUgZXhwZWN0YW5jeSB3aGVuIFRvdEV4cF4uMDYgPSAyLjU6Iiwgcm91bmQobGlmZWV4cF8yLCBkaWdpdHMgPSAyKSkpDQoNCmBgYA0KDQoNCiMjIFF1ZXN0aW9uIDQNCg0KQnVpbGQgdGhlIGZvbGxvd2luZyBtdWx0aXBsZSByZWdyZXNzaW9uIG1vZGVsIGFuZCBpbnRlcnByZXQgdGhlIEYgU3RhdGlzdGljcywgUl4yLCBzdGFuZGFyZCBlcnJvciwgYW5kIHAtdmFsdWVzLiBIb3cgZ29vZCBpcyB0aGUgbW9kZWw/DQoNCg0KYGBge3J9DQojIFJ1biBtdWx0aXBsZSBsaW5lYXIgcmVncmVzc2lvbg0KbXVsdGlwbGVfbG1fbW9kZWwgPC0gbG0oTGlmZUV4cCB+IFByb3BNRCArIFRvdEV4cCArIFByb3BNRCAqIFRvdEV4cCwgZGF0YSA9IHdob19kYXRhKQ0KDQojIFN1bW1hcnkgb2YgdGhlIHJlZ3Jlc3Npb24gbW9kZWwNCnN1bW1hcnkobXVsdGlwbGVfbG1fbW9kZWwpDQpgYGANCkJhc2VkIG9uIHRoZXNlIHJlc3VsdHMsIHRoZSBtb2RlbCBhcHBlYXJzIHRvIGJlIHN0YXRpc3RpY2FsbHkgc2lnbmlmaWNhbnQsIGFzIGluZGljYXRlZCBieSB0aGUgbG93IHAtdmFsdWUgb2YgdGhlIEYtc3RhdGlzdGljLiBIb3dldmVyLCB0aGUgUi1zcXVhcmVkIHZhbHVlIG9mIDAuMzU3NCBzdWdnZXN0cyB0aGF0IHRoZSBtb2RlbCBleHBsYWlucyBvbmx5IGFib3V0IDM1Ljc0JSBvZiB0aGUgdmFyaWFiaWxpdHkgaW4gbGlmZSBleHBlY3RhbmN5LCBsZWF2aW5nIGEgY29uc2lkZXJhYmxlIGFtb3VudCBvZiB2YXJpYWJpbGl0eSB1bmV4cGxhaW5lZC4gVGhlcmVmb3JlLCB3aGlsZSB0aGUgbW9kZWwgaXMgc3RhdGlzdGljYWxseSBzaWduaWZpY2FudCwgaXQgbWF5IG5vdCBiZSBjb25zaWRlcmVkIHBhcnRpY3VsYXJseSBnb29kIGluIHRlcm1zIG9mIGV4cGxhaW5pbmcgbGlmZSBleHBlY3RhbmN5IHZhcmlhYmlsaXR5Lg0KDQojIyBRdWVzdGlvbiA1DQoNCmBgYHtyfQ0KIyBDb2VmZmljaWVudHMNCmludGVyY2VwdCA8LSA2Mi43Nw0KY29lZl9wcm9wbWQgPC0gMTQ5Nw0KY29lZl90b3RleHAgPC0gMC4wMDAwNzIzMw0KY29lZl9wcm9wbWRfdG90ZXhwIDwtIC0wLjAwNjAyNg0KDQojIFZhbHVlcw0KcHJvcG1kIDwtIDAuMDMNCnRvdGV4cCA8LSAxNA0KDQojIEZvcmVjYXN0IGxpZmUgZXhwZWN0YW5jeQ0KbGlmZWV4cF9mb3JlY2FzdCA8LSBpbnRlcmNlcHQgKyBjb2VmX3Byb3BtZCAqIHByb3BtZCArIGNvZWZfdG90ZXhwICogdG90ZXhwICsgY29lZl9wcm9wbWRfdG90ZXhwICogcHJvcG1kICogdG90ZXhwDQoNCiMgUHJpbnQgZm9yZWNhc3QNCnByaW50KHBhc3RlKCJGb3JlY2FzdGVkIGxpZmUgZXhwZWN0YW5jeToiLCByb3VuZChsaWZlZXhwX2ZvcmVjYXN0LCBkaWdpdHMgPSAyKSkpDQoNCmBgYA0KDQpUaGUgZm9yZWNhc3RlZCBsaWZlIGV4cGVjdGFuY3kgb2YgMTA3LjY4IHllYXJzIHNlZW1zIGV4dHJlbWVseSBoaWdoIGFuZCB1bnJlYWxpc3RpYy4NCi4uLg0KDQo=