The time series I selected for this assignment is from data.world, originally sourced from the World Bank. The focal variable is life expectancy at birth, reported at the country level for the years 1960 to 2015. Variables in the data set are country name, country code, region (seven regions: South Asia, Europe/Central Asia, Middle East/North Africa, East Asia/Pacific, Sub-Saharan Africa, Latin America/Caribbean, North America), income group (four groups: low income, lower middle income, upper middle income, high income), and the reported life expectancy at birth for each year.
A scatterplot of average life expectancy against year shows a general positive trend and a slight decrease in the variation of life expectancy over time. When paneled by region, the impact of warfare in particular regions can be discerned. Boxplots of life expectancy by region and by income show marked differences in typical life expectancy between regions, and an increase in life expectancy with higher incomes.
## Warning: Removed 1230 rows containing missing values (geom_point).
## Warning: Removed 1230 rows containing missing values (geom_point).
## Warning: Removed 1230 rows containing non-finite values (stat_boxplot).
## Warning: Removed 1230 rows containing non-finite values (stat_boxplot).
Given the rather obvious linear relationship observed in the scatter plots above, we first explore linear regression models to understand the relationship between life expectancy and birth year, with consideration for country, region, and level of income.
We fit a simple linear regression of life expectancy on birth year; the residual plot shows no discernible pattern, and the normal QQ plot shows a reasonable fit, with some deviation in the upper tail (indicating positive skew). The effect size is fairly strong: a one-year increase in birth year is associated with an increase of 0.31 years in life expectancy, on average. Although statistically significant (F-statistic p-value < 0.001), this model is relatively weak, with an R-squared of only 0.1888.
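As a rough check on what the fitted line implies, the sketch below plugs the first and last years of the data into the simple model. The coefficients are copied from the `summary(lifemodel)` output reported in this document; the helper name `predict_simple` is hypothetical and used only for illustration.

```r
# Coefficients copied from the summary(lifemodel) output in this report
b0 <- -5.516e+02   # intercept
b1 <- 3.095e-01    # change in life expectancy per birth year

# Illustrative helper (not part of the original analysis)
predict_simple <- function(year) b0 + b1 * year

pred <- predict_simple(c(1960, 2015))
gain <- diff(pred)   # implied gain over the 55-year span
print(round(c(pred, gain = gain), 1))
```

With these rounded coefficients the line rises from about 55 years in 1960 to about 72 years in 2015, a gain of roughly 17 years, which is what a slope of 0.31 years per birth year implies over 55 years.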
We next fit a multiple linear regression model with life expectancy as the dependent variable and birth year, region, and income level as independent variables. The regression diagnostic plots (residual plot and normal QQ plot) are both acceptable, so this is a reasonable model. The model is statistically significant (F-statistic p-value < 0.001), as are all the variables in the model. The model is quite strong: with an R-squared of 0.8052 (adjusted R-squared 0.8050), about 80% of the variability in life expectancy is explained by birth year, region, and income level taken together. In this model, each additional birth year is associated, on average, with 0.30 years longer life expectancy (all else held equal), or roughly 3 years per decade. The income effect sizes are quite interesting, as they provide insight into the average life expectancy across incomes, holding all other variables (year of birth and region) constant. Lower middle income has, on average, 4.6 years longer life expectancy than low income; upper middle income has 9.7 years longer life expectancy than low income; and high income has 14.9 years longer life expectancy than low income.
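To make the "all else held equal" reading concrete, the following sketch adds up the relevant coefficients for two illustrative cases in the year 2000. The coefficients are copied from the `summary(lifemodel2)` output in this document; the helper `predict_multi` is hypothetical, and the reference levels are inferred from which coefficients are absent from that output.

```r
# Coefficients copied from the summary(lifemodel2) output in this report.
# Reference levels (absorbed into the intercept) appear to be
# East Asia/Pacific and low income, since neither has a printed coefficient.
coefs <- c(
  intercept        = -5.416e+02,
  year             =  3.008e-01,
  SubSaharanAfrica = -8.437e+00,
  NorthAmerica     =  4.188e+00,
  HighIncome       =  1.491e+01
)

# Illustrative helper: year effect plus whichever indicator terms apply
predict_multi <- function(year, terms = character(0)) {
  unname(coefs["intercept"] + coefs["year"] * year + sum(coefs[terms]))
}

low_ssa <- predict_multi(2000, "SubSaharanAfrica")               # low income is the baseline
high_na <- predict_multi(2000, c("NorthAmerica", "HighIncome"))
print(round(c(low_ssa = low_ssa, high_na = high_na), 1))
```

Under these assumptions the model predicts about 51.6 years for a low-income Sub-Saharan African country and about 79.1 years for a high-income North American country, both for birth year 2000.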
The final multiple linear regression model again has life expectancy as the dependent variable and includes birth year and income level as independent variables; instead of region, each country is included as an independent variable. The regression diagnostic plots are a bit worrisome, with some deviations from equal variance in the residual plot as well as tail deviations in the normal QQ plot. The model is statistically significant (F-statistic p-value < 0.001), as are birth year, the income levels, and most (though not all) of the country indicators. The model is very strong: with an R-squared of 0.9261 (adjusted R-squared 0.9247), about 93% of the variability in life expectancy is explained by birth year, country, and income level taken together. From an effect size perspective, there are over 200 countries in the data set, making the country terms difficult to interpret and assess well. Including each country means the year and income effects are estimated relative to country-level differences rather than at the aggregated region level. As in the previous model, each additional birth year is associated, on average, with 0.30 years longer life expectancy (all else held equal). The income effect sizes are quite different in this model: lower middle income has, on average, 1.7 years longer life expectancy than low income; upper middle income has 21.7 years longer life expectancy than low income; and high income has 26.1 years longer life expectancy than low income. These income estimates should be read with caution: R reports three coefficients as not defined because of singularities, which suggests that income group is collinear with the country indicators (each country appears to sit in a single income group across all years), so the income effects cannot be cleanly separated from the country effects.
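The summary output for the country-level model reports three coefficients as not defined because of singularities. One plausible cause, assuming each country sits in a single income group for all years, is that each income indicator is then an exact sum of country indicators. The toy example below uses entirely simulated data (four made-up countries, two income groups) to show how `lm` behaves in that situation; it is a sketch of the mechanism, not a re-analysis of the report's data.

```r
set.seed(42)

# Simulated toy panel: four made-up countries, each fixed in one income group
toy <- expand.grid(Year = 1960:1969, Country = c("A", "B", "C", "D"))
income_of <- c(A = "Low", B = "Low", C = "High", D = "High")
toy$Income <- factor(income_of[as.character(toy$Country)],
                     levels = c("Low", "High"))
toy$LifeExpectancy <- 50 + 0.3 * (toy$Year - 1960) +
  ifelse(toy$Income == "High", 15, 0) + rnorm(nrow(toy))

# IncomeHigh equals CountryC + CountryD, so one coefficient cannot be estimated
fit <- lm(LifeExpectancy ~ Year + Income + Country, data = toy)
print(coef(fit))
n_undefined <- sum(is.na(coef(fit)))
```

In this toy fit the last country indicator comes back as `NA`, and the income coefficient absorbs that country's level. The same mechanism would make the income estimates in the country-level model depend on which country terms were dropped.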
lifemodel <- lm(LifeExpectancy ~ Year, data = Life)
summary(lifemodel)
##
## Call:
## lm(formula = LifeExpectancy ~ Year, data = Life)
##
## Residuals:
## Min 1Q Median 3Q Max
## -40.985 -7.717 2.763 8.290 18.560
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.516e+02 1.220e+01 -45.19 <2e-16 ***
## Year 3.095e-01 6.139e-03 50.41 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.37 on 10920 degrees of freedom
## (1230 observations deleted due to missingness)
## Multiple R-squared: 0.1888, Adjusted R-squared: 0.1887
## F-statistic: 2541 on 1 and 10920 DF, p-value: < 2.2e-16
plot(lifemodel, which = 1:2)
lifemodel2 <- lm(LifeExpectancy ~ Year + Region + Income, data = Life)
summary(lifemodel2)
##
## Call:
## lm(formula = LifeExpectancy ~ Year + Region + Income, data = Life)
##
## Residuals:
## Min 1Q Median 3Q Max
## -38.368 -2.486 0.390 3.118 14.911
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.416e+02 5.986e+00 -90.47 < 2e-16 ***
## Year 3.008e-01 3.011e-03 99.90 < 2e-16 ***
## RegionEuropeCentralAsia 3.478e+00 1.594e-01 21.82 < 2e-16 ***
## RegionLatinAmericaCaribbean 1.143e+00 1.734e-01 6.59 4.6e-11 ***
## RegionMiddleEastNorthAfrica -5.677e-01 1.938e-01 -2.93 0.0034 **
## RegionNorthAmerica 4.188e+00 4.659e-01 8.99 < 2e-16 ***
## RegionSouthAsia -3.916e+00 2.765e-01 -14.16 < 2e-16 ***
## RegionSubSaharanAfrica -8.437e+00 1.874e-01 -45.03 < 2e-16 ***
## IncomeLowerMiddleIncome 4.629e+00 1.814e-01 25.52 < 2e-16 ***
## IncomeUpperMiddleIncome 9.700e+00 2.006e-01 48.37 < 2e-16 ***
## IncomeHighIncome 1.491e+01 2.117e-01 70.41 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.082 on 10911 degrees of freedom
## (1230 observations deleted due to missingness)
## Multiple R-squared: 0.8052, Adjusted R-squared: 0.805
## F-statistic: 4510 on 10 and 10911 DF, p-value: < 2.2e-16
plot(lifemodel2, which = 1:2)
lifemodel3 <- lm(LifeExpectancy ~ Year + Income + Country, data = Life)
summary(lifemodel3)
##
## Call:
## lm(formula = LifeExpectancy ~ Year + Income + Country, data = Life)
##
## Residuals:
## Min 1Q Median 3Q Max
## -26.4939 -1.5225 0.2014 1.7082 12.0414
##
## Coefficients: (3 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.499e+02 3.775e+00 -145.691 < 2e-16
## Year 3.004e-01 1.887e-03 159.172 < 2e-16
## IncomeLowerMiddleIncome 1.658e+00 5.969e-01 2.778 0.005474
## IncomeUpperMiddleIncome 2.173e+01 5.969e-01 36.408 < 2e-16
## IncomeHighIncome 2.611e+01 5.969e-01 43.738 < 2e-16
## CountryAlbania 2.518e+00 5.969e-01 4.219 2.47e-05
## CountryAlgeria -6.565e+00 5.969e-01 -10.998 < 2e-16
## CountryAngola -6.732e+00 5.969e-01 -11.280 < 2e-16
## CountryAntiguaandBarbuda -3.191e+00 5.969e-01 -5.346 9.18e-08
## CountryArgentina 2.000e+00 5.969e-01 3.350 0.000810
## CountryArmenia 2.162e+01 5.969e-01 36.222 < 2e-16
## CountryAruba -1.151e+00 5.969e-01 -1.929 0.053777
## CountryAustralia 2.954e+00 5.969e-01 4.950 7.55e-07
## CountryAustria 1.638e+00 5.969e-01 2.744 0.006071
## CountryAzerbaijan -3.002e+00 5.969e-01 -5.029 5.01e-07
## CountryBahamas -3.505e+00 5.969e-01 -5.873 4.40e-09
## CountryBahrain -3.505e+00 5.969e-01 -5.872 4.42e-09
## CountryBangladesh 9.139e+00 5.969e-01 15.312 < 2e-16
## CountryBarbados -3.442e+00 5.969e-01 -5.766 8.33e-09
## CountryBelarus 1.034e+00 5.969e-01 1.732 0.083237
## CountryBelgium 1.966e+00 5.969e-01 3.295 0.000989
## CountryBelize -7.076e-01 5.969e-01 -1.185 0.235866
## CountryBenin 3.038e+00 5.969e-01 5.089 3.65e-07
## CountryBermuda 4.216e-01 8.231e-01 0.512 0.608530
## CountryBhutan 2.057e+00 5.969e-01 3.447 0.000569
## CountryBolivia 5.664e+00 5.969e-01 9.490 < 2e-16
## CountryBosniaandHerzegovina 1.639e+00 5.969e-01 2.745 0.006058
## CountryBotswana -1.154e+01 5.969e-01 -19.329 < 2e-16
## CountryBrazil -3.866e+00 5.969e-01 -6.477 9.77e-11
## CountryBruneiDarussalam -1.419e+00 5.969e-01 -2.377 0.017475
## CountryBulgaria 2.809e+00 5.969e-01 4.706 2.56e-06
## CountryBurkinaFaso 7.441e-02 5.969e-01 0.125 0.900793
## CountryBurundi 1.330e+00 5.969e-01 2.228 0.025930
## CountryCaboVerde 1.426e+01 5.969e-01 23.888 < 2e-16
## CountryCambodia 1.367e-01 5.969e-01 0.229 0.818904
## CountryCameroon 1.705e+00 5.969e-01 2.857 0.004290
## CountryCanada 3.331e+00 5.969e-01 5.581 2.45e-08
## CountryCentralAfricanRepublic -1.519e+00 5.969e-01 -2.545 0.010935
## CountryChad -1.732e+00 5.969e-01 -2.903 0.003709
## CountryChannelIslands 1.961e+00 5.969e-01 3.285 0.001024
## CountryChile -2.380e+00 5.969e-01 -3.988 6.71e-05
## CountryChina -3.040e+00 5.969e-01 -5.093 3.59e-07
## CountryColombia -2.003e+00 5.969e-01 -3.356 0.000794
## CountryComoros 6.554e+00 5.969e-01 10.980 < 2e-16
## CountryCongoDemRep 1.426e+00 5.969e-01 2.389 0.016896
## CountryCongoRep 5.654e+00 5.969e-01 9.473 < 2e-16
## CountryCostaRica 4.082e+00 5.969e-01 6.839 8.42e-12
## CountryCote_dIvoire -1.001e+00 5.969e-01 -1.677 0.093669
## CountryCroatia 2.605e+00 5.969e-01 4.365 1.29e-05
## CountryCuba 5.015e+00 5.969e-01 8.402 < 2e-16
## CountryCuracao -3.594e+00 1.357e+00 -2.648 0.008117
## CountryCyprus 2.452e+00 5.969e-01 4.108 4.02e-05
## CountryCzechRepublic -4.708e-01 5.969e-01 -0.789 0.430237
## CountryDenmark 2.165e+00 5.969e-01 3.627 0.000288
## CountryDjibouti 5.608e+00 5.969e-01 9.396 < 2e-16
## CountryDominica 3.784e+00 1.474e+00 2.567 0.010270
## CountryDominicanRepublic -3.529e+00 5.969e-01 -5.913 3.46e-09
## CountryEcuador -2.501e+00 5.969e-01 -4.189 2.82e-05
## CountryEgypt 1.273e+01 5.969e-01 21.326 < 2e-16
## CountryElSalvador 1.331e+01 5.969e-01 22.305 < 2e-16
## CountryEquatorialGuinea -2.185e+01 5.969e-01 -36.604 < 2e-16
## CountryEritrea 1.750e+00 5.969e-01 2.931 0.003381
## CountryEstonia -2.638e+00 5.969e-01 -4.420 9.96e-06
## CountryEthiopia 1.481e+00 5.969e-01 2.481 0.013131
## CountryFaroeIslands 1.641e+00 7.002e-01 2.344 0.019093
## CountryFiji -4.529e+00 5.969e-01 -7.587 3.54e-14
## CountryFinland 1.481e+00 5.969e-01 2.481 0.013119
## CountryFrance 2.811e+00 5.969e-01 4.710 2.50e-06
## CountryFrenchPolynesia -5.869e+00 5.969e-01 -9.833 < 2e-16
## CountryGabon -1.354e+01 5.969e-01 -22.690 < 2e-16
## CountryGambia 1.505e+00 5.969e-01 2.521 0.011722
## CountryGeorgia 2.107e+01 5.969e-01 35.308 < 2e-16
## CountryGermany 1.582e+00 5.969e-01 2.651 0.008038
## CountryGhana 5.633e+00 5.969e-01 9.437 < 2e-16
## CountryGreece 2.112e+00 5.969e-01 3.538 0.000405
## CountryGreenland -9.049e+00 6.749e-01 -13.409 < 2e-16
## CountryGrenada -1.306e+00 5.969e-01 -2.188 0.028695
## CountryGuam -2.034e+00 5.969e-01 -3.408 0.000657
## CountryGuatemala 1.180e+01 5.969e-01 19.769 < 2e-16
## CountryGuinea -1.075e+00 5.969e-01 -1.802 0.071633
## CountryGuineaBissau 2.627e-01 5.969e-01 0.440 0.659850
## CountryGuyana -5.337e+00 5.969e-01 -8.942 < 2e-16
## CountryHaiti 6.172e+00 5.969e-01 10.341 < 2e-16
## CountryHonduras 1.399e+01 5.969e-01 23.440 < 2e-16
## CountryHongKong 3.339e+00 5.969e-01 5.595 2.26e-08
## CountryHungary -2.613e+00 5.969e-01 -4.377 1.21e-05
## CountryIceland 4.544e+00 5.969e-01 7.613 2.91e-14
## CountryIndia 7.541e+00 5.969e-01 12.634 < 2e-16
## CountryIndonesia 1.240e+01 5.969e-01 20.780 < 2e-16
## CountryIran -7.685e+00 5.969e-01 -12.875 < 2e-16
## CountryIraq -5.739e+00 5.969e-01 -9.615 < 2e-16
## CountryIreland 1.460e+00 5.969e-01 2.446 0.014445
## CountryIsleofMan 8.688e-01 2.273e+00 0.382 0.702290
## CountryIsrael 2.495e+00 6.083e-01 4.102 4.13e-05
## CountryItaly 2.957e+00 5.969e-01 4.954 7.39e-07
## CountryJamaica 2.324e+00 5.969e-01 3.894 9.93e-05
## CountryJapan 4.230e+00 5.969e-01 7.087 1.46e-12
## CountryJordan 1.818e+01 5.969e-01 30.458 < 2e-16
## CountryKazakhstan -3.490e+00 5.969e-01 -5.847 5.15e-09
## CountryKenya 6.183e+00 5.969e-01 10.358 < 2e-16
## CountryKiribati 1.047e+01 5.969e-01 17.540 < 2e-16
## CountryKoreaDemPeoplesRep 1.717e+01 5.969e-01 28.758 < 2e-16
## CountryKoreaRep -3.804e+00 5.969e-01 -6.373 1.94e-10
## CountryKosovo 1.646e+01 6.808e-01 24.177 < 2e-16
## CountryKuwait -2.995e+00 5.969e-01 -5.017 5.33e-07
## CountryKyrgyzRepublic 1.585e+01 5.969e-01 26.558 < 2e-16
## CountryLaoPDR 4.714e+00 5.969e-01 7.899 3.10e-15
## CountryLatvia -3.013e+00 5.969e-01 -5.048 4.54e-07
## CountryLebanon 1.921e+00 5.969e-01 3.219 0.001290
## CountryLesotho 2.139e+00 5.969e-01 3.583 0.000341
## CountryLiberia 5.393e-01 5.969e-01 0.903 0.366295
## CountryLibya -4.580e+00 5.969e-01 -7.673 1.83e-14
## CountryLiechtenstein 1.784e+00 7.953e-01 2.243 0.024897
## CountryLithuania -1.947e+00 5.969e-01 -3.262 0.001110
## CountryLuxembourg 1.371e+00 5.969e-01 2.297 0.021626
## CountryMacao 7.330e-01 5.969e-01 1.228 0.219451
## CountryMacedonia 1.106e+00 5.969e-01 1.852 0.063991
## CountryMadagascar 5.110e+00 5.969e-01 8.561 < 2e-16
## CountryMalawi -1.673e+00 5.969e-01 -2.803 0.005068
## CountryMalaysia 3.154e-01 5.969e-01 0.528 0.597230
## CountryMaldives -9.954e+00 5.969e-01 -16.677 < 2e-16
## CountryMali -3.856e+00 5.969e-01 -6.461 1.09e-10
## CountryMalta 1.626e+00 5.969e-01 2.724 0.006460
## CountryMarshallIslands -2.907e+00 1.872e+00 -1.553 0.120395
## CountryMauritania 6.840e+00 5.969e-01 11.460 < 2e-16
## CountryMauritius -8.094e-01 5.969e-01 -1.356 0.175080
## CountryMexico -1.204e-01 5.969e-01 -0.202 0.840200
## CountryMicronesia 1.630e+01 5.969e-01 27.308 < 2e-16
## CountryMoldova 1.756e+01 5.969e-01 29.419 < 2e-16
## CountryMongolia 1.088e+01 5.969e-01 18.233 < 2e-16
## CountryMontenegro 3.432e+00 5.969e-01 5.750 9.17e-09
## CountryMorocco 1.312e+01 5.969e-01 21.975 < 2e-16
## CountryMozambique -2.707e+00 5.969e-01 -4.535 5.82e-06
## CountryMyanmar 8.166e+00 5.969e-01 13.682 < 2e-16
## CountryNamibia -1.196e+01 5.969e-01 -20.031 < 2e-16
## CountryNepal 5.519e+00 5.969e-01 9.246 < 2e-16
## CountryNetherlands 3.504e+00 5.969e-01 5.870 4.48e-09
## CountryNewCaledonia -3.982e+00 5.969e-01 -6.671 2.66e-11
## CountryNewZealand 2.208e+00 5.969e-01 3.700 0.000217
## CountryNicaragua 1.368e+01 5.969e-01 22.924 < 2e-16
## CountryNiger -2.223e+00 5.969e-01 -3.724 0.000197
## CountryNigeria -3.348e+00 5.969e-01 -5.609 2.09e-08
## CountryNorway 3.722e+00 5.969e-01 6.235 4.68e-10
## CountryOman -1.005e+01 5.969e-01 -16.836 < 2e-16
## CountryPakistan 9.624e+00 5.969e-01 16.124 < 2e-16
## CountryPalau -6.094e+00 1.635e+00 -3.728 0.000194
## CountryPanama 2.351e+00 5.969e-01 3.938 8.26e-05
## CountryPapuaNewGuinea 4.837e+00 5.969e-01 8.104 5.89e-16
## CountryParaguay -7.107e-01 5.969e-01 -1.191 0.233823
## CountryPeru -5.676e+00 5.969e-01 -9.510 < 2e-16
## CountryPhilippines 1.520e+01 5.969e-01 25.464 < 2e-16
## CountryPoland -1.232e+00 5.969e-01 -2.065 0.038991
## CountryPortugal -5.210e-01 5.969e-01 -0.873 0.382765
## CountryPuertoRico 1.133e+00 5.969e-01 1.899 0.057620
## CountryQatar -1.855e-01 5.969e-01 -0.311 0.755964
## CountryRomania 1.166e+00 5.969e-01 1.953 0.050793
## CountryRussianFederation -1.309e+00 5.969e-01 -2.194 0.028275
## CountryRwanda -2.745e-01 5.969e-01 -0.460 0.645586
## CountrySamoa -5.712e+00 5.969e-01 -9.570 < 2e-16
## CountrySanMarino 3.464e+00 8.958e-01 3.867 0.000111
## CountrySaoTomeandPrincipe 1.175e+01 5.969e-01 19.681 < 2e-16
## CountrySaudiArabia -8.842e+00 5.969e-01 -14.815 < 2e-16
## CountrySenegal 4.990e+00 5.969e-01 8.361 < 2e-16
## CountrySerbia -9.516e-01 8.565e-01 -1.111 0.266580
## CountrySeychelles -6.011e+00 8.563e-01 -7.020 2.36e-12
## CountrySierraLeone -7.999e+00 5.969e-01 -13.402 < 2e-16
## CountrySingapore 1.019e+00 5.969e-01 1.707 0.087891
## CountrySlovakRepublic -1.201e+00 5.969e-01 -2.013 0.044183
## CountrySlovenia -6.569e-02 5.969e-01 -0.110 0.912366
## CountrySolomonIslands 1.032e+01 5.969e-01 17.295 < 2e-16
## CountrySomalia -7.696e-01 5.969e-01 -1.289 0.197316
## CountrySouthAfrica -1.309e+01 5.969e-01 -21.924 < 2e-16
## CountrySouthSudan -3.910e+00 5.969e-01 -6.551 5.99e-11
## CountrySpain 3.120e+00 5.969e-01 5.228 1.75e-07
## CountrySriLanka 1.991e+01 5.969e-01 33.364 < 2e-16
## CountryStKittsandNevis -6.731e+00 1.474e+00 -4.566 5.03e-06
## CountryStLucia -1.656e-01 5.969e-01 -0.278 0.781393
## CountryStMaartenDutch -5.253e+00 1.635e+00 -3.212 0.001320
## CountryStMartinFrench -8.879e-02 6.870e-01 -0.129 0.897165
## CountryStVincentandtheGrenadines -6.963e-01 5.969e-01 -1.167 0.243373
## CountrySudan 6.957e+00 5.969e-01 11.655 < 2e-16
## CountrySuriname -2.468e+00 5.969e-01 -4.135 3.58e-05
## CountrySwaziland 2.194e+00 5.969e-01 3.676 0.000238
## CountrySweden 4.206e+00 5.969e-01 7.046 1.95e-12
## CountrySwitzerland 3.918e+00 5.969e-01 6.564 5.49e-11
## CountrySyrianArabRepublic 1.788e+01 5.969e-01 29.957 < 2e-16
## CountryTajikistan 1.411e+01 5.969e-01 23.634 < 2e-16
## CountryTanzania 4.107e+00 5.969e-01 6.880 6.31e-12
## CountryThailand -2.285e+00 5.969e-01 -3.829 0.000130
## CountryTimorLeste -2.688e-01 5.969e-01 -0.450 0.652498
## CountryTogo 4.897e+00 5.969e-01 8.205 2.57e-16
## CountryTonga -5.050e-01 5.969e-01 -0.846 0.397552
## CountryTrinidadandTobago -5.866e+00 5.969e-01 -9.829 < 2e-16
## CountryTunisia 1.503e+01 5.969e-01 25.174 < 2e-16
## CountryTurkey -6.569e+00 5.969e-01 -11.005 < 2e-16
## CountryTurkmenistan -7.341e+00 5.969e-01 -12.300 < 2e-16
## CountryUganda 1.684e+00 5.969e-01 2.822 0.004785
## CountryUkraine 2.048e+01 5.969e-01 34.312 < 2e-16
## CountryUnitedArabEmirates -4.248e+00 5.969e-01 -7.117 1.17e-12
## CountryUnitedKingdom 2.301e+00 5.969e-01 3.854 0.000117
## CountryUnitedStates 1.308e+00 5.969e-01 2.192 0.028418
## CountryUruguay -1.068e+00 5.969e-01 -1.790 0.073528
## CountryUzbekistan 1.653e+01 5.969e-01 27.696 < 2e-16
## CountryVanuatu 1.236e+01 5.969e-01 20.715 < 2e-16
## CountryVenezuela NA NA NA NA
## CountryVietnam 1.955e+01 5.969e-01 32.750 < 2e-16
## CountryVirginIslandsUS NA NA NA NA
## CountryWestBankandGaza 1.764e+01 7.501e-01 23.521 < 2e-16
## CountryYemen 3.866e+00 5.969e-01 6.477 9.75e-11
## CountryZambia NA NA NA NA
## CountryZimbabwe 6.001e+00 5.969e-01 10.055 < 2e-16
##
## (Intercept) ***
## Year ***
## IncomeLowerMiddleIncome **
## IncomeUpperMiddleIncome ***
## IncomeHighIncome ***
## CountryAlbania ***
## CountryAlgeria ***
## CountryAngola ***
## CountryAntiguaandBarbuda ***
## CountryArgentina ***
## CountryArmenia ***
## CountryAruba .
## CountryAustralia ***
## CountryAustria **
## CountryAzerbaijan ***
## CountryBahamas ***
## CountryBahrain ***
## CountryBangladesh ***
## CountryBarbados ***
## CountryBelarus .
## CountryBelgium ***
## CountryBelize
## CountryBenin ***
## CountryBermuda
## CountryBhutan ***
## CountryBolivia ***
## CountryBosniaandHerzegovina **
## CountryBotswana ***
## CountryBrazil ***
## CountryBruneiDarussalam *
## CountryBulgaria ***
## CountryBurkinaFaso
## CountryBurundi *
## CountryCaboVerde ***
## CountryCambodia
## CountryCameroon **
## CountryCanada ***
## CountryCentralAfricanRepublic *
## CountryChad **
## CountryChannelIslands **
## CountryChile ***
## CountryChina ***
## CountryColombia ***
## CountryComoros ***
## CountryCongoDemRep *
## CountryCongoRep ***
## CountryCostaRica ***
## CountryCote_dIvoire .
## CountryCroatia ***
## CountryCuba ***
## CountryCuracao **
## CountryCyprus ***
## CountryCzechRepublic
## CountryDenmark ***
## CountryDjibouti ***
## CountryDominica *
## CountryDominicanRepublic ***
## CountryEcuador ***
## CountryEgypt ***
## CountryElSalvador ***
## CountryEquatorialGuinea ***
## CountryEritrea **
## CountryEstonia ***
## CountryEthiopia *
## CountryFaroeIslands *
## CountryFiji ***
## CountryFinland *
## CountryFrance ***
## CountryFrenchPolynesia ***
## CountryGabon ***
## CountryGambia *
## CountryGeorgia ***
## CountryGermany **
## CountryGhana ***
## CountryGreece ***
## CountryGreenland ***
## CountryGrenada *
## CountryGuam ***
## CountryGuatemala ***
## CountryGuinea .
## CountryGuineaBissau
## CountryGuyana ***
## CountryHaiti ***
## CountryHonduras ***
## CountryHongKong ***
## CountryHungary ***
## CountryIceland ***
## CountryIndia ***
## CountryIndonesia ***
## CountryIran ***
## CountryIraq ***
## CountryIreland *
## CountryIsleofMan
## CountryIsrael ***
## CountryItaly ***
## CountryJamaica ***
## CountryJapan ***
## CountryJordan ***
## CountryKazakhstan ***
## CountryKenya ***
## CountryKiribati ***
## CountryKoreaDemPeoplesRep ***
## CountryKoreaRep ***
## CountryKosovo ***
## CountryKuwait ***
## CountryKyrgyzRepublic ***
## CountryLaoPDR ***
## CountryLatvia ***
## CountryLebanon **
## CountryLesotho ***
## CountryLiberia
## CountryLibya ***
## CountryLiechtenstein *
## CountryLithuania **
## CountryLuxembourg *
## CountryMacao
## CountryMacedonia .
## CountryMadagascar ***
## CountryMalawi **
## CountryMalaysia
## CountryMaldives ***
## CountryMali ***
## CountryMalta **
## CountryMarshallIslands
## CountryMauritania ***
## CountryMauritius
## CountryMexico
## CountryMicronesia ***
## CountryMoldova ***
## CountryMongolia ***
## CountryMontenegro ***
## CountryMorocco ***
## CountryMozambique ***
## CountryMyanmar ***
## CountryNamibia ***
## CountryNepal ***
## CountryNetherlands ***
## CountryNewCaledonia ***
## CountryNewZealand ***
## CountryNicaragua ***
## CountryNiger ***
## CountryNigeria ***
## CountryNorway ***
## CountryOman ***
## CountryPakistan ***
## CountryPalau ***
## CountryPanama ***
## CountryPapuaNewGuinea ***
## CountryParaguay
## CountryPeru ***
## CountryPhilippines ***
## CountryPoland *
## CountryPortugal
## CountryPuertoRico .
## CountryQatar
## CountryRomania .
## CountryRussianFederation *
## CountryRwanda
## CountrySamoa ***
## CountrySanMarino ***
## CountrySaoTomeandPrincipe ***
## CountrySaudiArabia ***
## CountrySenegal ***
## CountrySerbia
## CountrySeychelles ***
## CountrySierraLeone ***
## CountrySingapore .
## CountrySlovakRepublic *
## CountrySlovenia
## CountrySolomonIslands ***
## CountrySomalia
## CountrySouthAfrica ***
## CountrySouthSudan ***
## CountrySpain ***
## CountrySriLanka ***
## CountryStKittsandNevis ***
## CountryStLucia
## CountryStMaartenDutch **
## CountryStMartinFrench
## CountryStVincentandtheGrenadines
## CountrySudan ***
## CountrySuriname ***
## CountrySwaziland ***
## CountrySweden ***
## CountrySwitzerland ***
## CountrySyrianArabRepublic ***
## CountryTajikistan ***
## CountryTanzania ***
## CountryThailand ***
## CountryTimorLeste
## CountryTogo ***
## CountryTonga
## CountryTrinidadandTobago ***
## CountryTunisia ***
## CountryTurkey ***
## CountryTurkmenistan ***
## CountryUganda **
## CountryUkraine ***
## CountryUnitedArabEmirates ***
## CountryUnitedKingdom ***
## CountryUnitedStates *
## CountryUruguay .
## CountryUzbekistan ***
## CountryVanuatu ***
## CountryVenezuela
## CountryVietnam ***
## CountryVirginIslandsUS
## CountryWestBankandGaza ***
## CountryYemen ***
## CountryZambia
## CountryZimbabwe ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.158 on 10714 degrees of freedom
## (1230 observations deleted due to missingness)
## Multiple R-squared: 0.9261, Adjusted R-squared: 0.9247
## F-statistic: 648.8 on 207 and 10714 DF, p-value: < 2.2e-16
plot(lifemodel3, which = 1:2)
To evaluate time series models, we needed to reduce the full data set to a single series made up of the year and the life expectancy associated with that year. Based on the regressions above, summarizing a macro-level ‘average’ life expectancy at a world level seems uninformative, given the wide differences in life expectancy across countries, regions, and income levels seen in the exploratory data analysis and basic regression modeling. We therefore decided to focus on country-level data for application of traditional time series models.
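library(dplyr)   # assumed source of filter(); the original setup chunk is not shown
library(mosaic)  # assumed source of tally(); also loads ggformula for gf_point()
China <- filter(Life, Country == "China")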
tally(~Income, data=China)
## Income
## LowIncome LowerMiddleIncome UpperMiddleIncome HighIncome
## 0 0 56 0
gf_point(LifeExpectancy ~ Year, data = China, color = "#5e4fa2", title = "Life Expectancy at Year of Birth", sub = "Country = China", xlab = "Birth Year", ylab = "Average Life Expectancy") + theme_bw()
Chinats <- ts(China$LifeExpectancy, start = c(1960))
USA <- filter(Life, Country == "UnitedStates")
tally(~Income, data=USA)
## Income
## LowIncome LowerMiddleIncome UpperMiddleIncome HighIncome
## 0 0 0 56
gf_point(LifeExpectancy ~ Year, data = USA, color = "#5e4fa2", title = "Life Expectancy at Year of Birth", sub = "Country = USA", xlab = "Birth Year", ylab = "Average Life Expectancy") + theme_bw()
USAts <- ts(USA$LifeExpectancy, start = c(1960))
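The `ts` objects above are annual series starting in 1960, and the Holt fits in this report train on `window(..., end = 2000)` with `h = 15`. The sketch below shows how that split works on a placeholder series with made-up values; only the dates and lengths matter here.

```r
# Placeholder annual series for 1960-2015 (56 values); the numbers are made up
y <- ts(seq(60, 75, length.out = 56), start = 1960)

train <- window(y, end = 2000)     # 1960-2000, used for fitting
test  <- window(y, start = 2001)   # 2001-2015, held out for comparison

print(c(train = length(train), test = length(test)))
print(range(time(test)))
```

The training window holds 41 observations and the hold-out holds 15, which is why a forecast horizon of 15 lines up with the last 15 reported years.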
We fit and visualize a simple exponential smoothing (SES) model for the China and USA time series. The SES model fits the China life expectancy data extremely well, with the fitted values tightly hugging the reported life expectancy. For the USA data the fit is less satisfactory: given the rather jagged reported life expectancy, the fitted values are generally lower than, and lag a bit behind, the actual data.
Given that both data sets have a readily discernible linear trend, we next fit Holt’s linear trend model. As seen below, when we window the data to 1960 to 2000 and forecast the last 15 periods, this method fits the USA life expectancy far better than it does the China data.
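library(forecast)  # assumed source of ses(), holt(), ets(), tsCV() and autoplot() for ts objects
library(ggplot2)   # assumed source of theme_bw(), xlab() and ylab()
expChina <- ses(Chinats, h=15)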
autoplot(Chinats) + autolayer(fitted(expChina), series = "Fitted") + ylab("Life Expectancy - China") + xlab("Year") + theme_bw()
expUSA <- ses(USAts, h=10)
autoplot(USAts) + autolayer(fitted(expUSA), series = "Fitted")+ ylab("Life Expectancy - USA") + xlab("Year") + theme_bw()
holtChina <- holt(window(Chinats, end=2000), h=15)
autoplot(Chinats) + autolayer(holtChina, series = "Holts Method", PI=FALSE) + ylab("Life Expectancy - China") + xlab("Year") + theme_bw()
holtUSA <- holt(window(USAts, end=2000), h=15)
autoplot(USAts) + autolayer(holtUSA, series = "Holts Method", PI=FALSE) + ylab("Life Expectancy - USA") + xlab("Year") + theme_bw()
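The observation that the exponential smoothing fit sits below and behind a rising series follows from how the method works: each fitted value is a weighted average of past observations only. The hand-rolled sketch below (not `forecast::ses`; the function name `ses_fitted`, the smoothing weight, and the data are all illustrative assumptions) makes the lag visible on a made-up, steadily rising series.

```r
# Hand-rolled simple exponential smoothing (illustrative; not forecast::ses)
ses_fitted <- function(y, alpha, l0 = y[1]) {
  level <- l0
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    fitted[t] <- level                           # one-step forecast made before seeing y[t]
    level <- alpha * y[t] + (1 - alpha) * level  # update the level with the new observation
  }
  fitted
}

y <- 70 + 0.2 * (0:19)             # made-up, steadily rising series
fit <- ses_fitted(y, alpha = 0.8)
lag_gap <- y - fit                 # positive values mean the fit sits below the data
print(round(tail(lag_gap, 3), 3))
```

For a series that rises by a constant amount b each period, the gap settles at b / alpha (0.25 in this example), so the fit never catches up with the trend. That is the motivation for adding an explicit trend term with Holt’s method.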
However, the continued growth in life expectancy forecast by these models is not realistic. With this in mind, we next fit a damped trend model. The forecasts are quite sensitive to the damping parameter phi, which is typically set between 0.8 and 0.98 (at phi = 1 the model is identical to the undamped Holt’s method). For the China data set, a damping parameter of 0.95 provides an almost perfect forecast for the last 15 periods in the data set; for the USA data, we set the damping parameter at 0.98, which improves on the undamped method but still does not track the actual data closely over those last 15 periods.
dampChina <- holt(window(Chinats, end=2000), damped = TRUE, phi = .95, h=15)
autoplot(Chinats) + autolayer(dampChina, series = "Damped Holts Method", PI=FALSE) + ylab("Life Expectancy - China") + xlab("Year") + theme_bw()
dampUSA <- holt(window(USAts, end=2000), damped = TRUE, phi = .98, h=15)
autoplot(USAts) + autolayer(dampUSA, series = "Damped Holts Method", PI=FALSE) + ylab("Life Expectancy - USA") + xlab("Year") + theme_bw()
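The sensitivity to phi can be seen directly from the damped trend forecast, which adds phi + phi^2 + ... + phi^h times the final slope to the final level. The sketch below uses made-up final level and slope values (not taken from the fitted models) and a hypothetical helper `damped_path` to compare the two phi values used above with the undamped case.

```r
# Point forecasts from a (damped) Holt model, given a final level and slope.
# phi = 1 gives Holt's undamped linear trend.
damped_path <- function(level, slope, phi, h) {
  level + cumsum(phi^seq_len(h)) * slope
}

level <- 71; slope <- 0.3          # made-up final states, for illustration only
phis <- c(0.95, 0.98, 1)
h15 <- sapply(phis, function(phi) damped_path(level, slope, phi, 15)[15])
names(h15) <- paste0("phi=", phis)
print(round(h15, 2))

# Long-run ceiling on the total increase: slope * phi / (1 - phi)
ceiling_gain <- slope * c(0.95, 0.98) / (1 - c(0.95, 0.98))
print(ceiling_gain)
```

The factor phi / (1 - phi) is 19 at phi = 0.95 and 49 at phi = 0.98, so a seemingly small change in phi more than doubles the long-run increase the model allows.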
When we compare the time series modeling methods for the China data set, both exponential smoothing and the damped Holt’s method (with phi = 0.95) fit the data nicely and provide reasonable forecasts. The damped Holt model has the lowest RMSE (0.364, versus 0.381 for undamped Holt and 0.819 for exponential smoothing) when we apply time series cross-validation to compare one-step accuracy for the three models. Note that the cross-validation call does not fix phi, so the damping parameter is estimated at each step rather than held at 0.95.
For the USA data set, the exponential smoothing model tracks the observed data most closely, followed by Holt’s undamped method. Although there is a noticeable linear trend, neither Holt model forecasts particularly well: the damped model falls far below the actual data for the final 15 periods, and the undamped model projects strong positive linear growth that is simply not realistic, given that people can’t live forever, even in the United States. Based on time series cross-validation of one-step forecast accuracy, the undamped Holt model has the lowest RMSE for the USA data set (0.267, versus 0.270 for damped Holt and 0.295 for exponential smoothing).
c1 <- tsCV(Chinats, ses, h=1)
c2 <- tsCV(Chinats, holt, h=1)
c3 <- tsCV(Chinats, holt, damped = TRUE, h=1)
RMSEChinaexp <- sqrt(mean(c1^2,na.rm=TRUE))
RMSEChinaHolt <- sqrt(mean(c2^2,na.rm=TRUE))
RMSEChinaHoltDamped <- sqrt(mean(c3^2,na.rm=TRUE))
cbind(RMSEChinaexp, RMSEChinaHolt, RMSEChinaHoltDamped)
## RMSEChinaexp RMSEChinaHolt RMSEChinaHoltDamped
## [1,] 0.8194999 0.3809039 0.3637784
autoplot(Chinats) + autolayer(fitted(expChina), series = "Exponential Smoothing") + autolayer(holtChina, series = "Holts Method", PI=FALSE) + autolayer(dampChina, series = "Damped Holts Method", PI=FALSE) + ylab("Life Expectancy - China") + xlab("Year") + theme_bw()
u1 <- tsCV(USAts, ses, h=1)
u2 <- tsCV(USAts, holt, h=1)
u3 <- tsCV(USAts, holt, damped = TRUE, h=1)
RMSEUSAexp <- sqrt(mean(u1^2,na.rm=TRUE))
RMSEUSAHolt <- sqrt(mean(u2^2,na.rm=TRUE))
RMSEUSAHoltDamped <- sqrt(mean(u3^2,na.rm=TRUE))
cbind(RMSEUSAexp, RMSEUSAHolt, RMSEUSAHoltDamped)
## RMSEUSAexp RMSEUSAHolt RMSEUSAHoltDamped
## [1,] 0.2952001 0.2667068 0.2695103
autoplot(USAts) + autolayer(fitted(expUSA), series = "Exponential Smoothing") + autolayer(holtUSA, series = "Holts Method", PI=FALSE) + autolayer(dampUSA, series = "Damped Holts Method", PI=FALSE) + ylab("Life Expectancy - USA") + xlab("Year") + theme_bw()
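The cross-validation used above refits the model on an expanding window and scores each one-step forecast against the next observation. The sketch below imitates that idea in base R with two simple stand-in forecasters (naive and drift) on a made-up series; `rolling_errors`, `naive_fn`, and `drift_fn` are hypothetical names, and this is an illustration of the rolling-origin idea rather than a reimplementation of `tsCV`.

```r
# Rolling-origin one-step errors, in the spirit of forecast::tsCV (illustrative only)
rolling_errors <- function(y, forecast_fn) {
  n <- length(y)
  errs <- rep(NA_real_, n)
  for (t in 2:(n - 1)) {                       # need at least two points to fit
    errs[t] <- y[t + 1] - forecast_fn(y[1:t])  # actual minus one-step forecast
  }
  errs
}

# Stand-in forecasters: last value, and last value plus the average past change
naive_fn <- function(x) x[length(x)]
drift_fn <- function(x) x[length(x)] + (x[length(x)] - x[1]) / (length(x) - 1)

rmse <- function(e) sqrt(mean(e^2, na.rm = TRUE))

y <- c(70.0, 70.3, 70.5, 70.9, 71.1, 71.2, 71.6, 71.9)  # made-up series
print(c(naive = rmse(rolling_errors(y, naive_fn)),
        drift = rmse(rolling_errors(y, drift_fn))))
```

Because every error comes from a forecast of data the model has not yet seen, the resulting RMSE measures out-of-sample accuracy, unlike the training-set error reported in a model summary.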
The ets function, applied to the 1960 to 2000 window, selects a model for the China data set with additive errors, additive trend, and no seasonality (seasonality is not expected, given the annual nature of the data). The training-set RMSE for this model is 0.1588. When we apply the ets framework to the USA data set, we again obtain a model with additive errors, additive trend, and no seasonality. The training-set RMSE for this model is 0.2413.
etsChina <- ets(window(Chinats, end=2000))
summary(etsChina)
## ETS(A,A,N)
##
## Call:
## ets(y = window(Chinats, end = 2000))
##
## Smoothing parameters:
## alpha = 0.9999
## beta = 0.9999
##
## Initial states:
## l = 43.0976
## b = 0.2356
##
## sigma: 0.1671
##
## AIC AICc BIC
## 11.34951 13.06380 19.91737
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.004823887 0.1587653 0.1022683 0.0303117 0.1889158 0.1441593
## ACF1
## Training set 0.8860516
autoplot(Chinats) + autolayer(forecast(etsChina, h=15), PI=FALSE)+ ylab("Life Expectancy - China") + xlab("Year") + theme_bw()
etsUSA <- ets(window(USAts, end=2000))
summary(etsUSA)
## ETS(A,A,N)
##
## Call:
## ets(y = window(USAts, end = 2000))
##
## Smoothing parameters:
## alpha = 0.877
## beta = 1e-04
##
## Initial states:
## l = 69.6309
## b = 0.1713
##
## sigma: 0.254
##
## AIC AICc BIC
## 45.66362 47.37790 54.23148
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -2.799391e-05 0.2412639 0.1851057 -0.0005948967 0.254948
## MASE ACF1
## Training set 0.7726479 -0.00889037
autoplot(USAts) + autolayer(forecast(etsUSA, h=15), PI=FALSE)+ ylab("Life Expectancy - USA") + xlab("Year") + theme_bw()
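For reference, the ETS(A,A,N) form selected for both series can be written out as below. This is the standard textbook state-space formulation of an additive-error, additive-trend model, stated here on the assumption that `ets` follows it; the symbols for level, slope, and error are notation introduced for this note.

```latex
\begin{aligned}
y_t    &= \ell_{t-1} + b_{t-1} + \varepsilon_t \\
\ell_t &= \ell_{t-1} + b_{t-1} + \alpha\,\varepsilon_t \\
b_t    &= b_{t-1} + \beta\,\varepsilon_t \\
\hat{y}_{t+h \mid t} &= \ell_t + h\,b_t
\end{aligned}
```

Read against the summaries above, an alpha near 1 (China: 0.9999, USA: 0.877) means the level follows the most recent observation closely. A beta near 0 (USA: 1e-04) means the slope stays essentially at its initial value of about 0.17 years per year, while China's beta of 0.9999 lets the slope change freely with each new error.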
For these data sets, the ets framework generates the models with the lowest reported RMSE for both China and the USA, although these are training-set errors on the 1960 to 2000 window and so are not directly comparable with the cross-validated RMSEs above. From a macro, world-view perspective, traditional multiple linear regression modelling provides useful insight into some of the economic factors related to life expectancy writ large.