Set working directory & read in data:

setwd("~/Documents/geog6000/lab06")
gap <- read.csv("../datafiles/gapminderData5.csv")

Use tapply() to estimate the mean life expectanct by country

tapply(gap$lifeExp, gap$country, mean)
##              Afghanistan                  Albania                  Algeria 
##                 37.47883                 68.43292                 59.03017 
##                   Angola                Argentina                Australia 
##                 37.88350                 69.06042                 74.66292 
##                  Austria                  Bahrain               Bangladesh 
##                 73.10325                 65.60567                 49.83408 
##                  Belgium                    Benin                  Bolivia 
##                 73.64175                 48.77992                 52.50458 
##   Bosnia and Herzegovina                 Botswana                   Brazil 
##                 67.70783                 54.59750                 62.23950 
##                 Bulgaria             Burkina Faso                  Burundi 
##                 69.74375                 44.69400                 44.81733 
##                 Cambodia                 Cameroon                   Canada 
##                 47.90275                 48.12850                 74.90275 
## Central African Republic                     Chad                    Chile 
##                 43.86692                 46.77358                 67.43092 
##                    China                 Colombia                  Comoros 
##                 61.78514                 63.89775                 52.38175 
##         Congo, Dem. Rep.              Congo, Rep.               Costa Rica 
##                 44.54375                 52.50192                 70.18142 
##            Cote d'Ivoire                  Croatia                     Cuba 
##                 48.43617                 70.05592                 71.04508 
##           Czech Republic                  Denmark                 Djibouti 
##                 71.51050                 74.37017                 46.38075 
##       Dominican Republic                  Ecuador                    Egypt 
##                 61.55450                 62.81683                 56.24300 
##              El Salvador        Equatorial Guinea                  Eritrea 
##                 59.63333                 42.96000                 45.99925 
##                 Ethiopia                  Finland                   France 
##                 44.47575                 72.99192                 74.34892 
##                    Gabon                   Gambia                  Germany 
##                 51.22050                 44.40058                 73.44442 
##                    Ghana                   Greece                Guatemala 
##                 52.34067                 73.73317                 56.72942 
##                   Guinea            Guinea-Bissau                    Haiti 
##                 43.23983                 39.21025                 50.16525 
##                 Honduras         Hong Kong, China                  Hungary 
##                 57.92083                 73.49283                 69.39317 
##                  Iceland                    India                Indonesia 
##                 76.51142                 53.16608                 54.33575 
##                     Iran                     Iraq                  Ireland 
##                 58.63658                 56.58175                 73.01725 
##                   Israel                    Italy                  Jamaica 
##                 73.64583                 74.01383                 68.74933 
##                    Japan                   Jordan                    Kenya 
##                 74.82692                 59.78642                 52.68100 
##         Korea, Dem. Rep.              Korea, Rep.                   Kuwait 
##                 63.60733                 65.00100                 68.92233 
##                  Lebanon                  Lesotho                  Liberia 
##                 65.86567                 50.00708                 42.47625 
##                    Libya               Madagascar                   Malawi 
##                 59.30417                 47.77058                 43.35158 
##                 Malaysia                     Mali               Mauritania 
##                 64.27958                 43.41350                 52.30208 
##                Mauritius                   Mexico                 Mongolia 
##                 64.95325                 65.40883                 55.89033 
##               Montenegro                  Morocco               Mozambique 
##                 70.29917                 57.60883                 40.37950 
##                  Myanmar                  Namibia                    Nepal 
##                 53.32167                 53.49133                 48.98633 
##              Netherlands              New Zealand                Nicaragua 
##                 75.64850                 73.98950                 58.34942 
##                    Niger                  Nigeria                   Norway 
##                 44.55867                 43.58133                 75.84300 
##                     Oman                 Pakistan                   Panama 
##                 58.44267                 54.88225                 67.80175 
##                 Paraguay                     Peru              Philippines 
##                 66.80908                 58.85933                 60.96725 
##                   Poland                 Portugal              Puerto Rico 
##                 70.17692                 70.41983                 72.73933 
##                  Reunion                  Romania                   Rwanda 
##                 66.64425                 68.29067                 41.48158 
##    Sao Tome and Principe             Saudi Arabia                  Senegal 
##                 57.89633                 58.67875                 50.62592 
##                   Serbia             Sierra Leone                Singapore 
##                 68.55100                 36.76917                 71.22025 
##          Slovak Republic                 Slovenia                  Somalia 
##                 70.69608                 71.60075                 40.98867 
##             South Africa                    Spain                Sri Lanka 
##                 53.99317                 74.20342                 66.52608 
##                    Sudan                Swaziland                   Sweden 
##                 48.40050                 49.00242                 76.17700 
##              Switzerland                    Syria                   Taiwan 
##                 75.56508                 61.34617                 70.33667 
##                 Tanzania                 Thailand                     Togo 
##                 47.91233                 62.20025                 51.49875 
##      Trinidad and Tobago                  Tunisia                   Turkey 
##                 66.82800                 60.72100                 59.69642 
##                   Uganda           United Kingdom            United States 
##                 47.61883                 73.92258                 73.47850 
##                  Uruguay                Venezuela                  Vietnam 
##                 70.78158                 66.58067                 57.47950 
##       West Bank and Gaza              Yemen, Rep.                   Zambia 
##                 60.32867                 46.78042                 45.99633 
##                 Zimbabwe 
##                 52.66317

Scatter plot of life expectancy versus year:

library(ggplot2)
myplot = ggplot(gap,aes(x = year, y = lifeExp, group = country))
myplot = myplot + geom_point() + facet_wrap("country")

Plot will be attached as a pdf.

##Why didn’t this work?

plot(gap\(lifeExp, gap\)year, main = “Life Expectancy & Year”, xlab = “Life Expectancy”, ylab = “Year”)

Plot life expectancy vs year for two countries: Iraq & Italy

gap_sub_iraq <- subset(gap, country == 'Iraq')
plot(gap_sub_iraq$year, gap_sub_iraq$lifeExp,
     xlab = "Year", ylab = "Life Expectancy (yr)", main = "Life Expectancy in Iraq Over Time")

gap_sub_italy <- subset(gap, country == 'Italy')
plot(gap_sub_italy$year, gap_sub_italy$lifeExp,
     xlab = "Year", ylab = "Life Expectancy (yr)", main = "Life Expectancy in Italy Over Time")

Life expectancy as a function of (centered) year: simple linear model

CenYr <- gap$year - 1952

lifeExplm <- lm(lifeExp ~ CenYr, data = gap)
summary(lifeExplm)
## 
## Call:
## lm(formula = lifeExp ~ CenYr, data = gap)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.949  -9.651   1.697  10.335  22.158 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 50.51208    0.53000   95.31   <2e-16 ***
## CenYr        0.32590    0.01632   19.96   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.63 on 1702 degrees of freedom
## Multiple R-squared:  0.1898, Adjusted R-squared:  0.1893 
## F-statistic: 398.6 on 1 and 1702 DF,  p-value: < 2.2e-16

Coefficients & Significance:

Intercept: 50.51208 & <2e-16
CenYr: 0.3259 & <2e-16

Brief Explanation:

The average life expectancy for year “0” (1952 in this case because the mean is centered) is 50.51208 years old. As time goes on, the slope increases at a rate of 0.3259. This means that life expectancy increases by 0.3259 years for each passing year (or negatively if you go back in time). Both of these coefficents have significant p-values meaning that there is a significant relationship between life expectancy and the year.

F-statistic: 398.6
P-value: 2.2e-16

Random Intercept Model

library(nlme)
gap.lme <- lme(lifeExp ~ CenYr, 
               random = ~ 1 | country,
               data = gap)
summary(gap.lme)
## Linear mixed-effects model fit by REML
##  Data: gap 
##        AIC      BIC    logLik
##   9874.263 9896.022 -4933.132
## 
## Random effects:
##  Formula: ~1 | country
##         (Intercept) Residual
## StdDev:    11.09716 3.584196
## 
## Fixed effects: lifeExp ~ CenYr 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 50.51208 0.9454671 1561 53.42553       0
## CenYr        0.32590 0.0050305 1561 64.78581       0
##  Correlation: 
##       (Intr)
## CenYr -0.146
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -6.1691513 -0.4910339  0.1216455  0.6030002  2.4194109 
## 
## Number of Observations: 1704
## Number of Groups: 142

Coefficients & p- values:
50.512, p-value = 0
0.32590, p-value = 0
Standard Deviation:
Intercept: 11.09716
Residual: 3.584196

ICC

VarCorr(gap.lme)
## country = pdLogChol(1) 
##             Variance  StdDev   
## (Intercept) 123.14688 11.097156
## Residual     12.84646  3.584196

ICC:
Total variance: 135.99
ICC/Explained variance:
Intercept: 90.6%
Residual: 9.4%

Does this support the use of a random intercept model?
It is because the explained variation for the intercept is greater than 5%.

Model w/ random intercept & slope

gap.lme2 <- lme(lifeExp ~ CenYr, 
               random = ~ CenYr | country,
               data = gap)
summary(gap.lme2)
## Linear mixed-effects model fit by REML
##  Data: gap 
##        AIC      BIC    logLik
##   8747.277 8779.915 -4367.639
## 
## Random effects:
##  Formula: ~CenYr | country
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev     Corr  
## (Intercept) 12.3027957 (Intr)
## CenYr        0.1582867 -0.434
## Residual     2.1807917       
## 
## Fixed effects: lifeExp ~ CenYr 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 50.51208 1.0371995 1561 48.70045       0
## CenYr        0.32590 0.0136312 1561 23.90865       0
##  Correlation: 
##       (Intr)
## CenYr -0.439
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -8.08192929 -0.34891405  0.02690066  0.39742349  4.67388085 
## 
## Number of Observations: 1704
## Number of Groups: 142

AIC values
Which model is best?

AIC(lifeExplm, gap.lme, gap.lme2)
## Warning in AIC.default(lifeExplm, gap.lme, gap.lme2): models are not all
## fitted to the same number of observations
##           df       AIC
## lifeExplm  3 13201.732
## gap.lme    4  9874.263
## gap.lme2   6  8747.277

The model with a random intercept and slope is the best because it has the lowest AIC of the 3 models.