Stat 545A Homework 3

Start by loading the libraries and the data.

library(plyr)
library(xtable)
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)

Maximum and minimum of GDP per capita (“wide” format) sorted by minimum GDP:

maxMinGdpByCont <- ddply(gDat, ~continent, summarize, maxGdp = max(gdpPercap), 
    minGdp = min(gdpPercap))
# index the column directly instead of relying on attach()/detach()
sortedByMinGdpTbl <- xtable(maxMinGdpByCont[order(maxMinGdpByCont$minGdp), ])
print(sortedByMinGdpTbl, type = "html", include.rownames = FALSE)
continent maxGdp minGdp
Africa 21951.21 241.17
Asia 113523.13 331.00
Europe 49357.19 973.53
Americas 42951.65 1201.64
Oceania 34435.37 10039.60

Maximum and minimum of GDP per capita (“wide” format) sorted by maximum GDP:

sortedByMaxGdpTbl <- xtable(maxMinGdpByCont[order(maxMinGdpByCont$maxGdp), ])
print(sortedByMaxGdpTbl, type = "html", include.rownames = FALSE)
continent maxGdp minGdp
Africa 21951.21 241.17
Oceania 34435.37 10039.60
Americas 42951.65 1201.64
Europe 49357.19 973.53
Asia 113523.13 331.00

Africa has both the lowest maximum and the lowest minimum GDP per capita. Oceania, with its small number of entries, has an unremarkable maximum, but its minimum (10039.60) is by far the highest of any continent. Asia evidently has a huge discrepancy between its minimum and maximum, likely due to having many entries (the opposite of Oceania). The Americas and Europe have fairly similar maximum and minimum GDP per capita and sit next to each other in both rankings.

Maximum and minimum of GDP per capita for all continents (“tall” format):

maxMinGdpByContTall <- ddply(gDat, ~continent, summarize, factor = c("min", 
    "max"), GDP = c(min = min(gdpPercap), max = max(gdpPercap)))
tallGdpTbl <- xtable(maxMinGdpByContTall)
print(tallGdpTbl, type = "html", include.rownames = FALSE)
continent factor GDP
Africa min 241.17
Africa max 21951.21
Americas min 1201.64
Americas max 42951.65
Asia min 331.00
Asia max 113523.13
Europe min 973.53
Europe max 49357.19
Oceania min 10039.60
Oceania max 34435.37
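
To make the wide/tall distinction concrete, here is a minimal base-R sketch that stacks a wide table into the tall layout by hand. The two rows are copied from the tables above; the names `wide`, `tall`, and `stat` are only illustrative choices and are not part of the original analysis.

```r
# Two rows copied from the wide table above (illustrative subset)
wide <- data.frame(continent = c("Africa", "Asia"),
                   minGdp = c(241.17, 331.00),
                   maxGdp = c(21951.21, 113523.13),
                   stringsAsFactors = FALSE)

# Stack min and max into a single GDP column, one row per continent/statistic.
# t() puts the statistics in rows, so as.vector() reads min, max per continent.
tall <- data.frame(continent = rep(wide$continent, each = 2),
                   stat = rep(c("min", "max"), times = nrow(wide)),
                   GDP = as.vector(t(as.matrix(wide[, c("minGdp", "maxGdp")]))),
                   stringsAsFactors = FALSE)
print(tall)
```

Each continent contributes one row per statistic, which is the same shape that the ddply call above produces.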

Spread of GDP per capita within the continents:

spreadStats <- ddply(gDat, ~continent, summarize, gdpStdDev = sd(gdpPercap), 
    gdpVar = var(gdpPercap), gdpIQR = IQR(gdpPercap))
# index the column directly instead of relying on attach()/detach()
spreadTbl <- xtable(spreadStats[order(spreadStats$gdpStdDev), ])
print(spreadTbl, type = "html", include.rownames = FALSE)
continent gdpStdDev gdpVar gdpIQR
Africa 2827.93 7997187.31 1616.17
Oceania 6358.98 40436668.87 8072.26
Americas 6396.76 40918591.10 4402.43
Europe 9355.21 87520019.60 13248.30
Asia 14045.37 197272505.85 7492.26

Africa, with its overall low GDP per capita, has the smallest spread on every measure, and Asia clearly has the largest standard deviation and variance (although Europe has the largest IQR). Unexpectedly, Oceania and the Americas have similar standard deviations despite having quite different minimum and maximum GDP per capita. Similarly, Europe has a noticeably larger spread than the Americas despite having similar minimum and maximum GDP per capita.
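
One possible reason the three measures can rank continents differently is that the standard deviation and variance react strongly to a few extreme values, while the IQR only looks at the middle half of the data. The sketch below uses made-up toy vectors (`a` and `b` are not Gapminder data) to show the effect; it is an illustration of the statistics, not an explanation confirmed for this data set.

```r
# Toy vectors: b is a with its largest value replaced by an extreme one
a <- c(1, 2, 3, 4, 5)
b <- c(1, 2, 3, 4, 50)

# The IQR ignores the tails, so it is the same for both vectors
iqrA <- IQR(a)
iqrB <- IQR(b)

# The standard deviation is inflated by the single extreme value
sdA <- sd(a)
sdB <- sd(b)

# The variance is just the square of the standard deviation
varMatchesSd <- isTRUE(all.equal(var(b), sd(b)^2))

print(c(iqrA = iqrA, iqrB = iqrB, sdA = sdA, sdB = sdB))
```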

Trimmed mean of life expectancy for different years:

I picked a trim fraction of 0.1, which drops the lowest and highest 10% of observations in each year before averaging.

lifeExpMeanByYear <- ddply(gDat, ~year, summarize, meanLifeExp = mean(lifeExp, 
    trim = 0.1))
lifeExpMeanByYearTbl <- xtable(lifeExpMeanByYear)
print(lifeExpMeanByYearTbl, type = "html", include.rownames = FALSE)
year meanLifeExp
1952 48.58
1957 51.27
1962 53.58
1967 55.87
1972 58.01
1977 60.10
1982 62.12
1987 63.92
1992 65.19
1997 66.02
2002 66.72
2007 68.11
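
As a quick check of what `trim = 0.1` does, here is a small sketch on a made-up vector (`x` is toy data, not life expectancies). With ten values, a trim fraction of 0.1 removes one value from each end before the mean is taken.

```r
# Ten toy values with one extreme observation at the top
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

plainMean <- mean(x)                 # uses all ten values
trimmedMean <- mean(x, trim = 0.1)   # drops the smallest and largest value first

# The trimmed mean equals the plain mean of the middle eight values
middleMean <- mean(sort(x)[2:9])

print(c(plain = plainMean, trimmed = trimmedMean, middle = middleMean))
```

The extreme value pulls the plain mean well above the bulk of the data, while the trimmed mean stays in the middle of it.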

Life expectancy changing over time on different continents:

I didn't trim this time.

lifeExpMeanByYearCont <- ddply(gDat, ~continent ~ year, summarize, meanLifeExp = mean(lifeExp), 
    medianLifeExp = median(lifeExp))
lifeExpMeanByYCTbl <- xtable(lifeExpMeanByYearCont)
print(lifeExpMeanByYCTbl, type = "html", include.rownames = FALSE)
continent year meanLifeExp medianLifeExp
Africa 1952 39.14 38.83
Africa 1957 41.27 40.59
Africa 1962 43.32 42.63
Africa 1967 45.33 44.70
Africa 1972 47.45 47.03
Africa 1977 49.58 49.27
Africa 1982 51.59 50.76
Africa 1987 53.34 51.64
Africa 1992 53.63 52.43
Africa 1997 53.60 52.76
Africa 2002 53.33 51.24
Africa 2007 54.81 52.93
Americas 1952 53.28 54.74
Americas 1957 55.96 56.07
Americas 1962 58.40 58.30
Americas 1967 60.41 60.52
Americas 1972 62.39 63.44
Americas 1977 64.39 66.35
Americas 1982 66.23 67.41
Americas 1987 68.09 69.50
Americas 1992 69.57 69.86
Americas 1997 71.15 72.15
Americas 2002 72.42 72.05
Americas 2007 73.61 72.90
Asia 1952 46.31 44.87
Asia 1957 49.32 48.28
Asia 1962 51.56 49.33
Asia 1967 54.66 53.66
Asia 1972 57.32 56.95
Asia 1977 59.61 60.77
Asia 1982 62.62 63.74
Asia 1987 64.85 66.30
Asia 1992 66.54 68.69
Asia 1997 68.02 70.27
Asia 2002 69.23 71.03
Asia 2007 70.73 72.40
Europe 1952 64.41 65.90
Europe 1957 66.70 67.65
Europe 1962 68.54 69.53
Europe 1967 69.74 70.61
Europe 1972 70.78 70.89
Europe 1977 71.94 72.34
Europe 1982 72.81 73.49
Europe 1987 73.64 74.81
Europe 1992 74.44 75.45
Europe 1997 75.51 76.12
Europe 2002 76.70 77.54
Europe 2007 77.65 78.61
Oceania 1952 69.25 69.25
Oceania 1957 70.30 70.30
Oceania 1962 71.09 71.09
Oceania 1967 71.31 71.31
Oceania 1972 71.91 71.91
Oceania 1977 72.85 72.85
Oceania 1982 74.29 74.29
Oceania 1987 75.32 75.32
Oceania 1992 76.94 76.94
Oceania 1997 78.19 78.19
Oceania 2002 79.74 79.74
Oceania 2007 80.72 80.72

Number of countries with low life expectancy over time by continent:

I used 68 years, roughly the trimmed mean life expectancy in 2007, as the cutoff for low life expectancy.

lifeExpLowerThan68 <- ddply(gDat, ~continent ~ year, summarize, countryCount = sum(lifeExp < 68))
lifeExpLowerThan68Tbl <- xtable(lifeExpLowerThan68)
print(lifeExpLowerThan68Tbl, type = "html", include.rownames = FALSE)
continent year countryCount
Africa 1952 52
Africa 1957 52
Africa 1962 52
Africa 1967 52
Africa 1972 52
Africa 1977 52
Africa 1982 51
Africa 1987 50
Africa 1992 48
Africa 1997 47
Africa 2002 45
Africa 2007 45
Americas 1952 23
Americas 1957 22
Americas 1962 21
Americas 1967 20
Americas 1972 19
Americas 1977 15
Americas 1982 13
Americas 1987 12
Americas 1992 8
Americas 1997 4
Americas 2002 2
Americas 2007 2
Asia 1952 33
Asia 1957 33
Asia 1962 31
Asia 1967 30
Asia 1972 28
Asia 1977 27
Asia 1982 23
Asia 1987 22
Asia 1992 15
Asia 1997 13
Asia 2002 11
Asia 2007 11
Europe 1952 22
Europe 1957 18
Europe 1962 10
Europe 1967 7
Europe 1972 3
Europe 1977 1
Europe 1982 1
Europe 1987 1
Europe 1992 1
Europe 1997 0
Europe 2002 0
Europe 2007 0
Oceania 1952 0
Oceania 1957 0
Oceania 1962 0
Oceania 1967 0
Oceania 1972 0
Oceania 1977 0
Oceania 1982 0
Oceania 1987 0
Oceania 1992 0
Oceania 1997 0
Oceania 2002 0
Oceania 2007 0

Proportion of countries with low life expectancy over time by continent:

I implemented this with a function rather than just summarize, for practice.

getLifeExpLowerThan68Prop <- function(x) {
    # x is the subset of gDat for one continent-year combination
    numLower <- sum(x$lifeExp < 68)
    total <- length(x$lifeExp)
    # return a named value so ddply labels the column as a proportion
    c(propLower = numLower/total)
}

lifeExpLowerThan68Prop <- ddply(gDat, ~continent ~ year, getLifeExpLowerThan68Prop)
lifeExpLowerThan68PropTbl <- xtable(lifeExpLowerThan68Prop)
print(lifeExpLowerThan68PropTbl, type = "html", include.rownames = FALSE)
continent year propLower
Africa 1952 1.00
Africa 1957 1.00
Africa 1962 1.00
Africa 1967 1.00
Africa 1972 1.00
Africa 1977 1.00
Africa 1982 0.98
Africa 1987 0.96
Africa 1992 0.92
Africa 1997 0.90
Africa 2002 0.87
Africa 2007 0.87
Americas 1952 0.92
Americas 1957 0.88
Americas 1962 0.84
Americas 1967 0.80
Americas 1972 0.76
Americas 1977 0.60
Americas 1982 0.52
Americas 1987 0.48
Americas 1992 0.32
Americas 1997 0.16
Americas 2002 0.08
Americas 2007 0.08
Asia 1952 1.00
Asia 1957 1.00
Asia 1962 0.94
Asia 1967 0.91
Asia 1972 0.85
Asia 1977 0.82
Asia 1982 0.70
Asia 1987 0.67
Asia 1992 0.45
Asia 1997 0.39
Asia 2002 0.33
Asia 2007 0.33
Europe 1952 0.73
Europe 1957 0.60
Europe 1962 0.33
Europe 1967 0.23
Europe 1972 0.10
Europe 1977 0.03
Europe 1982 0.03
Europe 1987 0.03
Europe 1992 0.03
Europe 1997 0.00
Europe 2002 0.00
Europe 2007 0.00
Oceania 1952 0.00
Oceania 1957 0.00
Oceania 1962 0.00
Oceania 1967 0.00
Oceania 1972 0.00
Oceania 1977 0.00
Oceania 1982 0.00
Oceania 1987 0.00
Oceania 1992 0.00
Oceania 1997 0.00
Oceania 2002 0.00
Oceania 2007 0.00
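
For reference, the count and the proportion can both be read off a single logical vector, since R treats TRUE as 1 and FALSE as 0. The sketch below uses five made-up life expectancies (`lifeExp` here is toy data, not a Gapminder subset) and the same cutoff of 68.

```r
# Toy life expectancies for one hypothetical continent-year group
lifeExp <- c(55.2, 61.0, 67.9, 70.3, 72.5)

below <- lifeExp < 68      # logical vector: TRUE where below the cutoff
numLower <- sum(below)     # number of values below 68
propLower <- mean(below)   # proportion of values below 68

print(c(numLower = numLower, propLower = propLower))
```

Taking the mean of the logical vector gives the same result as dividing the count by the group size, which is what the function above does step by step.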