Stat 545A Homework 3
Start by loading the libraries and the data.
library(plyr)
library(xtable)
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)
Maximum and minimum of GDP per capita (“wide” format) sorted by minimum GDP:
maxMinGdpByCont <- ddply(gDat, ~continent, summarize, maxGdp = max(gdpPercap),
minGdp = min(gdpPercap))
# Index the columns directly instead of attach()-ing the data frame
sortedByMinGdpTbl <- xtable(maxMinGdpByCont[order(maxMinGdpByCont$minGdp), ])
print(sortedByMinGdpTbl, type = "html", include.rownames = FALSE)
| continent | maxGdp | minGdp |
| Africa | 21951.21 | 241.17 |
| Asia | 113523.13 | 331.00 |
| Europe | 49357.19 | 973.53 |
| Americas | 42951.65 | 1201.64 |
| Oceania | 34435.37 | 10039.60 |
Maximum and minimum of GDP per capita (“wide” format) sorted by maximum GDP:
sortedByMaxGdpTbl <- xtable(maxMinGdpByCont[order(maxMinGdpByCont$maxGdp), ])
print(sortedByMaxGdpTbl, type = "html", include.rownames = FALSE)
| continent | maxGdp | minGdp |
| Africa | 21951.21 | 241.17 |
| Oceania | 34435.37 | 10039.60 |
| Americas | 42951.65 | 1201.64 |
| Europe | 49357.19 | 973.53 |
| Asia | 113523.13 | 331.00 |
Africa has both the lowest maximum and the lowest minimum GDP per capita. Oceania, with its small number of entries, has an unremarkable maximum but by far the highest minimum. Asia evidently has a huge discrepancy between its minimum and maximum, possibly because it has many entries (the opposite of Oceania). The Americas and Europe have similar maximum and minimum GDP per capita and sit next to each other in both rankings.
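One way to put a number on the "discrepancy" described above is the ratio of maximum to minimum GDP per capita. The sketch below is only illustrative: it re-enters the rounded values from the tables above rather than recomputing them from `gDat`, and `maxToMinRatio` is a hypothetical column name that is not part of the original analysis.

```r
# Values copied from the max/min tables above (rounded to two decimals)
maxMinGdp <- data.frame(
  continent = c("Africa", "Asia", "Europe", "Americas", "Oceania"),
  maxGdp = c(21951.21, 113523.13, 49357.19, 42951.65, 34435.37),
  minGdp = c(241.17, 331.00, 973.53, 1201.64, 10039.60),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: how many times larger the maximum is than the minimum
maxMinGdp$maxToMinRatio <- maxMinGdp$maxGdp / maxMinGdp$minGdp

# Rank continents from the largest ratio to the smallest
rankedByRatio <- maxMinGdp[order(maxMinGdp$maxToMinRatio, decreasing = TRUE), ]
print(rankedByRatio, row.names = FALSE)
```

On these figures Asia's maximum is more than 300 times its minimum, while Oceania's is only about three times its minimum.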
Maximum and minimum of GDP per capita for all continents (“tall” format):
maxMinGdpByContTall <- ddply(gDat, ~continent, summarize, factor = c("min",
"max"), GDP = c(min = min(gdpPercap), max = max(gdpPercap)))
tallGdpTbl <- xtable(maxMinGdpByContTall)
print(tallGdpTbl, type = "html", include.rownames = FALSE)
| continent | factor | GDP |
| Africa | min | 241.17 |
| Africa | max | 21951.21 |
| Americas | min | 1201.64 |
| Americas | max | 42951.65 |
| Asia | min | 331.00 |
| Asia | max | 113523.13 |
| Europe | min | 973.53 |
| Europe | max | 49357.19 |
| Oceania | min | 10039.60 |
| Oceania | max | 34435.37 |
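The relationship between the "wide" and "tall" layouts can also be shown without `ddply`. The following is a minimal base-R sketch, assuming the rounded values from the wide table above; the names `wide`, `tall`, and `stat` are placeholders chosen for this illustration.

```r
# Wide layout: one row per continent, one column per statistic
wide <- data.frame(
  continent = c("Africa", "Americas", "Asia", "Europe", "Oceania"),
  minGdp = c(241.17, 1201.64, 331.00, 973.53, 10039.60),
  maxGdp = c(21951.21, 42951.65, 113523.13, 49357.19, 34435.37),
  stringsAsFactors = FALSE
)

# Tall layout: stack the two statistic columns and label each row
tall <- rbind(
  data.frame(continent = wide$continent, stat = "min", GDP = wide$minGdp,
             stringsAsFactors = FALSE),
  data.frame(continent = wide$continent, stat = "max", GDP = wide$maxGdp,
             stringsAsFactors = FALSE)
)

# Sort by continent, with "min" before "max" within each continent
tall <- tall[order(tall$continent, tall$stat == "max"), ]
rownames(tall) <- NULL
print(tall)
```

The tall layout has two rows per continent, so five continents give ten rows.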
Spread of GDP per capita within the continents:
spreadStats <- ddply(gDat, ~continent, summarize, gdpStdDev = sd(gdpPercap),
gdpVar = var(gdpPercap), gdpIQR = IQR(gdpPercap))
# Index the column directly instead of attach()-ing the data frame
spreadTbl <- xtable(spreadStats[order(spreadStats$gdpStdDev), ])
print(spreadTbl, type = "html", include.rownames = FALSE)
| continent | gdpStdDev | gdpVar | gdpIQR |
| Africa | 2827.93 | 7997187.31 | 1616.17 |
| Oceania | 6358.98 | 40436668.87 | 8072.26 |
| Americas | 6396.76 | 40918591.10 | 4402.43 |
| Europe | 9355.21 | 87520019.60 | 13248.30 |
| Asia | 14045.37 | 197272505.85 | 7492.26 |
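Africa, with its overall low GDP per capita, has the smallest spread by every measure, and Asia has the largest standard deviation and variance. Unexpectedly, Oceania and the Americas have similar standard deviations despite having quite different minimum and maximum GDP. Similarly, Europe has a considerably larger spread than the Americas despite having similar minimum and maximum GDP. The IQR ranks the continents differently: Europe has the largest IQR, and Asia's is smaller than Oceania's.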
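The spread measures do not all agree on the ranking. The sketch below, which assumes only the rounded values from the spread table above, checks that the variance column is the square of the standard deviation (so it adds no separate ranking) and compares the ordering by standard deviation with the ordering by IQR. The names `bySd` and `byIqr` are placeholders for this illustration.

```r
# Values copied from the spread table above (rounded to two decimals)
spread <- data.frame(
  continent = c("Africa", "Oceania", "Americas", "Europe", "Asia"),
  gdpStdDev = c(2827.93, 6358.98, 6396.76, 9355.21, 14045.37),
  gdpVar = c(7997187.31, 40436668.87, 40918591.10, 87520019.60, 197272505.85),
  gdpIQR = c(1616.17, 8072.26, 4402.43, 13248.30, 7492.26),
  stringsAsFactors = FALSE
)

# The variance is the squared standard deviation, so sqrt() should recover it
sdFromVar <- sqrt(spread$gdpVar)
print(round(sdFromVar - spread$gdpStdDev, 3))

# Continents ordered from smallest to largest spread under each measure
bySd <- spread$continent[order(spread$gdpStdDev)]
byIqr <- spread$continent[order(spread$gdpIQR)]
print(bySd)
print(byIqr)
```

The IQR ignores the tails of the distribution, which is one plausible reason Asia ranks highest by standard deviation but only in the middle by IQR.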
Trimmed mean of life expectancy for different years:
I picked a trim fraction of 0.1, which drops 10% of the observations from each end before averaging.
lifeExpMeanByYear <- ddply(gDat, ~year, summarize, meanLifeExp = mean(lifeExp,
trim = 0.1))
lifeExpMeanByYearTbl <- xtable(lifeExpMeanByYear)
print(lifeExpMeanByYearTbl, type = "html", include.rownames = FALSE)
| year | meanLifeExp |
| 1952 | 48.58 |
| 1957 | 51.27 |
| 1962 | 53.58 |
| 1967 | 55.87 |
| 1972 | 58.01 |
| 1977 | 60.10 |
| 1982 | 62.12 |
| 1987 | 63.92 |
| 1992 | 65.19 |
| 1997 | 66.02 |
| 2002 | 66.72 |
| 2007 | 68.11 |
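To see what the `trim` argument of `mean()` does, here is a small worked example on a made-up vector (toy values, not Gapminder data). With ten values and `trim = 0.1`, one observation is dropped from each end before averaging.

```r
# Toy vector with one extreme value; these numbers are purely illustrative
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Ordinary mean is pulled upward by the outlier
plainMean <- mean(x)

# Trimmed mean drops the lowest and highest 10% (one value at each end here)
trimmedMean <- mean(x, trim = 0.1)

# Same calculation done by hand: sort, drop the first and last value, average
manualTrimmed <- mean(sort(x)[2:9])

print(c(plain = plainMean, trimmed = trimmedMean, manual = manualTrimmed))
```

The trimmed mean matches the manual calculation and is much less sensitive to the single extreme value.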
Life expectancy changing over time on different continents:
I didn't trim this time.
lifeExpMeanByYearCont <- ddply(gDat, ~continent + year, summarize,
    meanLifeExp = mean(lifeExp), medianLifeExp = median(lifeExp))
lifeExpMeanByYCTbl <- xtable(lifeExpMeanByYearCont)
print(lifeExpMeanByYCTbl, type = "html", include.rownames = FALSE)
| continent | year | meanLifeExp | medianLifeExp |
| Africa | 1952 | 39.14 | 38.83 |
| Africa | 1957 | 41.27 | 40.59 |
| Africa | 1962 | 43.32 | 42.63 |
| Africa | 1967 | 45.33 | 44.70 |
| Africa | 1972 | 47.45 | 47.03 |
| Africa | 1977 | 49.58 | 49.27 |
| Africa | 1982 | 51.59 | 50.76 |
| Africa | 1987 | 53.34 | 51.64 |
| Africa | 1992 | 53.63 | 52.43 |
| Africa | 1997 | 53.60 | 52.76 |
| Africa | 2002 | 53.33 | 51.24 |
| Africa | 2007 | 54.81 | 52.93 |
| Americas | 1952 | 53.28 | 54.74 |
| Americas | 1957 | 55.96 | 56.07 |
| Americas | 1962 | 58.40 | 58.30 |
| Americas | 1967 | 60.41 | 60.52 |
| Americas | 1972 | 62.39 | 63.44 |
| Americas | 1977 | 64.39 | 66.35 |
| Americas | 1982 | 66.23 | 67.41 |
| Americas | 1987 | 68.09 | 69.50 |
| Americas | 1992 | 69.57 | 69.86 |
| Americas | 1997 | 71.15 | 72.15 |
| Americas | 2002 | 72.42 | 72.05 |
| Americas | 2007 | 73.61 | 72.90 |
| Asia | 1952 | 46.31 | 44.87 |
| Asia | 1957 | 49.32 | 48.28 |
| Asia | 1962 | 51.56 | 49.33 |
| Asia | 1967 | 54.66 | 53.66 |
| Asia | 1972 | 57.32 | 56.95 |
| Asia | 1977 | 59.61 | 60.77 |
| Asia | 1982 | 62.62 | 63.74 |
| Asia | 1987 | 64.85 | 66.30 |
| Asia | 1992 | 66.54 | 68.69 |
| Asia | 1997 | 68.02 | 70.27 |
| Asia | 2002 | 69.23 | 71.03 |
| Asia | 2007 | 70.73 | 72.40 |
| Europe | 1952 | 64.41 | 65.90 |
| Europe | 1957 | 66.70 | 67.65 |
| Europe | 1962 | 68.54 | 69.53 |
| Europe | 1967 | 69.74 | 70.61 |
| Europe | 1972 | 70.78 | 70.89 |
| Europe | 1977 | 71.94 | 72.34 |
| Europe | 1982 | 72.81 | 73.49 |
| Europe | 1987 | 73.64 | 74.81 |
| Europe | 1992 | 74.44 | 75.45 |
| Europe | 1997 | 75.51 | 76.12 |
| Europe | 2002 | 76.70 | 77.54 |
| Europe | 2007 | 77.65 | 78.61 |
| Oceania | 1952 | 69.25 | 69.25 |
| Oceania | 1957 | 70.30 | 70.30 |
| Oceania | 1962 | 71.09 | 71.09 |
| Oceania | 1967 | 71.31 | 71.31 |
| Oceania | 1972 | 71.91 | 71.91 |
| Oceania | 1977 | 72.85 | 72.85 |
| Oceania | 1982 | 74.29 | 74.29 |
| Oceania | 1987 | 75.32 | 75.32 |
| Oceania | 1992 | 76.94 | 76.94 |
| Oceania | 1997 | 78.19 | 78.19 |
| Oceania | 2002 | 79.74 | 79.74 |
| Oceania | 2007 | 80.72 | 80.72 |
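A quick way to summarize the change over time in the table above is to subtract each continent's 1952 mean from its 2007 mean. This sketch re-enters the rounded means from the table rather than recomputing them, and `gain` is a hypothetical column name introduced here.

```r
# Mean life expectancy in the first and last survey years, copied from the table above
lifeExpMean <- data.frame(
  continent = c("Africa", "Americas", "Asia", "Europe", "Oceania"),
  mean1952 = c(39.14, 53.28, 46.31, 64.41, 69.25),
  mean2007 = c(54.81, 73.61, 70.73, 77.65, 80.72),
  stringsAsFactors = FALSE
)

# Hypothetical column: gain in mean life expectancy between 1952 and 2007, in years
lifeExpMean$gain <- round(lifeExpMean$mean2007 - lifeExpMean$mean1952, 2)

# Largest gain first
gainRanked <- lifeExpMean[order(lifeExpMean$gain, decreasing = TRUE), ]
print(gainRanked, row.names = FALSE)
```

By this measure Asia gained the most and Oceania, which started highest, gained the least.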
Number of countries with low life expectancy over time by continent:
I used a life expectancy below 68 (roughly the trimmed mean life expectancy in 2007) as the threshold for low life expectancy.
# sum() of a logical vector counts the TRUE values, i.e. countries below 68
lifeExpLowerThan68 <- ddply(gDat, ~continent + year, summarize,
    countryCount = sum(lifeExp < 68))
lifeExpLowerThan68Tbl <- xtable(lifeExpLowerThan68)
print(lifeExpLowerThan68Tbl, type = "html", include.rownames = FALSE)
| continent | year | countryCount |
| Africa | 1952 | 52 |
| Africa | 1957 | 52 |
| Africa | 1962 | 52 |
| Africa | 1967 | 52 |
| Africa | 1972 | 52 |
| Africa | 1977 | 52 |
| Africa | 1982 | 51 |
| Africa | 1987 | 50 |
| Africa | 1992 | 48 |
| Africa | 1997 | 47 |
| Africa | 2002 | 45 |
| Africa | 2007 | 45 |
| Americas | 1952 | 23 |
| Americas | 1957 | 22 |
| Americas | 1962 | 21 |
| Americas | 1967 | 20 |
| Americas | 1972 | 19 |
| Americas | 1977 | 15 |
| Americas | 1982 | 13 |
| Americas | 1987 | 12 |
| Americas | 1992 | 8 |
| Americas | 1997 | 4 |
| Americas | 2002 | 2 |
| Americas | 2007 | 2 |
| Asia | 1952 | 33 |
| Asia | 1957 | 33 |
| Asia | 1962 | 31 |
| Asia | 1967 | 30 |
| Asia | 1972 | 28 |
| Asia | 1977 | 27 |
| Asia | 1982 | 23 |
| Asia | 1987 | 22 |
| Asia | 1992 | 15 |
| Asia | 1997 | 13 |
| Asia | 2002 | 11 |
| Asia | 2007 | 11 |
| Europe | 1952 | 22 |
| Europe | 1957 | 18 |
| Europe | 1962 | 10 |
| Europe | 1967 | 7 |
| Europe | 1972 | 3 |
| Europe | 1977 | 1 |
| Europe | 1982 | 1 |
| Europe | 1987 | 1 |
| Europe | 1992 | 1 |
| Europe | 1997 | 0 |
| Europe | 2002 | 0 |
| Europe | 2007 | 0 |
| Oceania | 1952 | 0 |
| Oceania | 1957 | 0 |
| Oceania | 1962 | 0 |
| Oceania | 1967 | 0 |
| Oceania | 1972 | 0 |
| Oceania | 1977 | 0 |
| Oceania | 1982 | 0 |
| Oceania | 1987 | 0 |
| Oceania | 1992 | 0 |
| Oceania | 1997 | 0 |
| Oceania | 2002 | 0 |
| Oceania | 2007 | 0 |
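There are two common ways to count values below a threshold in R, and they differ when missing values are present. The sketch below uses a made-up vector (not Gapminder data) to show the difference; as an assumption, it matters only if `lifeExp` contained `NA`s, which the tables above give no sign of.

```r
# Toy life expectancies with one missing value; purely illustrative
lifeExp <- c(50, 70, NA, 60)

# Subsetting keeps an NA element where the comparison is NA, so length() counts it
countBySubset <- length(lifeExp[lifeExp < 68])

# Summing the logical vector with na.rm = TRUE counts only the TRUE comparisons
countBySum <- sum(lifeExp < 68, na.rm = TRUE)

print(c(subset = countBySubset, sum = countBySum))
```

With no missing values the two approaches give the same count.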
Proportion of countries with low life expectancy over time by continent:
I implemented this with a function rather than just summarize, for practice.
getLifeExpLowerThan68Prop <- function(x) {
    # x holds the rows for one continent-year group
    numLower <- sum(x$lifeExp < 68)
    total <- nrow(x)
    # Name the result so the output column is labelled as a proportion, not a count
    c(propBelow68 = numLower / total)
}
lifeExpLowerThan68Prop <- ddply(gDat, ~continent + year, getLifeExpLowerThan68Prop)
lifeExpLowerThan68PropTbl <- xtable(lifeExpLowerThan68Prop)
print(lifeExpLowerThan68PropTbl, type = "html", include.rownames = FALSE)
| continent | year | propBelow68 |
| Africa | 1952 | 1.00 |
| Africa | 1957 | 1.00 |
| Africa | 1962 | 1.00 |
| Africa | 1967 | 1.00 |
| Africa | 1972 | 1.00 |
| Africa | 1977 | 1.00 |
| Africa | 1982 | 0.98 |
| Africa | 1987 | 0.96 |
| Africa | 1992 | 0.92 |
| Africa | 1997 | 0.90 |
| Africa | 2002 | 0.87 |
| Africa | 2007 | 0.87 |
| Americas | 1952 | 0.92 |
| Americas | 1957 | 0.88 |
| Americas | 1962 | 0.84 |
| Americas | 1967 | 0.80 |
| Americas | 1972 | 0.76 |
| Americas | 1977 | 0.60 |
| Americas | 1982 | 0.52 |
| Americas | 1987 | 0.48 |
| Americas | 1992 | 0.32 |
| Americas | 1997 | 0.16 |
| Americas | 2002 | 0.08 |
| Americas | 2007 | 0.08 |
| Asia | 1952 | 1.00 |
| Asia | 1957 | 1.00 |
| Asia | 1962 | 0.94 |
| Asia | 1967 | 0.91 |
| Asia | 1972 | 0.85 |
| Asia | 1977 | 0.82 |
| Asia | 1982 | 0.70 |
| Asia | 1987 | 0.67 |
| Asia | 1992 | 0.45 |
| Asia | 1997 | 0.39 |
| Asia | 2002 | 0.33 |
| Asia | 2007 | 0.33 |
| Europe | 1952 | 0.73 |
| Europe | 1957 | 0.60 |
| Europe | 1962 | 0.33 |
| Europe | 1967 | 0.23 |
| Europe | 1972 | 0.10 |
| Europe | 1977 | 0.03 |
| Europe | 1982 | 0.03 |
| Europe | 1987 | 0.03 |
| Europe | 1992 | 0.03 |
| Europe | 1997 | 0.00 |
| Europe | 2002 | 0.00 |
| Europe | 2007 | 0.00 |
| Oceania | 1952 | 0.00 |
| Oceania | 1957 | 0.00 |
| Oceania | 1962 | 0.00 |
| Oceania | 1967 | 0.00 |
| Oceania | 1972 | 0.00 |
| Oceania | 1977 | 0.00 |
| Oceania | 1982 | 0.00 |
| Oceania | 1987 | 0.00 |
| Oceania | 1992 | 0.00 |
| Oceania | 1997 | 0.00 |
| Oceania | 2002 | 0.00 |
| Oceania | 2007 | 0.00 |
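A proportion like the one tabulated above is simply the mean of a logical vector, since `TRUE` counts as 1 and `FALSE` as 0. The sketch below shows this on a made-up vector (not Gapminder data) and then cross-checks one table entry using the Africa counts reported earlier (45 of 52 countries below 68 in 2007).

```r
# Toy life expectancies; purely illustrative
lifeExp <- c(45, 72, 66, 80)

# Count divided by total, written out explicitly
propExplicit <- sum(lifeExp < 68) / length(lifeExp)

# Equivalent shortcut: the mean of the logical comparison
propByMean <- mean(lifeExp < 68)

# Cross-check against the tables: Africa in 2007 had 45 of 52 countries below 68
africa2007 <- round(45 / 52, 2)

print(c(explicit = propExplicit, byMean = propByMean, africa2007 = africa2007))
```

Reporting the proportion alongside the count makes continents with very different numbers of countries easier to compare.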