Jack Ni
Importing the Gapminder dataset from Jenny's website. Doing a quick check to see if the import went fine.
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)
str(gDat)
## 'data.frame': 1704 obs. of 6 variables:
## $ country : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ year : int 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
## $ pop : num 8425333 9240934 10267083 11537966 13079460 ...
## $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ lifeExp : num 28.8 30.3 32 34 36.1 ...
## $ gdpPercap: num 779 821 853 836 740 ...
Loading the “plyr”, “xtable”, and “lattice” packages.
library(plyr)
library(xtable)
library(lattice)
Here, I find the maximum and minimum GDP per capita for each continent. The data frame is sorted by maximum GDP per capita, in ascending order. Looking at the data, there seems to be a general trend where a continent with a higher maximum GDP per capita also has a lower minimum, with Africa being the exception: it has the lowest maximum and also the lowest minimum. Since the GDP per capita records are taken by country and by year, this means that there is a larger gap between either the countries or the years in Asia and Europe than in Oceania and the Americas.
maxMinGdpByCont <- ddply(gDat, ~continent, summarize, maxGdpPerCap = max(gdpPercap),
minGdpPerCap = min(gdpPercap))
maxMinGdpByCont <- arrange(maxMinGdpByCont, maxGdpPerCap)
colnames(maxMinGdpByCont) <- c("continent", "max GDP per capita", "min GDP per capita")
maxMinGdpByCont <- xtable(maxMinGdpByCont)
print(maxMinGdpByCont, type = "html", include.rownames = FALSE)
| continent | max GDP per capita | min GDP per capita |
|---|---|---|
| Africa | 21951.21 | 241.17 |
| Oceania | 34435.37 | 10039.60 |
| Americas | 42951.65 | 1201.64 |
| Europe | 49357.19 | 973.53 |
| Asia | 113523.13 | 331.00 |
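To make the "gap" in the paragraph above concrete, the sketch below re-enters the rounded values from the table and computes the spread per continent in two ways: the absolute difference and the max/min ratio. The names `gdpRange`, `gap`, and `ratio` are illustrative and not part of the original analysis, and the numbers carry the table's rounding.

```r
# Rounded values copied from the table above (not recomputed from gDat).
gdpRange <- data.frame(
  continent = c("Africa", "Oceania", "Americas", "Europe", "Asia"),
  maxGdp = c(21951.21, 34435.37, 42951.65, 49357.19, 113523.13),
  minGdp = c(241.17, 10039.60, 1201.64, 973.53, 331.00),
  stringsAsFactors = FALSE
)

# Absolute and relative spread between the richest and poorest record.
gdpRange$gap <- gdpRange$maxGdp - gdpRange$minGdp
gdpRange$ratio <- gdpRange$maxGdp / gdpRange$minGdp

# Sort by the absolute gap so the widest spread comes last.
gdpRange <- gdpRange[order(gdpRange$gap), ]
print(gdpRange, row.names = FALSE)
```

On these numbers the absolute gap ranks the continents in the same order as the maximum. The ratio tells a slightly different story: Africa's very small minimum gives it the second-largest ratio even though its absolute gap is the smallest.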
This expands upon the above example. Each min and max value has a separate row and is labelled as such.
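minMaxGdp <- function(x) {
    xMin <- min(x$gdpPercap)
    xMax <- max(x$gdpPercap)
    # c() with both labels and numbers coerces everything to character, so
    # the "value" column holds text and is printed unrounded.
    makeMatrix <- matrix(c("Min", "Max", xMin, xMax), nrow = 2, ncol = 2)
    colnames(makeMatrix) <- c("statistic", "value")
    return(makeMatrix)
}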
minMaxGdpByContSep <- ddply(gDat, ~continent, minMaxGdp)
minMaxGdpByContSep <- xtable(minMaxGdpByContSep)
print(minMaxGdpByContSep, type = "html", include.rownames = FALSE)
| continent | statistic | value |
|---|---|---|
| Africa | Min | 241.1658765 |
| Africa | Max | 21951.21176 |
| Americas | Min | 1201.637154 |
| Americas | Max | 42951.65309 |
| Asia | Min | 331 |
| Asia | Max | 113523.1329 |
| Europe | Min | 973.5331948 |
| Europe | Max | 49357.19017 |
| Oceania | Min | 10039.59564 |
| Oceania | Max | 34435.36744 |
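A variant of `minMaxGdp` that returns a data frame instead of a matrix would keep `value` numeric, so it can be rounded or sorted as a number. The following is a minimal sketch under assumptions: the data frame `toy` and the function name `minMaxGdpDf` are made up for illustration, and base R's `split()` and `lapply()` stand in for `ddply()` so the block runs without extra packages.

```r
# Made-up data: two groups with a handful of GDP per capita records.
toy <- data.frame(
  continent = c("A", "A", "A", "B", "B"),
  gdpPercap = c(300, 1500, 900, 12000, 8000),
  stringsAsFactors = FALSE
)

# Return a data frame so the labels stay text and the values stay numeric.
minMaxGdpDf <- function(x) {
  data.frame(
    statistic = c("Min", "Max"),
    value = c(min(x$gdpPercap), max(x$gdpPercap)),
    stringsAsFactors = FALSE
  )
}

# Split by group, apply the function, then stack the pieces with a group label.
pieces <- lapply(split(toy, toy$continent), minMaxGdpDf)
result <- do.call(rbind, Map(function(name, piece) {
  data.frame(continent = name, piece, stringsAsFactors = FALSE)
}, names(pieces), pieces))
rownames(result) <- NULL

print(result)
str(result$value)  # numeric, not character
```

Since `ddply()` accepts functions that return data frames, a function shaped like `minMaxGdpDf` should be usable in place of `minMaxGdp` in the call above.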
The global mean life expectancy increases from 1952 to 2007. Removing the lowest and highest 10% of life expectancy records gives a slightly lower mean up to 1962; from 1967 onward the trimmed mean is higher.
meanLeByYear <- ddply(gDat, ~year, summarize, meanLe = mean(lifeExp), trimMeanLe = mean(lifeExp,
trim = 0.1))
colnames(meanLeByYear) <- c("year", "mean", "trimmed_mean")
meanLeByYear <- xtable(meanLeByYear)
print(meanLeByYear, type = "html", include.rownames = FALSE)
| year | mean | trimmed_mean |
|---|---|---|
| 1952 | 49.06 | 48.58 |
| 1957 | 51.51 | 51.27 |
| 1962 | 53.61 | 53.58 |
| 1967 | 55.68 | 55.87 |
| 1972 | 57.65 | 58.01 |
| 1977 | 59.57 | 60.10 |
| 1982 | 61.53 | 62.12 |
| 1987 | 63.21 | 63.92 |
| 1992 | 64.16 | 65.19 |
| 1997 | 65.01 | 66.02 |
| 2002 | 65.69 | 66.72 |
| 2007 | 67.01 | 68.11 |
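As a rough illustration of what `trim = 0.1` does, the sketch below first trims a made-up vector by hand and compares the result with `mean(x, trim = 0.1)`, then uses the rounded values from the table to find the first year in which the trimmed mean exceeds the plain mean. The vector `x` and the names `leTable`, `plainMean`, `trimmedMean`, and `firstHigher` are illustrative only.

```r
# Made-up values: one very low record and nine ordinary ones.
x <- c(30, 50, 52, 54, 56, 58, 60, 62, 64, 66)

# Trimming 10% from each end of 10 values drops 1 value per end.
k <- floor(length(x) * 0.1)
sorted <- sort(x)
manualTrim <- mean(sorted[(k + 1):(length(x) - k)])

print(mean(x))              # plain mean, pulled down by the low record
print(mean(x, trim = 0.1))  # trimmed mean
print(manualTrim)           # same as the trimmed mean

# Rounded values copied from the table above.
leTable <- data.frame(
  year = seq(1952, 2007, by = 5),
  plainMean = c(49.06, 51.51, 53.61, 55.68, 57.65, 59.57,
                61.53, 63.21, 64.16, 65.01, 65.69, 67.01),
  trimmedMean = c(48.58, 51.27, 53.58, 55.87, 58.01, 60.10,
                  62.12, 63.92, 65.19, 66.02, 66.72, 68.11)
)
leTable$diff <- leTable$trimmedMean - leTable$plainMean

# First year in which the trimmed mean is above the plain mean.
firstHigher <- min(leTable$year[leTable$diff > 0])
print(firstHigher)
```

A trimmed mean above the plain mean suggests that the low tail is pulling the plain mean down more than the high tail is pulling it up, which is one possible reading of the later years in the table.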
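This shows the average life expectancy by year for each continent. The average increases over time in every continent, although Africa's stalls between 1992 and 2002 (53.63, 53.60, 53.33) before rising again in 2007. However, the table is hard to assess by eye because of its long layout, with one row per continent and year.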
leByContYear <- ddply(gDat, .(continent, year), summarize, meanLe = mean(lifeExp))
colnames(leByContYear) <- c("continent", "year", "mean life expectancy")
leByContYear <- xtable(leByContYear)
print(leByContYear, type = "html", include.rownames = FALSE)
| continent | year | mean life expectancy |
|---|---|---|
| Africa | 1952 | 39.14 |
| Africa | 1957 | 41.27 |
| Africa | 1962 | 43.32 |
| Africa | 1967 | 45.33 |
| Africa | 1972 | 47.45 |
| Africa | 1977 | 49.58 |
| Africa | 1982 | 51.59 |
| Africa | 1987 | 53.34 |
| Africa | 1992 | 53.63 |
| Africa | 1997 | 53.60 |
| Africa | 2002 | 53.33 |
| Africa | 2007 | 54.81 |
| Americas | 1952 | 53.28 |
| Americas | 1957 | 55.96 |
| Americas | 1962 | 58.40 |
| Americas | 1967 | 60.41 |
| Americas | 1972 | 62.39 |
| Americas | 1977 | 64.39 |
| Americas | 1982 | 66.23 |
| Americas | 1987 | 68.09 |
| Americas | 1992 | 69.57 |
| Americas | 1997 | 71.15 |
| Americas | 2002 | 72.42 |
| Americas | 2007 | 73.61 |
| Asia | 1952 | 46.31 |
| Asia | 1957 | 49.32 |
| Asia | 1962 | 51.56 |
| Asia | 1967 | 54.66 |
| Asia | 1972 | 57.32 |
| Asia | 1977 | 59.61 |
| Asia | 1982 | 62.62 |
| Asia | 1987 | 64.85 |
| Asia | 1992 | 66.54 |
| Asia | 1997 | 68.02 |
| Asia | 2002 | 69.23 |
| Asia | 2007 | 70.73 |
| Europe | 1952 | 64.41 |
| Europe | 1957 | 66.70 |
| Europe | 1962 | 68.54 |
| Europe | 1967 | 69.74 |
| Europe | 1972 | 70.78 |
| Europe | 1977 | 71.94 |
| Europe | 1982 | 72.81 |
| Europe | 1987 | 73.64 |
| Europe | 1992 | 74.44 |
| Europe | 1997 | 75.51 |
| Europe | 2002 | 76.70 |
| Europe | 2007 | 77.65 |
| Oceania | 1952 | 69.25 |
| Oceania | 1957 | 70.30 |
| Oceania | 1962 | 71.09 |
| Oceania | 1967 | 71.31 |
| Oceania | 1972 | 71.91 |
| Oceania | 1977 | 72.85 |
| Oceania | 1982 | 74.29 |
| Oceania | 1987 | 75.32 |
| Oceania | 1992 | 76.94 |
| Oceania | 1997 | 78.19 |
| Oceania | 2002 | 79.74 |
| Oceania | 2007 | 80.72 |
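One way to make a table like this easier to scan is a wide layout, with one row per year and one column per continent. The sketch below is a minimal illustration using base R's `tapply()` on the 1952 and 2007 rows copied from the table. The names `tall`, `wide`, and `gain` are made up here, and this is not the layout the original analysis used.

```r
# First and last year for each continent, copied from the table above.
tall <- data.frame(
  continent = rep(c("Africa", "Americas", "Asia", "Europe", "Oceania"), each = 2),
  year = rep(c(1952, 2007), times = 5),
  meanLe = c(39.14, 54.81, 53.28, 73.61, 46.31, 70.73,
             64.41, 77.65, 69.25, 80.72),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are years, columns are continents.
# Each cell holds a single value, so mean() simply returns it.
wide <- with(tall, tapply(meanLe, list(year = year, continent = continent), mean))
print(wide)

# Gain in mean life expectancy between the two years, per continent.
gain <- wide["2007", ] - wide["1952", ]
print(round(gain, 2))
```

On the full data the same call pattern, with `lifeExp` in place of `meanLe` and `gDat` in place of `tall`, should give all twelve years as rows.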
This shows the number of countries in a given continent in a specific year whose life expectancy is lower than our retirement age of 65. Africa and Asia stand out as having the largest number of these countries.
leCounByContYear <- ddply(gDat, .(continent, year), summarize, countryCountOfLe = sum(lifeExp <
65))
colnames(leCounByContYear) <- c("continent", "year", "countries with life expectancy below 65")
leCounByContYear <- xtable(leCounByContYear)
print(leCounByContYear, type = "html", include.rownames = FALSE)
| continent | year | countries with life expectancy below 65 |
|---|---|---|
| Africa | 1952 | 52 |
| Africa | 1957 | 52 |
| Africa | 1962 | 52 |
| Africa | 1967 | 52 |
| Africa | 1972 | 52 |
| Africa | 1977 | 51 |
| Africa | 1982 | 50 |
| Africa | 1987 | 47 |
| Africa | 1992 | 46 |
| Africa | 1997 | 45 |
| Africa | 2002 | 45 |
| Africa | 2007 | 43 |
| Americas | 1952 | 22 |
| Americas | 1957 | 21 |
| Americas | 1962 | 18 |
| Americas | 1967 | 16 |
| Americas | 1972 | 13 |
| Americas | 1977 | 11 |
| Americas | 1982 | 10 |
| Americas | 1987 | 7 |
| Americas | 1992 | 3 |
| Americas | 1997 | 2 |
| Americas | 2002 | 2 |
| Americas | 2007 | 1 |
| Asia | 1952 | 32 |
| Asia | 1957 | 31 |
| Asia | 1962 | 28 |
| Asia | 1967 | 28 |
| Asia | 1972 | 25 |
| Asia | 1977 | 22 |
| Asia | 1982 | 20 |
| Asia | 1987 | 13 |
| Asia | 1992 | 11 |
| Asia | 1997 | 10 |
| Asia | 2002 | 9 |
| Asia | 2007 | 8 |
| Europe | 1952 | 13 |
| Europe | 1957 | 8 |
| Europe | 1962 | 6 |
| Europe | 1967 | 2 |
| Europe | 1972 | 1 |
| Europe | 1977 | 1 |
| Europe | 1982 | 1 |
| Europe | 1987 | 1 |
| Europe | 1992 | 0 |
| Europe | 1997 | 0 |
| Europe | 2002 | 0 |
| Europe | 2007 | 0 |
| Oceania | 1952 | 0 |
| Oceania | 1957 | 0 |
| Oceania | 1962 | 0 |
| Oceania | 1967 | 0 |
| Oceania | 1972 | 0 |
| Oceania | 1977 | 0 |
| Oceania | 1982 | 0 |
| Oceania | 1987 | 0 |
| Oceania | 1992 | 0 |
| Oceania | 1997 | 0 |
| Oceania | 2002 | 0 |
| Oceania | 2007 | 0 |
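The counts above rely on `sum()` treating `TRUE` as 1 and `FALSE` as 0. The sketch below shows that idea on made-up data, and also computes the share of countries below the threshold with `mean()`. The data frame `toy` and the names `below`, `countBelow`, and `shareBelow` are illustrative, and base R's `tapply()` stands in for `ddply()` so the block runs without extra packages.

```r
# Made-up records: five countries in two groups.
toy <- data.frame(
  country = c("P", "Q", "R", "S", "T"),
  continent = c("X", "X", "X", "Y", "Y"),
  lifeExp = c(48.2, 66.0, 59.5, 71.3, 64.9),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE where life expectancy is below the threshold of 65.
below <- toy$lifeExp < 65

# sum() counts the TRUE values per group; mean() gives their proportion.
countBelow <- tapply(below, toy$continent, sum)
shareBelow <- tapply(below, toy$continent, mean)

print(countBelow)
print(round(shareBelow, 2))
```

Because continents differ in how many countries they contain, a share can be easier to compare across continents than a raw count.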
Here, the data gives the maximum life expectancy in a continent in a specific year. It also includes the country that has this maximum life expectancy.
leByContYearCountry <- ddply(gDat, .(continent, year), summarize, maxLe = max(lifeExp),
country = country[which.max(lifeExp)])
colnames(leByContYearCountry) <- c("continent", "year", "max life expectancy",
"country")
leByContYearCountry <- xtable(leByContYearCountry)
print(leByContYearCountry, type = "html", include.rownames = FALSE)
| continent | year | max life expectancy | country |
|---|---|---|---|
| Africa | 1952 | 52.72 | Reunion |
| Africa | 1957 | 58.09 | Mauritius |
| Africa | 1962 | 60.25 | Mauritius |
| Africa | 1967 | 61.56 | Mauritius |
| Africa | 1972 | 64.27 | Reunion |
| Africa | 1977 | 67.06 | Reunion |
| Africa | 1982 | 69.89 | Reunion |
| Africa | 1987 | 71.91 | Reunion |
| Africa | 1992 | 73.61 | Reunion |
| Africa | 1997 | 74.77 | Reunion |
| Africa | 2002 | 75.74 | Reunion |
| Africa | 2007 | 76.44 | Reunion |
| Americas | 1952 | 68.75 | Canada |
| Americas | 1957 | 69.96 | Canada |
| Americas | 1962 | 71.30 | Canada |
| Americas | 1967 | 72.13 | Canada |
| Americas | 1972 | 72.88 | Canada |
| Americas | 1977 | 74.21 | Canada |
| Americas | 1982 | 75.76 | Canada |
| Americas | 1987 | 76.86 | Canada |
| Americas | 1992 | 77.95 | Canada |
| Americas | 1997 | 78.61 | Canada |
| Americas | 2002 | 79.77 | Canada |
| Americas | 2007 | 80.65 | Canada |
| Asia | 1952 | 65.39 | Israel |
| Asia | 1957 | 67.84 | Israel |
| Asia | 1962 | 69.39 | Israel |
| Asia | 1967 | 71.43 | Japan |
| Asia | 1972 | 73.42 | Japan |
| Asia | 1977 | 75.38 | Japan |
| Asia | 1982 | 77.11 | Japan |
| Asia | 1987 | 78.67 | Japan |
| Asia | 1992 | 79.36 | Japan |
| Asia | 1997 | 80.69 | Japan |
| Asia | 2002 | 82.00 | Japan |
| Asia | 2007 | 82.60 | Japan |
| Europe | 1952 | 72.67 | Norway |
| Europe | 1957 | 73.47 | Iceland |
| Europe | 1962 | 73.68 | Iceland |
| Europe | 1967 | 74.16 | Sweden |
| Europe | 1972 | 74.72 | Sweden |
| Europe | 1977 | 76.11 | Iceland |
| Europe | 1982 | 76.99 | Iceland |
| Europe | 1987 | 77.41 | Switzerland |
| Europe | 1992 | 78.77 | Iceland |
| Europe | 1997 | 79.39 | Sweden |
| Europe | 2002 | 80.62 | Switzerland |
| Europe | 2007 | 81.76 | Iceland |
| Oceania | 1952 | 69.39 | New Zealand |
| Oceania | 1957 | 70.33 | Australia |
| Oceania | 1962 | 71.24 | New Zealand |
| Oceania | 1967 | 71.52 | New Zealand |
| Oceania | 1972 | 71.93 | Australia |
| Oceania | 1977 | 73.49 | Australia |
| Oceania | 1982 | 74.74 | Australia |
| Oceania | 1987 | 76.32 | Australia |
| Oceania | 1992 | 77.56 | Australia |
| Oceania | 1997 | 78.83 | Australia |
| Oceania | 2002 | 80.37 | Australia |
| Oceania | 2007 | 81.23 | Australia |
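The country lookup above depends on `which.max()`, which returns the position of the first maximum. The sketch below illustrates this on made-up data that contains a tie, and shows one way to list every country sharing the maximum. The data frame `toy` and the names `i`, `topCountry`, and `allTop` are illustrative only.

```r
# Made-up records with a tie for the highest life expectancy.
toy <- data.frame(
  country = c("P", "Q", "R", "S"),
  lifeExp = c(70.1, 74.6, 74.6, 68.0),
  stringsAsFactors = FALSE
)

# which.max() gives the index of the first maximum only.
i <- which.max(toy$lifeExp)
topCountry <- toy$country[i]
print(topCountry)

# Comparing against max() keeps every country that shares the maximum.
allTop <- toy$country[toy$lifeExp == max(toy$lifeExp)]
print(allTop)
```

With life expectancy recorded to several decimal places, exact ties are unlikely in practice, so the `which.max()` approach is usually enough.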
Worked in collaboration with Jonathan Zhang