Summary:
Load libraries and data:
library(plyr)
library(lattice)
library(xtable)
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)
str(gDat)
## 'data.frame': 1704 obs. of 6 variables:
## $ country : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ year : int 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
## $ pop : num 8425333 9240934 10267083 11537966 13079460 ...
## $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ lifeExp : num 28.8 30.3 32 34 36.1 ...
## $ gdpPercap: num 779 821 853 836 740 ...
HTML print function:
htmlPrint <- function(x, ..., digits = 0, include.rownames = FALSE) {
  print(xtable(x, digits = digits, ...), type = "html",
        include.rownames = include.rownames, ...)
}
Let's look at mean life expectancy on different continents. This is adapted from Jenny's Homework 3 example (Compute a trimmed mean of life expectancy for different years), with continent added to make it a bit more interesting. Note that the code here computes a plain (untrimmed) mean.
meanLifePerYearCont <- ddply(gDat, ~year + continent, summarize, meanLifeExp = mean(lifeExp))
htmlPrint(meanLifePerYearCont)
| year | continent | meanLifeExp |
|---|---|---|
| 1952 | Africa | 39 |
| 1952 | Americas | 53 |
| 1952 | Asia | 46 |
| 1952 | Europe | 64 |
| 1952 | Oceania | 69 |
| 1957 | Africa | 41 |
| 1957 | Americas | 56 |
| 1957 | Asia | 49 |
| 1957 | Europe | 67 |
| 1957 | Oceania | 70 |
| 1962 | Africa | 43 |
| 1962 | Americas | 58 |
| 1962 | Asia | 52 |
| 1962 | Europe | 69 |
| 1962 | Oceania | 71 |
| 1967 | Africa | 45 |
| 1967 | Americas | 60 |
| 1967 | Asia | 55 |
| 1967 | Europe | 70 |
| 1967 | Oceania | 71 |
| 1972 | Africa | 47 |
| 1972 | Americas | 62 |
| 1972 | Asia | 57 |
| 1972 | Europe | 71 |
| 1972 | Oceania | 72 |
| 1977 | Africa | 50 |
| 1977 | Americas | 64 |
| 1977 | Asia | 60 |
| 1977 | Europe | 72 |
| 1977 | Oceania | 73 |
| 1982 | Africa | 52 |
| 1982 | Americas | 66 |
| 1982 | Asia | 63 |
| 1982 | Europe | 73 |
| 1982 | Oceania | 74 |
| 1987 | Africa | 53 |
| 1987 | Americas | 68 |
| 1987 | Asia | 65 |
| 1987 | Europe | 74 |
| 1987 | Oceania | 75 |
| 1992 | Africa | 54 |
| 1992 | Americas | 70 |
| 1992 | Asia | 67 |
| 1992 | Europe | 74 |
| 1992 | Oceania | 77 |
| 1997 | Africa | 54 |
| 1997 | Americas | 71 |
| 1997 | Asia | 68 |
| 1997 | Europe | 76 |
| 1997 | Oceania | 78 |
| 2002 | Africa | 53 |
| 2002 | Americas | 72 |
| 2002 | Asia | 69 |
| 2002 | Europe | 77 |
| 2002 | Oceania | 80 |
| 2007 | Africa | 55 |
| 2007 | Americas | 74 |
| 2007 | Asia | 71 |
| 2007 | Europe | 78 |
| 2007 | Oceania | 81 |
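For readers without plyr, the same split-apply-combine step can be sketched in base R. This is only a minimal sketch on a made-up toy data frame: the `toy` values and the `trim = 0.25` setting are assumptions for illustration, not Gapminder figures. Here `aggregate()` plays the role of `ddply()` with `summarize`, and `mean(..., trim = )` shows where a trimmed mean, as in the original homework example, would plug in.

```r
# Toy data: hypothetical values, not Gapminder figures
toy <- data.frame(
  year      = c(1952, 1952, 1952, 1952, 1957, 1957, 1957, 1957),
  continent = c("A", "A", "B", "B", "A", "A", "B", "B"),
  lifeExp   = c(40, 50, 60, 90, 42, 52, 62, 72)
)

# One row per year x continent combination, holding the group mean
meanToy <- aggregate(lifeExp ~ year + continent, data = toy, FUN = mean)
names(meanToy)[names(meanToy) == "lifeExp"] <- "meanLifeExp"
print(meanToy)

# A trimmed mean drops a fraction of the observations from each tail
# before averaging, which damps the effect of the outlying 90 in 1952
trimToy <- aggregate(lifeExp ~ year, data = toy, FUN = mean, trim = 0.25)
print(trimToy)
```

Extra arguments to `aggregate()` are passed on to `FUN`, which is why `trim` can be supplied directly.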
xyplot(meanLifeExp ~ year | continent, meanLifePerYearCont, type = c("p", "r"))
Aside from Africa, every continent shows a clear trend of increasing life expectancy throughout. Even Africa seems to be back on track: its last data point (2007) shows life expectancy rising again.
I'm using Jenny's example from Homework 3 (Get the maximum and minimum of GDP per capita for all continents in a “tall” format).
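The `"r"` panel type overlays a least-squares line, so the visual impression can be checked numerically. The following is a rough sketch that uses the rounded means printed in the table above for Africa and Europe, so the slopes are approximate. The names `trend`, `africa`, and `europe` are introduced here for illustration only.

```r
year   <- seq(1952, 2007, by = 5)
africa <- c(39, 41, 43, 45, 47, 50, 52, 53, 54, 54, 53, 55)
europe <- c(64, 67, 69, 70, 71, 72, 73, 74, 74, 76, 77, 78)

# Slope of the least-squares line: average gain in years of life
# expectancy per calendar year
trend <- function(y) unname(coef(lm(y ~ year))["year"])

slopes <- c(Africa = trend(africa), Europe = trend(europe))
print(round(slopes, 2))

# Change over the last three surveys: Africa dips in 2002 and recovers in 2007
print(diff(tail(africa, 3)))
```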
For this next part, let's remove Oceania from the dataset:
gDat <- droplevels(subset(gDat, continent != "Oceania"))
summary(gDat$continent)
## Africa Americas Asia Europe
## 624 300 396 360
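The call to `droplevels()` matters because subsetting a factor keeps all of its original levels, even those with no remaining observations. A small sketch on a made-up factor (the values are hypothetical) shows the difference:

```r
# Toy factor: hypothetical values for illustration
continent <- factor(c("Africa", "Asia", "Oceania", "Asia"))

# Filtering removes the observations but not the level itself
kept <- continent[continent != "Oceania"]
print(levels(kept))
print(table(kept))

# droplevels() discards levels that no longer occur
dropped <- droplevels(kept)
print(levels(dropped))
```

Without this step, an empty Oceania category would still show up in summaries and as an empty position in plots.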
Next, let's apply Jenny's code with year added so we have more to talk about:
gdpMaxMin <- ddply(gDat, ~continent, function(x) {
  # Locate the rows holding the minimum and maximum GDP per capita directly.
  # Comparing x$gdpPercap == range(x$gdpPercap) recycles the length-2 range
  # and can miss rows or pair a year with the wrong statistic.
  idx <- c(which.min(x$gdpPercap), which.max(x$gdpPercap))
  data.frame(year = x$year[idx], gdpPercap = x$gdpPercap[idx], stat = c("min", "max"))
})
htmlPrint(gdpMaxMin)
| continent | year | gdpPercap | stat |
|---|---|---|---|
| Africa | 2002 | 241 | min |
| Africa | 1977 | 21951 | max |
| Americas | 2007 | 1202 | min |
| Americas | 2007 | 42952 | max |
| Asia | 1952 | 331 | min |
| Asia | 1957 | 113523 | max |
| Europe | 1952 | 974 | min |
| Europe | 2007 | 49357 | max |
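When looking up the rows that attain a range, it helps to know how `==` treats vectors of different lengths. The sketch below uses a made-up vector (the names `gdp` and `yr` and their values are hypothetical) to show the recycling behaviour and a lookup that does not depend on it:

```r
# Toy values: hypothetical, chosen so that the recycling problem is visible
gdp <- c(300, 100, 900, 400)
yr  <- c(1952, 1957, 1962, 1967)

# range() returns c(min, max); `==` recycles it to c(100, 900, 100, 900),
# so each element is compared against only one of the two values
naive <- gdp == range(gdp)
print(naive)

# which.min()/which.max() return the positions directly, in min-then-max order
idx <- c(which.min(gdp), which.max(gdp))
print(data.frame(year = yr[idx], gdpPercap = gdp[idx], stat = c("min", "max")))
```

In this toy case the naive comparison matches neither extreme. Indexing with `which.min()` and `which.max()` also guarantees that each year is paired with the right statistic, regardless of the order of the rows.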
Now to apply the actual visualization:
stripplot(gdpPercap ~ continent, gdpMaxMin, groups = year, auto.key = TRUE)
Each point is positioned by continent and colored by year. It is striking to see how rich Asia's richest country was in the 1950s, even by the standards of the richest countries in 2007. Equally striking is that the poorest observations since 1952 for Africa and the Americas come from the most recent years (2002 and 2007).