STAT 545A Homework 5

Jinyuan Zhang


2013-10-06

Data import

The dataset gapminderDataFiveYear.txt is used in this analysis.

gdURL = "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)

Drop Oceania from the original data.

iDat = droplevels(subset(gDat, continent != "Oceania"))
table(iDat$continent)
## 
##   Africa Americas     Asia   Europe 
##      624      300      396      360
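The call to droplevels() matters because subset() removes the Oceania rows but leaves "Oceania" behind as an empty factor level, which would otherwise show up as an empty panel or legend entry. A minimal sketch of that behaviour, using a small made-up data frame (the name toy and its values are assumptions for illustration, not part of the Gapminder data):

```r
# Toy data frame standing in for gDat (hypothetical values)
toy <- data.frame(
  continent = factor(c("Africa", "Asia", "Oceania", "Europe", "Oceania")),
  lifeExp   = c(50, 60, 80, 75, 81)
)

# subset() alone removes the rows but keeps the now-empty factor level
kept <- subset(toy, continent != "Oceania")
levels_before <- levels(kept$continent)

# droplevels() discards factor levels that no longer occur in the data
kept <- droplevels(kept)
levels_after <- levels(kept$continent)

print(levels_before)
print(levels_after)
```

Comparing the two printed vectors shows whether the unused level is still present at each step.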

How is gdpPercap changing over time on different continents?

Boxplot

Using library "lattice"

library(lattice)
bwplot(gdpPercap ~ as.factor(year) | continent, iDat, groups = continent, auto.key = list(columns = nlevels(iDat$continent)))


Using library "ggplot2"

library(ggplot2)
ggplot(data = iDat, aes(x = factor(year), y = gdpPercap)) + geom_boxplot(aes(fill = continent), 
    width = 0.5) + facet_wrap(~continent)


Circle Plot

Import the data with the color column

gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderWithColorsAndSorted.txt"
kDat <- read.delim(file = gdURL, as.is = 7)
jCexDivisor <- 1500  # arbitrary scaling constant
jPch <- 21
jDarkGray <- "grey20"
iYear = c(1957, 1977, 2002, 2007)
yDat = subset(kDat, year %in% iYear)

gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderContinentColors.txt"
continentColors <- read.delim(file = gdURL, as.is = 3)  # protect color
continentKey <- with(continentColors, list(x = 0.95, y = 0.05, corner = c(1, 
    0), text = list(as.character(continent)), points = list(pch = jPch, col = jDarkGray, 
    fill = color)))

Using lattice

xyplot(lifeExp ~ gdpPercap | factor(year), yDat, aspect = 2/3, grid = TRUE, 
    scales = list(x = list(log = 10, equispaced.log = FALSE)), cex = sqrt(yDat$pop/pi)/jCexDivisor, 
    fill.color = yDat$color, col = jDarkGray, key = continentKey, panel = function(x, 
        y, ..., cex, fill.color, subscripts) {
        panel.xyplot(x, y, cex = cex[subscripts], pch = jPch, fill = fill.color[subscripts], 
            ...)
    })
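The expression sqrt(pop/pi)/jCexDivisor treats cex as a radius, so the area of each circle, rather than its diameter, scales with population. A small sketch of that arithmetic, using two made-up populations (the values in pop are assumptions for illustration only):

```r
jCexDivisor <- 1500  # same arbitrary scaling constant as in the report

# Hypothetical populations: the second is four times the first
pop <- c(1e6, 4e6)

# Radius-like expansion factor, computed the same way as the cex argument
cex <- sqrt(pop / pi) / jCexDivisor

# Ratio of the radii, and ratio of the areas (area grows with radius squared)
radius_ratio <- cex[2] / cex[1]
area_ratio   <- radius_ratio^2

print(radius_ratio)
print(area_ratio)
```

Because the area ratio matches the population ratio, a country with four times the population is drawn with four times the ink rather than sixteen times.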


Using ggplot2

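# size is mapped to sqrt(pop/pi) so that circle size reflects population, as in the lattice version
ggplot(data = yDat, aes(y = lifeExp, x = gdpPercap, colour = continent, size = sqrt(pop/pi))) + 
    geom_point() + facet_wrap(~year) + scale_x_log10() + aes(shape = continent)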


Look at the spread of GDP per capita within the continents in year 2007.
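Before plotting, the spread within each group can also be summarised numerically. The sketch below is only an illustration on a small made-up data frame (toy and its values are assumptions, not Gapminder figures); the same tapply() calls could be pointed at the 2007 subset instead.

```r
# Toy stand-in for the 2007 subset (hypothetical values, not Gapminder data)
toy <- data.frame(
  continent = factor(rep(c("Africa", "Europe"), each = 4)),
  gdpPercap = c(500, 1000, 2000, 9000, 8000, 20000, 30000, 40000)
)

# Median and interquartile range of gdpPercap within each continent
med_by <- tapply(toy$gdpPercap, toy$continent, median)
iqr_by <- tapply(toy$gdpPercap, toy$continent, IQR)

print(med_by)
print(iqr_by)
```

The interquartile range is used here because it is less sensitive to a few very rich countries than the standard deviation would be.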

Violin Plot

Using library "lattice"

hDat = subset(iDat, year == 2007)
bwplot(gdpPercap ~ continent, hDat, panel = function(..., box.ratio) {
    panel.violin(..., col = "transparent", border = "grey60", varwidth = FALSE, 
        box.ratio = box.ratio)
    panel.bwplot(..., fill = NULL, box.ratio = 0.1)
})


Using ggplot2

ggplot(data = hDat, aes(x = continent, gdpPercap)) + geom_boxplot(colour = I("#3366FF"), 
    outlier.size = 0.1, alpha = 1/10) + geom_violin(trim = TRUE, adjust = 1, 
    aes(fill = continent))


Density Plot

Using lattice

densityplot(~gdpPercap, hDat, plot.points = FALSE, ref = TRUE, group = continent, 
    auto.key = list(columns = nlevels(hDat$continent)))


Using ggplot2

ggplot(data = hDat, aes(x = gdpPercap, group = continent, colour = continent)) + 
    geom_density()


Conclusion: It is easier to use ggplot2 to combine several graphs; however, lattice usually produces more attractive graphs because it draws the picture as a whole.