Yiming Zhang
First, load the Gapminder data and the required packages.
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)
library(lattice)
library(plyr)
library(xtable)
library(ggplot2)
Also drop Oceania, together with its now-unused factor level.
gDat <- droplevels(subset(gDat, continent != "Oceania"))
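To see why droplevels() is wrapped around subset(), here is a minimal sketch on a toy data frame. The data frame `toy` and its values are made up for illustration and are not the Gapminder data.

```r
# Toy data (hypothetical, for illustration only)
toy <- data.frame(
  country   = c("A", "B", "C", "D"),
  continent = factor(c("Asia", "Europe", "Oceania", "Asia"))
)

# subset() removes the rows, but "Oceania" survives as an empty factor level
kept <- subset(toy, continent != "Oceania")
levels(kept$continent)

# droplevels() removes the unused level as well
kept <- droplevels(kept)
levels(kept$continent)
```

Without droplevels(), the empty level would still appear with a count of zero in table(kept$continent).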
Here I use lifeExp as the quantitative variable and year as the categorical variable. For all continents together, we have
ggplot(gDat, aes(x = factor(year), y = lifeExp, colour = year)) + geom_boxplot(aes(fill = factor(year))) +
geom_point()
Then we can show the same plot separately for each continent.
ggplot(gDat, aes(x = factor(year), y = lifeExp, colour = year)) + geom_boxplot(aes(fill = factor(year))) +
geom_point() + facet_grid(~continent)
Notice that the fill colour makes the plot messy, so I drop the fill and leave the boxes plain.
ggplot(gDat, aes(x = factor(year), y = lifeExp, colour = year)) + geom_boxplot() +
geom_point() + facet_grid(~continent)
That's better.
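The line in the middle of each box is the median of lifeExp for that year. As a rough numerical companion to the boxplots, the medians can be computed with base R. This is only a sketch on made-up numbers: the data frame `toy` is hypothetical and does not contain Gapminder values.

```r
# Toy data (hypothetical, for illustration only)
toy <- data.frame(
  year    = c(1952, 1952, 1952, 2007, 2007, 2007),
  lifeExp = c(40, 50, 60, 60, 70, 80)
)

# One median per year, the same quantity a boxplot draws as its middle line
med <- tapply(toy$lifeExp, toy$year, median)
med
```

The same call with gDat$lifeExp and gDat$year should give the medians behind the plots above.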
Now use two quantitative variables, lifeExp and gdpPercap, for the year 2002, and colour the points by continent.
ggplot(subset(gDat, year == 2002), aes(x = gdpPercap, y = lifeExp, colour = continent)) +
geom_point()
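The relationship looks nonlinear: life expectancy rises steeply at low GDP per capita and then flattens out. So let's put the x axis on a log scale. I also map the size of each point to the square root of the population.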
ggplot(subset(gDat, year == 2002), aes(x = gdpPercap, y = lifeExp, colour = continent,
size = sqrt(pop))) + geom_point() + scale_x_log10()
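The effect of the log scale can be illustrated numerically. The sketch below uses made-up numbers (the vectors `gdp` and `life` are hypothetical, built so that life expectancy grows with the logarithm of GDP) and compares the linear correlation before and after taking log10.

```r
# Made-up values: GDP per capita doubles at each step (assumption for illustration)
gdp  <- c(500, 1000, 2000, 4000, 8000, 16000, 32000)
# Made-up relationship: life expectancy is linear in log10(gdp)
life <- 40 + 10 * log10(gdp / 500)

# Correlation on the raw scale versus the log10 scale
cor_raw <- cor(gdp, life)
cor_log <- cor(log10(gdp), life)
c(raw = cor_raw, log10 = cor_log)
```

On the log10 scale the toy points fall on a straight line, which is the pattern scale_x_log10() makes visible in the plot.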
First plot the maximum GDP per capita for each continent in each year with lattice.
GDPbyYear_tall <- ddply(gDat, ~year + continent, summarize, Max = max(gdpPercap))
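The ddply() call above returns one row per year and continent, holding the largest gdpPercap in that group. As a cross-check, roughly the same summary can be sketched in base R with aggregate(). The data frame `toy` below is made up for illustration, and renaming the result column to `Max` is my own choice to match the ddply output.

```r
# Toy data (hypothetical, for illustration only)
toy <- data.frame(
  year      = c(1952, 1952, 1952, 2007, 2007, 2007),
  continent = c("Asia", "Asia", "Europe", "Asia", "Asia", "Europe"),
  gdpPercap = c(300, 900, 5000, 2000, 40000, 35000)
)

# Maximum gdpPercap within each year-by-continent group
maxGdp <- aggregate(gdpPercap ~ year + continent, data = toy, FUN = max)
names(maxGdp)[3] <- "Max"
maxGdp
```

The result is in "tall" format, one row per group, which is the shape both xyplot() and ggplot() expect here.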
xyplot(Max ~ year, GDPbyYear_tall, groups = continent, auto.key = TRUE, type = c("p",
"a"))
Then draw the same plot of maximum GDP per capita with ggplot2.
ggplot(GDPbyYear_tall, aes(x = year, y = Max, colour = continent)) + geom_point() +
geom_line()
To my eye, the ggplot2 version looks better.