STAT545 Homework3

First, let's import the data.

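library(plyr)    # provides ddply() and arrange()
library(xtable)  # provides xtable() for the HTML tables
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL)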

Let's study two variables: continent and GDP per capita.

The first thing we want to know is the minimum and the maximum GDP per capita of each continent.

minandmaxgdpByCont <- ddply(gDat, ~continent, summarize, mingdpPercap = min(gdpPercap), 
    maxgdpPercap = max(gdpPercap))
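
The grouped summary above can also be sketched in base R, which may help show what ddply() with summarize is doing: split the values by group, then apply min() and max() to each piece. This is only a minimal sketch on a toy data frame; the name toy, the continent labels "A" and "B", and all values are made up for illustration and do not come from the Gapminder file.

```r
# Toy data: hypothetical values, NOT taken from the Gapminder file.
toy <- data.frame(
  continent = c("A", "A", "B", "B", "B"),
  gdpPercap = c(100, 300, 50, 900, 400),
  stringsAsFactors = FALSE
)

# Group-wise minimum and maximum using base R only.
mins <- tapply(toy$gdpPercap, toy$continent, min)
maxs <- tapply(toy$gdpPercap, toy$continent, max)

byCont <- data.frame(
  continent    = names(mins),
  mingdpPercap = as.vector(mins),
  maxgdpPercap = as.vector(maxs),
  stringsAsFactors = FALSE
)

# Sort ascending by the minimum, similar to arrange(byCont, mingdpPercap).
byCont <- byCont[order(byCont$mingdpPercap), ]
print(byCont)
```

In this toy example group "B" sorts first because its minimum (50) is lower, even though its maximum (900) is the larger of the two.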

Let's sort by the minimum GDP per capita.

minandmaxgdpByCont <- arrange(minandmaxgdpByCont, mingdpPercap)

And the table is like this:

minandmaxgdpByCont <- xtable(minandmaxgdpByCont)
print(minandmaxgdpByCont, type = "html", include.rownames = TRUE)
   continent mingdpPercap maxgdpPercap
1  Africa          241.17     21951.21
2  Asia            331.00    113523.13
3  Europe          973.53     49357.19
4  Americas       1201.64     42951.65
5  Oceania       10039.60     34435.37

Then, let's sort by the maximum GDP per capita.

minandmaxgdpByCont <- arrange(minandmaxgdpByCont, maxgdpPercap)

And the table is:

print(minandmaxgdpByCont, type = "html", include.rownames = TRUE)
   continent mingdpPercap maxgdpPercap
1  Africa          241.17     21951.21
2  Oceania       10039.60     34435.37
3  Americas       1201.64     42951.65
4  Europe          973.53     49357.19
5  Asia            331.00    113523.13

Comparing the two tables, the continent rankings differ considerably, most of all for Asia: it has the second-lowest minimum GDP per capita but by far the highest maximum. One possible reason is that some countries in Asia are still developing while others are highly developed. Because the summary pools every year in the data set, part of the gap may also reflect change over time.
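
One way to check the explanation above would be to look up which country attains each continent's minimum and maximum. The following is a minimal sketch on a toy data frame, not the Gapminder data: the country labels "c1" to "c5", the continent labels, and all values are hypothetical.

```r
# Toy data: hypothetical countries and values, NOT from the Gapminder file.
toy <- data.frame(
  country   = c("c1", "c2", "c3", "c4", "c5"),
  continent = c("A", "A", "A", "B", "B"),
  gdpPercap = c(100, 5000, 300, 800, 900),
  stringsAsFactors = FALSE
)

# For each continent, report which country has the lowest and highest value,
# plus the max/min ratio as a rough measure of the within-continent gap.
extremes <- do.call(rbind, lapply(split(toy, toy$continent), function(d) {
  data.frame(
    continent  = d$continent[1],
    minCountry = d$country[which.min(d$gdpPercap)],
    maxCountry = d$country[which.max(d$gdpPercap)],
    ratio      = max(d$gdpPercap) / min(d$gdpPercap),
    stringsAsFactors = FALSE
  )
}))
print(extremes)
```

Applied to gDat, the same idea would show whether a continent's extremes come from different countries, different years, or both.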

The next thing we want to know is the spread of GDP per capita within the continents.

spreadgdpByCont <- ddply(gDat, ~continent, summarize, SDgdpPercap = sd(gdpPercap), 
    VARgdpPercap = var(gdpPercap), MEgdpPercap = mad(gdpPercap), IQRgdpPercap = IQR(gdpPercap))

Let's sort by the variance of GDP per capita.

spreadgdpByCont <- arrange(spreadgdpByCont, VARgdpPercap)

And the table is:

spreadgdpByCont <- xtable(spreadgdpByCont)
print(spreadgdpByCont, type = "html", include.rownames = TRUE)
   continent SDgdpPercap VARgdpPercap MEgdpPercap IQRgdpPercap
1  Africa        2827.93   7997187.31      775.32      1616.17
2  Oceania       6358.98  40436668.87     6459.10      8072.26
3  Americas      6396.76  40918591.10     3269.33      4402.43
4  Europe        9355.21  87520019.60     8846.05     13248.30
5  Asia         14045.37 197272505.85     2820.83      7492.26

Now let's sort by the median absolute deviation (MAD) of GDP per capita, stored in MEgdpPercap.

spreadgdpByCont <- arrange(spreadgdpByCont, MEgdpPercap)

And the table is:

print(spreadgdpByCont, type = "html", include.rownames = TRUE)
   continent SDgdpPercap VARgdpPercap MEgdpPercap IQRgdpPercap
1  Africa        2827.93   7997187.31      775.32      1616.17
2  Asia         14045.37 197272505.85     2820.83      7492.26
3  Americas      6396.76  40918591.10     3269.33      4402.43
4  Oceania       6358.98  40436668.87     6459.10      8072.26
5  Europe        9355.21  87520019.60     8846.05     13248.30

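The continent rankings in these two tables are also very different. Note that MEgdpPercap is computed with mad(), so it is the median absolute deviation, a robust measure of spread, not the median. Asia has by far the highest variance of GDP per capita but the second-lowest MAD. This is consistent with a small number of extreme values inflating the variance while most observations stay relatively close to the continent's median.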
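
The contrast between variance and MAD can be illustrated with a small sketch. The two vectors below are made-up numbers, not Gapminder values: they are identical except that one value in y is extreme.

```r
# Two hypothetical samples that differ only in their largest value.
x <- c(1, 2, 3, 4, 5)
y <- c(1, 2, 3, 4, 500)

# Standard deviation reacts strongly to the single extreme value ...
sdX <- sd(x)
sdY <- sd(y)

# ... while the median absolute deviation is unchanged, because the
# median of the absolute deviations from the median is the same in both.
madX <- mad(x)
madY <- mad(y)

print(c(sdX = sdX, sdY = sdY, madX = madX, madY = madY))
```

Here sd(y) is far larger than sd(x), while mad(x) and mad(y) are equal. A continent with a few very rich observations and many poor ones would be expected to show the same pattern.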