STAT 545A Homework #2

Yiming Zhang

In this assignment, we will explore some basic features of the Gapminder data.

First, we load the Gapminder data and the libraries we need (in this case, only the lattice library).

gDat <- read.delim("gapminderDataFiveYear.txt")
library(lattice)

Now the data has been successfully imported. Let's get some basic information from it.

str(gDat)
## 'data.frame':    1704 obs. of  6 variables:
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ pop      : num  8425333 9240934 10267083 11537966 13079460 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ gdpPercap: num  779 821 853 836 740 ...

We can see there are 1704 observations and 6 variables.
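The same counts can also be checked programmatically rather than read off the str() output. The following is a minimal sketch; describe_shape is a hypothetical helper introduced here for illustration, not part of the original assignment.

```r
# Hypothetical helper: report the number of observations, the number of
# variables, and the class of each column of a data frame.
describe_shape <- function(dat) {
  list(
    n_obs   = nrow(dat),
    n_vars  = ncol(dat),
    classes = sapply(dat, class)
  )
}

# Usage with the data loaded above (runs only if gDat exists in the session).
if (exists("gDat")) {
  print(describe_shape(gDat))
}
```

Note that in R 4.0.0 and later, read.delim() no longer converts strings to factors by default, so country and continent may be reported as character rather than Factor unless stringsAsFactors = TRUE is passed when reading the file.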

We can also get a more statistical overview of the data.

summary(gDat)

Now let's focus on one country, for example China. We want to see how life expectancy in China changed from 1952 to 2007, which is easiest to view in graphical form.

xyplot(lifeExp ~ year, gDat, subset = country == "China", type = c("p", "r"))

[Figure: lifeExp versus year for China, shown as points with a fitted regression line]

We can see that in 1962 there is a big drop in life expectancy.
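The drop can also be located numerically instead of by eye. This is a minimal sketch, assuming the data frame has the country, year, and lifeExp columns shown earlier; largest_drop is a hypothetical helper name, not something from the original assignment.

```r
# Hypothetical helper: for one country, find the observation whose lifeExp
# fell the most relative to the previous observation.
largest_drop <- function(dat, country_name) {
  one <- dat[dat$country == country_name, c("year", "lifeExp")]
  one <- one[order(one$year), ]            # make sure years are in order
  one$change <- c(NA, diff(one$lifeExp))   # change since the previous year
  one[which.min(one$change), ]             # most negative change
}

# Usage with the data loaded above (runs only if gDat exists in the session).
if (exists("gDat")) {
  print(largest_drop(gDat, "China"))
}
```

Because the data are recorded every five years, the change column compares each value with the one five years earlier, so it shows which five-year interval contains the drop rather than the exact year it began.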