Assignment 2

Instructions

copied from http://www.stat.ubc.ca/~jenny/STAT545A/hw02_rmarkdownGapminder.html

Determine and report basic facts like the number of observations and which variables are there. Make at least one figure. Report some very basic descriptive statistics, such as results from summary().

Examples

These are the things I'm trying to do with the Gapminder dataset:

Look at the data:

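gDat <- read.table("gapminderDataFiveYear.txt", sep = "\t", quote = "\"", header = TRUE)
# Show a random sample of rows, kept in their original order
peek <- function(data, size = 6) {
    # sample() draws distinct integer row indices; runif() would return fractions
    randRows <- sample(nrow(data), min(size, nrow(data)))
    sampleDat <- data[sort(randRows), ]
    return(sampleDat)
}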

peek(gDat)
##         country year      pop continent lifeExp gdpPercap
## 90      Bahrain 1977   297410      Asia   65.59   19340.1
## 100  Bangladesh 1967 62821884      Asia   43.45     721.2
## 1182     Panama 1977  1839782  Americas   68.68    5351.9
## 1191   Paraguay 1962  2009813  Americas   64.36    2148.0
## 1269    Reunion 1992   622191    Africa   73.61    6101.3
## 1695   Zimbabwe 1962  4277736    Africa   52.36     527.3
tail(gDat)
##       country year      pop continent lifeExp gdpPercap
## 1699 Zimbabwe 1982  7636524    Africa   60.36     788.9
## 1700 Zimbabwe 1987  9216418    Africa   62.35     706.2
## 1701 Zimbabwe 1992 10704340    Africa   60.38     693.4
## 1702 Zimbabwe 1997 11404948    Africa   46.81     792.4
## 1703 Zimbabwe 2002 11926563    Africa   39.99     672.0
## 1704 Zimbabwe 2007 12311143    Africa   43.49     469.7
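
The rows returned by peek() change on every call, which makes a write-up hard to reproduce. Below is a minimal sketch of a repeatable variant, assuming a fixed seed is acceptable for the report. The helper name peekSeeded and the seed value are hypothetical, and the toy data frame holds only rows copied from the output above, not the full file.

```r
# Toy data frame built from rows printed in the output above (not the full file).
toy <- data.frame(
  country   = c("Bahrain", "Bangladesh", "Panama", "Paraguay", "Reunion", "Zimbabwe"),
  year      = c(1977, 1967, 1977, 1962, 1992, 1962),
  continent = c("Asia", "Asia", "Americas", "Americas", "Africa", "Africa"),
  lifeExp   = c(65.59, 43.45, 68.68, 64.36, 73.61, 52.36)
)

# peekSeeded is a hypothetical helper name, not part of the original assignment.
peekSeeded <- function(data, size = 6, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)          # fix the RNG so the draw can be repeated
  size <- min(size, nrow(data))               # never ask for more rows than exist
  rows <- sort(sample.int(nrow(data), size))  # distinct integer row indices, in order
  data[rows, , drop = FALSE]
}

first  <- peekSeeded(toy, size = 3, seed = 545)
second <- peekSeeded(toy, size = 3, seed = 545)
identical(first, second)  # the same seed gives the same rows
```

With the full data, peekSeeded(gDat, seed = 545) would be called the same way.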

Basic facts about the data:

str(gDat)
## 'data.frame':    1704 obs. of  6 variables:
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ pop      : num  8425333 9240934 10267083 11537966 13079460 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ gdpPercap: num  779 821 853 836 740 ...
summary(gDat)
##         country          year           pop              continent  
##  Afghanistan:  12   Min.   :1952   Min.   :6.00e+04   Africa  :624  
##  Albania    :  12   1st Qu.:1966   1st Qu.:2.79e+06   Americas:300  
##  Algeria    :  12   Median :1980   Median :7.02e+06   Asia    :396  
##  Angola     :  12   Mean   :1980   Mean   :2.96e+07   Europe  :360  
##  Argentina  :  12   3rd Qu.:1993   3rd Qu.:1.96e+07   Oceania : 24  
##  Australia  :  12   Max.   :2007   Max.   :1.32e+09                 
##  (Other)    :1632                                                   
##     lifeExp       gdpPercap     
##  Min.   :23.6   Min.   :   241  
##  1st Qu.:48.2   1st Qu.:  1202  
##  Median :60.7   Median :  3532  
##  Mean   :59.5   Mean   :  7215  
##  3rd Qu.:70.8   3rd Qu.:  9325  
##  Max.   :82.6   Max.   :113523  
## 
colnames(gDat)
## [1] "country"   "year"      "pop"       "continent" "lifeExp"   "gdpPercap"
dim(gDat)
## [1] 1704    6
nrow(gDat)
## [1] 1704
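
The counts in the summary() output can be cross-checked against each other. The sketch below uses only numbers copied from the output above and assumes that every country contributes one row per five-year step from 1952 to 2007, which is what the per-country counts of 12 suggest.

```r
# Observations per continent, copied from the summary() output above.
obsPerContinent <- c(Africa = 624, Americas = 300, Asia = 396,
                     Europe = 360, Oceania = 24)

# Assumption: each country has one row per five-year step, 1952 to 2007.
yearsPerCountry <- length(seq(1952, 2007, by = 5))

# Dividing rows by rows-per-country gives the number of countries per continent.
countriesPerContinent <- obsPerContinent / yearsPerCountry
countriesPerContinent

sum(obsPerContinent)        # should match nrow(gDat)
sum(countriesPerContinent)  # should match the number of country levels in str(gDat)
```

Both totals agree with the str() output, so no country appears to be missing a year.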

Differences in life expectancy across continents in 2007:

library(lattice)
bwplot(~lifeExp | continent, data = gDat, subset = year == 2007, layout = c(1, 
    5))

[Figure: box-and-whisker plots of lifeExp in 2007, one panel per continent]
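
A box plot is easier to discuss next to the numbers it draws. The following is a rough sketch of a per-continent median for a single year. The helper name medianLifeExp is hypothetical, and the toy rows are copied from the output printed earlier, so the example uses 1962 rather than 2007.

```r
# Toy rows copied from the printed output above; the full gDat has 1704 rows.
toy <- data.frame(
  country   = c("Bahrain", "Panama", "Paraguay", "Zimbabwe"),
  year      = c(1977, 1977, 1962, 1962),
  continent = c("Asia", "Americas", "Americas", "Africa"),
  lifeExp   = c(65.59, 68.68, 64.36, 52.36)
)

# medianLifeExp is a hypothetical helper name introduced for this sketch.
medianLifeExp <- function(data, yr) {
  oneYear <- data[data$year == yr, ]
  # droplevels() removes continents that have no rows in the chosen year
  tapply(oneYear$lifeExp, droplevels(factor(oneYear$continent)), median)
}

res <- medianLifeExp(toy, 1962)
res
```

On the full data, medianLifeExp(gDat, 2007) should give the centre line of each box in the figure.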

The relationship between life expectancy and GDP per capita, and how countries on different continents fared in 2007:

xyplot(lifeExp ~ gdpPercap, data = gDat, subset = year == 2007, groups = continent, 
    auto.key = TRUE)

[Figure: scatter plot of lifeExp against gdpPercap in 2007, points grouped by continent]
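
The summary() output shows gdpPercap ranging from 241 to 113523, so on a linear axis most points crowd toward the left. One possible variant, sketched here on a toy data frame of rows copied from the output above (mixed years, not the 2007 subset), puts the x axis on a base-10 log scale through lattice's scales argument.

```r
library(lattice)

# Toy rows copied from the peek() output above; years are mixed, so this only
# illustrates the plotting call, not the 2007 pattern.
toy <- data.frame(
  country   = c("Bahrain", "Bangladesh", "Panama", "Paraguay", "Reunion", "Zimbabwe"),
  continent = factor(c("Asia", "Asia", "Americas", "Americas", "Africa", "Africa")),
  lifeExp   = c(65.59, 43.45, 68.68, 64.36, 73.61, 52.36),
  gdpPercap = c(19340.1, 721.2, 5351.9, 2148.0, 6101.3, 527.3)
)

# scales = list(x = list(log = 10)) requests a base-10 log scale on the x axis only.
p <- xyplot(lifeExp ~ gdpPercap, data = toy, groups = continent,
            auto.key = TRUE,
            scales = list(x = list(log = 10)))

# print(p) draws the figure on the active graphics device.
```

On the full data, the same scales argument can be added to the xyplot() call above.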