Daniel Dinsdale

Homework 2-STAT 545A.

Plan:

Importing Data:

First, we import the Gapminder data:

gDat <- read.delim("gapminderDataFiveYear.txt")
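To see what read.delim() does with a file of this shape, the sketch below reads a small inline sample instead of the real file. The rows and the names Xland and Yland are made up for illustration and are not Gapminder values. One point to be aware of: from R 4.0.0 onward, text columns are read as character vectors by default, so passing stringsAsFactors = TRUE is one way to obtain factor columns such as country and continent.

```r
# Illustrative tab-delimited sample; these rows are made up, NOT Gapminder values.
sample_text <- paste(
  "country\tyear\tpop\tcontinent\tlifeExp\tgdpPercap",
  "Xland\t1952\t1000\tEurope\t60.5\t5000.0",
  "Xland\t1957\t1100\tEurope\t61.0\t5500.0",
  "Yland\t1952\t2000\tAsia\t50.2\t900.0",
  sep = "\n"
)

# read.delim() expects a header row and tab-separated fields by default.
# stringsAsFactors = TRUE turns the text columns into factors.
toy <- read.delim(text = sample_text, stringsAsFactors = TRUE)

head(toy)            # first rows, to confirm the columns were split correctly
sapply(toy, class)   # class of each column
```

The same stringsAsFactors argument could be added to the read.delim() call on the real file if factor columns are wanted.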

Data set information:

Now that the data have been imported, we can count the observations in the data set:

nrow(gDat)
## [1] 1704

This tells us we have 1704 observations.
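nrow() reports only the row count. As a minimal sketch, dim(), ncol() and str() give complementary views of a data frame's size and layout. The small data frame below is a made-up stand-in for gDat, so its values are illustrative only.

```r
# Made-up stand-in for gDat; values are illustrative only.
toy <- data.frame(
  country = factor(c("Xland", "Xland", "Yland")),
  year    = c(1952L, 1957L, 1952L),
  lifeExp = c(60.5, 61.0, 50.2)
)

dim(toy)    # number of rows and number of columns together
ncol(toy)   # number of variables
str(toy)    # type and first few values of each variable
```

Applied to gDat, the first element of dim(gDat) should agree with nrow(gDat).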

We can also list the variable names and summarize each variable:

colnames(gDat)
## [1] "country"   "year"      "pop"       "continent" "lifeExp"   "gdpPercap"
summary(gDat)
##         country          year           pop              continent  
##  Afghanistan:  12   Min.   :1952   Min.   :6.00e+04   Africa  :624  
##  Albania    :  12   1st Qu.:1966   1st Qu.:2.79e+06   Americas:300  
##  Algeria    :  12   Median :1980   Median :7.02e+06   Asia    :396  
##  Angola     :  12   Mean   :1980   Mean   :2.96e+07   Europe  :360  
##  Argentina  :  12   3rd Qu.:1993   3rd Qu.:1.96e+07   Oceania : 24  
##  Australia  :  12   Max.   :2007   Max.   :1.32e+09                 
##  (Other)    :1632                                                   
##     lifeExp       gdpPercap     
##  Min.   :23.6   Min.   :   241  
##  1st Qu.:48.2   1st Qu.:  1202  
##  Median :60.7   Median :  3532  
##  Mean   :59.5   Mean   :  7215  
##  3rd Qu.:70.8   3rd Qu.:  9325  
##  Max.   :82.6   Max.   :113523  
## 
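The continent counts in the summary can be cross-checked against the row count with a little arithmetic. The sketch below copies the counts from the output above. It assumes that every country contributes 12 rows (one per five-year step from 1952 to 2007), which the summary suggests but only shows for the first six countries.

```r
# Continent row counts copied from the summary() output above.
continent_rows <- c(Africa = 624, Americas = 300, Asia = 396,
                    Europe = 360, Oceania = 24)

# The counts should add up to the number of rows reported by nrow(gDat).
sum(continent_rows)

# Assumption: every country has 12 rows (1952, 1957, ..., 2007).
rows_per_country <- length(seq(1952, 2007, by = 5))

sum(continent_rows) / rows_per_country   # implied number of countries
continent_rows / rows_per_country        # implied countries per continent
```

If the assumption holds, the total of 1704 rows corresponds to 142 countries.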

Graphical insight into data:

We can also view the data graphically, for example with a density plot of life expectancy in Canada:

library(lattice)
densityplot(~lifeExp, gDat, subset = country == "Canada")

[Figure: density plot of life expectancy (lifeExp) for Canada]
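The call above can be varied with a few optional arguments. The following is a sketch on made-up data (the countries Xland and Yland and their values are hypothetical) that adds axis labels and draws the observations as a rug instead of points.

```r
library(lattice)

set.seed(1)  # reproducible made-up values, NOT Gapminder data
toy <- data.frame(
  country = rep(c("Xland", "Yland"), each = 12),
  lifeExp = c(rnorm(12, mean = 70, sd = 3), rnorm(12, mean = 55, sd = 5))
)

# Same call shape as the Canada plot; `subset` is evaluated inside `toy`.
p <- densityplot(~lifeExp, data = toy, subset = country == "Xland",
                 plot.points = "rug",               # mark observations along the axis
                 xlab = "Life expectancy (years)",
                 main = "Xland (illustrative data)")

class(p)  # lattice functions return a "trellis" object
# print(p) draws the plot; at the interactive console this happens automatically.
```

With only 12 observations per country, as in the Gapminder data, the shape of the estimated density depends noticeably on the bandwidth, so it is best read as a rough guide.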