Daniel Dinsdale
First, we import the Gapminder data:
gDat <- read.delim("gapminderDataFiveYear.txt")
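As a side note on the import step, here is a minimal sketch of what `read.delim` does, using a small temporary file rather than the Gapminder file. The file contents and the names `tmp`, `viaDelim`, `viaTable` and `asFactor` are made up for illustration. It also shows one thing worth knowing: since R 4.0.0, text columns are read as character rather than factor by default, so per-country counts like those in the `summary` output in this report would need `stringsAsFactors = TRUE` on a recent R version.

```r
# Write a tiny tab-delimited file; the values are made up for illustration only.
tmp <- tempfile(fileext = ".txt")
writeLines(c("country\tyear\tvalue",
             "A\t1952\t1.5",
             "B\t1957\t2.5"), tmp)

# read.delim() is read.table() with defaults suited to tab-delimited files.
viaDelim <- read.delim(tmp)
viaTable <- read.table(tmp, header = TRUE, sep = "\t", quote = "\"",
                       dec = ".", fill = TRUE, comment.char = "")
identical(viaDelim, viaTable)

# Since R 4.0.0 strings stay character by default; asking for factors
# makes summary() report counts per level instead.
asFactor <- read.delim(tmp, stringsAsFactors = TRUE)
class(viaDelim$country)
class(asFactor$country)

# Remove the temporary file.
unlink(tmp)
```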
Now that the data has been imported, we can count the number of observations in the dataset:
nrow(gDat)
## [1] 1704
This tells us we have 1704 observations.
We can also list the column (variable) names and then summarize each variable:
colnames(gDat)
## [1] "country" "year" "pop" "continent" "lifeExp" "gdpPercap"
summary(gDat)
##         country          year           pop              continent
##  Afghanistan:  12   Min.   :1952   Min.   :6.00e+04   Africa  :624
##  Albania    :  12   1st Qu.:1966   1st Qu.:2.79e+06   Americas:300
##  Algeria    :  12   Median :1980   Median :7.02e+06   Asia    :396
##  Angola     :  12   Mean   :1980   Mean   :2.96e+07   Europe  :360
##  Argentina  :  12   3rd Qu.:1993   3rd Qu.:1.96e+07   Oceania : 24
##  Australia  :  12   Max.   :2007   Max.   :1.32e+09
##  (Other)    :1632
##     lifeExp       gdpPercap
##  Min.   :23.6   Min.   :   241
##  1st Qu.:48.2   1st Qu.:  1202
##  Median :60.7   Median :  3532
##  Mean   :59.5   Mean   :  7215
##  3rd Qu.:70.8   3rd Qu.:  9325
##  Max.   :82.6   Max.   :113523
##
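Note that the continent counts in the summary are numbers of rows, not numbers of countries. As a rough check, the sketch below divides them by 12. It assumes every country contributes exactly 12 rows, as the six countries listed individually do. The summary does not confirm this for the countries grouped under "(Other)", so treat the result as an estimate.

```r
# Row counts per continent, copied from the summary() output above.
rowsPerContinent <- c(Africa = 624, Americas = 300, Asia = 396,
                      Europe = 360, Oceania = 24)

# Assumption: every country has exactly 12 rows (one per surveyed year).
rowsPerCountry <- 12

# Implied number of countries per continent under that assumption.
countriesPerContinent <- rowsPerContinent / rowsPerCountry
countriesPerContinent

# The continent row counts should add up to nrow(gDat), i.e. 1704.
sum(rowsPerContinent)
```

With the data loaded, `length(unique(gDat$country))` would check the implied total number of countries directly.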
We can also explore the data graphically, for example with a density plot of life expectancy in Canada:
library(lattice)
densityplot(~lifeExp, gDat, subset = country == "Canada")
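To show how the `subset` argument works, here is a small sketch on synthetic data, so that it runs without the Gapminder file. The data frame `toy` and its values are random stand-ins, not real life expectancies. Filtering inside the call and filtering the data frame beforehand should give the same plot.

```r
library(lattice)

# Synthetic stand-in for gDat: values are random, not real life expectancies.
set.seed(1)
toy <- data.frame(country = rep(c("A", "B"), each = 12),
                  lifeExp = c(rnorm(12, mean = 70, sd = 5),
                              rnorm(12, mean = 50, sd = 5)))

# Filter inside the call, as in the Canada example ...
p1 <- densityplot(~lifeExp, toy, subset = country == "A")

# ... or filter the data frame first; both plots use the same 12 values.
p2 <- densityplot(~lifeExp, toy[toy$country == "A", ])

# lattice plots are objects; they are drawn when printed.
print(p1)
```

With only 12 observations per country, the estimated density is fairly rough, so the shape of the curve should be read with some caution.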