In this report, I look at the Gapminder data set provided by Jenny for our STAT 545A (2013W) class.
The purpose of this report is to practice using R Markdown to generate reports and using the ggplot2 package to generate plots.
We begin by reading in the tab-delimited data set using the read.delim function in R.
gDat <- read.delim(file = "gapminderDataFiveYear.txt")
It is always a good idea to check the structure of the data using the str function.
str(gDat)
## 'data.frame':   1704 obs. of  6 variables:
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ pop      : num  8425333 9240934 10267083 11537966 13079460 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ gdpPercap: num  779 821 853 836 740 ...
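The str output above shows country and continent as factors. That matches the defaults of the R version used for this report; since R 4.0.0, read.delim keeps text columns as character unless stringsAsFactors = TRUE is passed. The sketch below illustrates this and shows that read.delim is read.table with tab-friendly defaults. It does not use the real data file: it writes a tiny stand-in file containing the first three Afghanistan rows as displayed (rounded) above, and the names tmp, viaDelim and viaTable are made up for the example.

```r
# Write a tiny tab-delimited stand-in file so the example runs without
# gapminderDataFiveYear.txt. Values are the rounded ones shown by str() above.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "country\tyear\tpop\tcontinent\tlifeExp\tgdpPercap",
  "Afghanistan\t1952\t8425333\tAsia\t28.8\t779",
  "Afghanistan\t1957\t9240934\tAsia\t30.3\t821",
  "Afghanistan\t1962\t10267083\tAsia\t32\t853"
), tmp)

# stringsAsFactors = TRUE reproduces the Factor columns on R >= 4.0.0
viaDelim <- read.delim(file = tmp, stringsAsFactors = TRUE)

# The same read spelled out with read.table(): read.delim() only changes defaults
viaTable <- read.table(file = tmp, header = TRUE, sep = "\t", quote = "\"",
                       dec = ".", fill = TRUE, comment.char = "",
                       stringsAsFactors = TRUE)

identical(viaDelim, viaTable)  # both calls should yield the same data frame
str(viaDelim)
```

On the real file, the equivalent call would presumably be read.delim(file = "gapminderDataFiveYear.txt", stringsAsFactors = TRUE).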
We can also use the summary function to summarize a given R object. In our case, since we pass it a data.frame object, we receive summary statistics for each variable in our data set.
summary(gDat)
##         country          year           pop              continent
##  Afghanistan:  12   Min.   :1952   Min.   :6.00e+04   Africa  :624
##  Albania    :  12   1st Qu.:1966   1st Qu.:2.79e+06   Americas:300
##  Algeria    :  12   Median :1980   Median :7.02e+06   Asia    :396
##  Angola     :  12   Mean   :1980   Mean   :2.96e+07   Europe  :360
##  Argentina  :  12   3rd Qu.:1993   3rd Qu.:1.96e+07   Oceania : 24
##  Australia  :  12   Max.   :2007   Max.   :1.32e+09
##  (Other)    :1632
##     lifeExp       gdpPercap
##  Min.   :23.6   Min.   :   241
##  1st Qu.:48.2   1st Qu.:  1202
##  Median :60.7   Median :  3532
##  Mean   :59.5   Mean   :  7215
##  3rd Qu.:70.8   3rd Qu.:  9325
##  Max.   :82.6   Max.   :113523
##
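The output above mixes two kinds of summary: counts for the factor columns and quantiles plus the mean for the numeric ones. This is because summary is a generic function whose result depends on the class of its argument. A minimal sketch of that behaviour follows; the vectors are stand-ins invented for the example (the lifeExp values are the rounded ones shown by str earlier, the continent labels are hypothetical), not the real data.

```r
# Stand-in vectors, not the real gDat columns
lifeExp   <- c(28.8, 30.3, 32, 34, 36.1)
continent <- factor(c("Asia", "Asia", "Europe", "Africa", "Asia"))  # hypothetical labels

# Numeric vector: minimum, quartiles, mean and maximum
numSummary <- summary(lifeExp)

# Factor: number of observations per level
facSummary <- summary(continent)

# Data frame: the per-column summaries arranged side by side in a table
dfSummary <- summary(data.frame(lifeExp, continent))

numSummary
facSummary
dfSummary
```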
In an attempt to learn how to make plots with the ggplot2 package, I try to recreate the first 6 of Jenny's lattice plots from the lecture 2 “Work through” exercise (Basic care and feeding of data in R).
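As a rough illustration of what such a translation can look like, here is a minimal sketch of a ggplot2 scatterplot standing in for a lattice call of the form xyplot(lifeExp ~ gdpPercap, data = ...). Which plots the lecture actually contains is not shown in this report, so this particular pairing is an assumption. The data frame gDatSmall is a made-up stand-in holding the first five Afghanistan rows as displayed (rounded) by str above, and the ggplot2 package is assumed to be installed.

```r
library(ggplot2)

# Stand-in for gDat: first five Afghanistan rows, rounded as printed by str()
gDatSmall <- data.frame(
  year      = c(1952, 1957, 1962, 1967, 1972),
  lifeExp   = c(28.8, 30.3, 32, 34, 36.1),
  gdpPercap = c(779, 821, 853, 836, 740)
)

# Rough ggplot2 counterpart of lattice's xyplot(lifeExp ~ gdpPercap, data = gDatSmall):
# aes() maps variables to axes, geom_point() draws one point per row
p <- ggplot(gDatSmall, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10() +  # log scale, a common choice for GDP per capita
  labs(x = "GDP per capita (log scale)", y = "Life expectancy")
```

Calling print(p), or typing p at the console, draws the plot. Replacing gDatSmall with the full gDat and adding facet_wrap(~ continent) would give one panel per continent, similar in spirit to lattice's conditioning with `|`.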
For the code used to generate this report, click here