STAT 545A Homework 2, Xinxin Xue
First, import the data and look at what information it contains.
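# lattice provides xyplot(), used for the plots below
library(lattice)
# URL of the tab-delimited Gapminder data (five-year intervals)
gap <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
# stringsAsFactors = TRUE keeps country and continent as factors, matching the
# output shown here (the default became FALSE in R 4.0.0)
life <- read.delim(file = gap, stringsAsFactors = TRUE)
head(life)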
## country year pop continent lifeExp gdpPercap
## 1 Afghanistan 1952 8425333 Asia 28.80 779.4
## 2 Afghanistan 1957 9240934 Asia 30.33 820.9
## 3 Afghanistan 1962 10267083 Asia 32.00 853.1
## 4 Afghanistan 1967 11537966 Asia 34.02 836.2
## 5 Afghanistan 1972 13079460 Asia 36.09 740.0
## 6 Afghanistan 1977 14880372 Asia 38.44 786.1
Next, check the basic contents of the dataset: which countries, continents, and years it covers.
unique(life$country)
## [1] Afghanistan Albania
## [3] Algeria Angola
## [5] Argentina Australia
## [7] Austria Bahrain
## [9] Bangladesh Belgium
## [11] Benin Bolivia
## [13] Bosnia and Herzegovina Botswana
## [15] Brazil Bulgaria
## [17] Burkina Faso Burundi
## [19] Cambodia Cameroon
## [21] Canada Central African Republic
## [23] Chad Chile
## [25] China Colombia
## [27] Comoros Congo, Dem. Rep.
## [29] Congo, Rep. Costa Rica
## [31] Cote d'Ivoire Croatia
## [33] Cuba Czech Republic
## [35] Denmark Djibouti
## [37] Dominican Republic Ecuador
## [39] Egypt El Salvador
## [41] Equatorial Guinea Eritrea
## [43] Ethiopia Finland
## [45] France Gabon
## [47] Gambia Germany
## [49] Ghana Greece
## [51] Guatemala Guinea
## [53] Guinea-Bissau Haiti
## [55] Honduras Hong Kong, China
## [57] Hungary Iceland
## [59] India Indonesia
## [61] Iran Iraq
## [63] Ireland Israel
## [65] Italy Jamaica
## [67] Japan Jordan
## [69] Kenya Korea, Dem. Rep.
## [71] Korea, Rep. Kuwait
## [73] Lebanon Lesotho
## [75] Liberia Libya
## [77] Madagascar Malawi
## [79] Malaysia Mali
## [81] Mauritania Mauritius
## [83] Mexico Mongolia
## [85] Montenegro Morocco
## [87] Mozambique Myanmar
## [89] Namibia Nepal
## [91] Netherlands New Zealand
## [93] Nicaragua Niger
## [95] Nigeria Norway
## [97] Oman Pakistan
## [99] Panama Paraguay
## [101] Peru Philippines
## [103] Poland Portugal
## [105] Puerto Rico Reunion
## [107] Romania Rwanda
## [109] Sao Tome and Principe Saudi Arabia
## [111] Senegal Serbia
## [113] Sierra Leone Singapore
## [115] Slovak Republic Slovenia
## [117] Somalia South Africa
## [119] Spain Sri Lanka
## [121] Sudan Swaziland
## [123] Sweden Switzerland
## [125] Syria Taiwan
## [127] Tanzania Thailand
## [129] Togo Trinidad and Tobago
## [131] Tunisia Turkey
## [133] Uganda United Kingdom
## [135] United States Uruguay
## [137] Venezuela Vietnam
## [139] West Bank and Gaza Yemen, Rep.
## [141] Zambia Zimbabwe
## 142 Levels: Afghanistan Albania Algeria Angola Argentina ... Zimbabwe
unique(life$continent)
## [1] Asia Europe Africa Americas Oceania
## Levels: Africa Americas Asia Europe Oceania
unique(life$year)
## [1] 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 2002 2007
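The full `unique()` listings are long, so a count of distinct values per column can serve as a quicker first check. The sketch below is not part of the original analysis: `count_distinct` is a hypothetical helper, and `toy` is only an illustrative subset built from the first three rows printed by `head(life)`.

```r
# Hypothetical helper (not from the original analysis): number of distinct
# values in each column of a data frame.
count_distinct <- function(df) {
  vapply(df, function(x) length(unique(x)), integer(1))
}

# Illustrative subset: the first three rows printed by head(life) above.
toy <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Afghanistan"),
  year      = c(1952, 1957, 1962),
  continent = c("Asia", "Asia", "Asia"),
  lifeExp   = c(28.80, 30.33, 32.00)
)
print(count_distinct(toy))
```

Assuming `life` has been loaded as above, `count_distinct(life)` should agree with the listings: 142 countries, 5 continents, and 12 years.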
Now look for patterns in life expectancy by continent, first against population and then over time.
xyplot(lifeExp ~ pop | continent, life)
xyplot(lifeExp ~ year | continent, life)
What is the outlier in Africa, and is the highest life expectancy in a Scandinavian country? What are the global statistics?
# Row(s) with the lowest life expectancy; subset() evaluates column names inside life
subset(life, lifeExp == min(lifeExp))
## country year pop continent lifeExp gdpPercap
## 1293 Rwanda 1992 7290203 Africa 23.6 737.1
# Row(s) with the highest life expectancy
subset(life, lifeExp == max(lifeExp))
## country year pop continent lifeExp gdpPercap
## 804 Japan 2007 127467972 Asia 82.6 31656
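The overall maximum falls in 2007, the last survey year, so it says nothing about which country led in earlier years. One way to check the Scandinavian question year by year is sketched below. `top_by_year` is a hypothetical helper, not part of the original analysis, and it assumes the data frame has `year` and `lifeExp` columns as in `life`.

```r
# Hypothetical helper: for each survey year, return the row with the highest
# value of `col`. which.max() keeps only the first row when there is a tie.
top_by_year <- function(df, col = "lifeExp") {
  pieces <- split(df, df$year)  # one data frame per year, in ascending order
  best <- lapply(pieces, function(d) d[which.max(d[[col]]), , drop = FALSE])
  out <- do.call(rbind, best)
  rownames(out) <- NULL
  out
}
```

Assuming `life` is loaded, `top_by_year(life)[, c("year", "country", "lifeExp")]` would list the leading country for each of the 12 survey years.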
summary(life)
## country year pop continent
## Afghanistan: 12 Min. :1952 Min. :6.00e+04 Africa :624
## Albania : 12 1st Qu.:1966 1st Qu.:2.79e+06 Americas:300
## Algeria : 12 Median :1980 Median :7.02e+06 Asia :396
## Angola : 12 Mean :1980 Mean :2.96e+07 Europe :360
## Argentina : 12 3rd Qu.:1993 3rd Qu.:1.96e+07 Oceania : 24
## Australia : 12 Max. :2007 Max. :1.32e+09
## (Other) :1632
## lifeExp gdpPercap
## Min. :23.6 Min. : 241
## 1st Qu.:48.2 1st Qu.: 1202
## Median :60.7 Median : 3532
## Mean :59.5 Mean : 7215
## 3rd Qu.:70.8 3rd Qu.: 9325
## Max. :82.6 Max. :113523
##
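`summary(life)` pools all continents together, while the plots above are split by continent. A per-continent version of the same statistics might look like the sketch below. `summarise_by` is a hypothetical helper, not part of the original analysis, and its default column names are assumptions taken from `life`.

```r
# Hypothetical helper: minimum, median and maximum of a numeric column
# within each group of a data frame.
summarise_by <- function(df, value = "lifeExp", group = "continent") {
  # drop = TRUE skips factor levels that have no rows
  groups <- split(df[[value]], df[[group]], drop = TRUE)
  stats <- sapply(groups, function(x) {
    c(min = min(x), median = median(x), max = max(x))
  })
  t(stats)  # one row per group, one column per statistic
}
```

Assuming `life` is loaded, `summarise_by(life)` returns one row per continent. The smallest `min` and largest `max` across its rows should match the global 23.6 and 82.6 reported by `summary(life)`.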
Asia shows an interesting pattern, so plot life expectancy over time for each Asian country.
xyplot(lifeExp ~ year | country, subset(life, continent == "Asia"))
Life expectancy drops sharply in Cambodia in the 1970s and in China around 1960, due to the Khmer Rouge genocide and the famine caused by the Great Leap Forward, respectively.
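The drops read off the plot can also be located numerically by differencing life expectancy between consecutive survey years within each country. This is a minimal sketch rather than part of the original analysis: `largest_drops` is a hypothetical helper, and it assumes `country`, `year`, and `lifeExp` columns as in `life`.

```r
# Hypothetical helper: change in lifeExp between consecutive survey years
# within each country, with the steepest declines listed first.
largest_drops <- function(df, n = 5) {
  df <- df[order(df$country, df$year), ]
  # ave() applies the function within each country and returns a vector
  # aligned with the rows of df; the first year of each country gets NA.
  # as.character() avoids empty groups from unused factor levels.
  df$change <- ave(df$lifeExp, as.character(df$country),
                   FUN = function(x) c(NA, diff(x)))
  drops <- df[!is.na(df$change) & df$change < 0, ]
  head(drops[order(drops$change), c("country", "year", "lifeExp", "change")], n)
}
```

Assuming `life` is loaded, `largest_drops(subset(life, continent == "Asia"))` lists the steepest five-year declines in Asia. If the reading of the plot is right, Cambodia and China would be expected near the top.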