STAT545a - HW #3

Mina Park

In this exercise, we work with the plyr package, which lets us split, apply, and combine data in R. We use data from the Gapminder project.

1.) Load libraries and data, and perform a superficial check of the data import

library(plyr)

Dat <- read.delim("gapminderDataFiveYear.txt")
str(Dat)
## 'data.frame':    1704 obs. of  6 variables:
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ pop      : num  8425333 9240934 10267083 11537966 13079460 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ gdpPercap: num  779 821 853 836 740 ...
summary(Dat)
##         country          year           pop              continent  
##  Afghanistan:  12   Min.   :1952   Min.   :6.00e+04   Africa  :624  
##  Albania    :  12   1st Qu.:1966   1st Qu.:2.79e+06   Americas:300  
##  Algeria    :  12   Median :1980   Median :7.02e+06   Asia    :396  
##  Angola     :  12   Mean   :1980   Mean   :2.96e+07   Europe  :360  
##  Argentina  :  12   3rd Qu.:1993   3rd Qu.:1.96e+07   Oceania : 24  
##  Australia  :  12   Max.   :2007   Max.   :1.32e+09                 
##  (Other)    :1632                                                   
##     lifeExp       gdpPercap     
##  Min.   :23.6   Min.   :   241  
##  1st Qu.:48.2   1st Qu.:  1202  
##  Median :60.7   Median :  3532  
##  Mean   :59.5   Mean   :  7215  
##  3rd Qu.:70.8   3rd Qu.:  9325  
##  Max.   :82.6   Max.   :113523  
## 
head(Dat, 5)
##       country year      pop continent lifeExp gdpPercap
## 1 Afghanistan 1952  8425333      Asia   28.80     779.4
## 2 Afghanistan 1957  9240934      Asia   30.33     820.9
## 3 Afghanistan 1962 10267083      Asia   32.00     853.1
## 4 Afghanistan 1967 11537966      Asia   34.02     836.2
## 5 Afghanistan 1972 13079460      Asia   36.09     740.0

We notice that the data is stored in a data frame with 1704 observations of six variables: country, year, pop, continent, lifeExp, and gdpPercap.
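
The visual check above can also be backed by a couple of programmatic ones. The following is only a minimal sketch on a made-up toy data frame (the object `toy` and its values are assumptions for illustration, not the Gapminder file): it checks for missing values and confirms that every country contributes the same number of rows, which for the real data would correspond to 142 countries x 12 years = 1704 observations.

```r
# Toy stand-in for the imported data frame (values are invented)
toy <- data.frame(
  country = rep(c("A", "B"), each = 3),
  year    = rep(c(1952, 1957, 1962), times = 2),
  lifeExp = c(40, 42, 44, 60, 61, 63)
)

# TRUE if any cell in the data frame is NA
anyMissing <- any(is.na(toy))

# Number of rows per country; a balanced panel has a single distinct count
rowsPerCountry <- table(toy$country)
balanced <- length(unique(as.vector(rowsPerCountry))) == 1

anyMissing  # FALSE for the toy data
balanced    # TRUE for the toy data
```

Running the same two checks on Dat would be a cheap way to catch a truncated or misparsed import before going any further.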

2.) We want to investigate life expectancy and GDP over time, by continent

AvgLifeExpAndGdpByYearAndCont <- ddply(Dat, ~year + continent, summarize, meanLifeExp = mean(lifeExp), 
    meanGdp = mean(gdpPercap))
AvgLifeExpAndGdpByYearAndCont
##    year continent meanLifeExp meanGdp
## 1  1952    Africa       39.14    1253
## 2  1952  Americas       53.28    4079
## 3  1952      Asia       46.31    5195
## 4  1952    Europe       64.41    5661
## 5  1952   Oceania       69.25   10298
## 6  1957    Africa       41.27    1385
## 7  1957  Americas       55.96    4616
## 8  1957      Asia       49.32    5788
## 9  1957    Europe       66.70    6963
## 10 1957   Oceania       70.30   11599
## 11 1962    Africa       43.32    1598
## 12 1962  Americas       58.40    4902
## 13 1962      Asia       51.56    5729
## 14 1962    Europe       68.54    8365
## 15 1962   Oceania       71.09   12696
## 16 1967    Africa       45.33    2050
## 17 1967  Americas       60.41    5668
## 18 1967      Asia       54.66    5971
## 19 1967    Europe       69.74   10144
## 20 1967   Oceania       71.31   14495
## 21 1972    Africa       47.45    2340
## 22 1972  Americas       62.39    6491
## 23 1972      Asia       57.32    8187
## 24 1972    Europe       70.78   12480
## 25 1972   Oceania       71.91   16417
## 26 1977    Africa       49.58    2586
## 27 1977  Americas       64.39    7352
## 28 1977      Asia       59.61    7791
## 29 1977    Europe       71.94   14284
## 30 1977   Oceania       72.85   17284
## 31 1982    Africa       51.59    2482
## 32 1982  Americas       66.23    7507
## 33 1982      Asia       62.62    7434
## 34 1982    Europe       72.81   15618
## 35 1982   Oceania       74.29   18555
## 36 1987    Africa       53.34    2283
## 37 1987  Americas       68.09    7793
## 38 1987      Asia       64.85    7608
## 39 1987    Europe       73.64   17214
## 40 1987   Oceania       75.32   20448
## 41 1992    Africa       53.63    2282
## 42 1992  Americas       69.57    8045
## 43 1992      Asia       66.54    8640
## 44 1992    Europe       74.44   17062
## 45 1992   Oceania       76.94   20894
## 46 1997    Africa       53.60    2379
## 47 1997  Americas       71.15    8889
## 48 1997      Asia       68.02    9834
## 49 1997    Europe       75.51   19077
## 50 1997   Oceania       78.19   24024
## 51 2002    Africa       53.33    2599
## 52 2002  Americas       72.42    9288
## 53 2002      Asia       69.23   10174
## 54 2002    Europe       76.70   21712
## 55 2002   Oceania       79.74   26939
## 56 2007    Africa       54.81    3089
## 57 2007  Americas       73.61   11003
## 58 2007      Asia       70.73   12473
## 59 2007    Europe       77.65   25054
## 60 2007   Oceania       80.72   29810

This gives us what we are looking for: mean life expectancy and mean GDP per capita for each year and continent. To see trends within a continent, however, it helps to sort the rows by continent.

arrange(AvgLifeExpAndGdpByYearAndCont, continent)
##    year continent meanLifeExp meanGdp
## 1  1952    Africa       39.14    1253
## 2  1957    Africa       41.27    1385
## 3  1962    Africa       43.32    1598
## 4  1967    Africa       45.33    2050
## 5  1972    Africa       47.45    2340
## 6  1977    Africa       49.58    2586
## 7  1982    Africa       51.59    2482
## 8  1987    Africa       53.34    2283
## 9  1992    Africa       53.63    2282
## 10 1997    Africa       53.60    2379
## 11 2002    Africa       53.33    2599
## 12 2007    Africa       54.81    3089
## 13 1952  Americas       53.28    4079
## 14 1957  Americas       55.96    4616
## 15 1962  Americas       58.40    4902
## 16 1967  Americas       60.41    5668
## 17 1972  Americas       62.39    6491
## 18 1977  Americas       64.39    7352
## 19 1982  Americas       66.23    7507
## 20 1987  Americas       68.09    7793
## 21 1992  Americas       69.57    8045
## 22 1997  Americas       71.15    8889
## 23 2002  Americas       72.42    9288
## 24 2007  Americas       73.61   11003
## 25 1952      Asia       46.31    5195
## 26 1957      Asia       49.32    5788
## 27 1962      Asia       51.56    5729
## 28 1967      Asia       54.66    5971
## 29 1972      Asia       57.32    8187
## 30 1977      Asia       59.61    7791
## 31 1982      Asia       62.62    7434
## 32 1987      Asia       64.85    7608
## 33 1992      Asia       66.54    8640
## 34 1997      Asia       68.02    9834
## 35 2002      Asia       69.23   10174
## 36 2007      Asia       70.73   12473
## 37 1952    Europe       64.41    5661
## 38 1957    Europe       66.70    6963
## 39 1962    Europe       68.54    8365
## 40 1967    Europe       69.74   10144
## 41 1972    Europe       70.78   12480
## 42 1977    Europe       71.94   14284
## 43 1982    Europe       72.81   15618
## 44 1987    Europe       73.64   17214
## 45 1992    Europe       74.44   17062
## 46 1997    Europe       75.51   19077
## 47 2002    Europe       76.70   21712
## 48 2007    Europe       77.65   25054
## 49 1952   Oceania       69.25   10298
## 50 1957   Oceania       70.30   11599
## 51 1962   Oceania       71.09   12696
## 52 1967   Oceania       71.31   14495
## 53 1972   Oceania       71.91   16417
## 54 1977   Oceania       72.85   17284
## 55 1982   Oceania       74.29   18555
## 56 1987   Oceania       75.32   20448
## 57 1992   Oceania       76.94   20894
## 58 1997   Oceania       78.19   24024
## 59 2002   Oceania       79.74   26939
## 60 2007   Oceania       80.72   29810

Using arrange() sorts the rows by continent. Note: we can get the same row order directly by writing the grouping variables as “~continent + year” when we call ddply().
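
As a small illustration of that note, here is a sketch on an invented toy data frame (the object `toy` and its numbers are assumptions, not Gapminder values). It compares the row order obtained by sorting afterwards with arrange() against the order obtained by putting continent first in the ddply() formula.

```r
library(plyr)

# Toy data: two continents, two years, one made-up value each
toy <- data.frame(
  continent = rep(c("Asia", "Africa"), times = 2),
  year      = rep(c(1952, 1957), each = 2),
  lifeExp   = c(46, 39, 49, 41)
)

# Group by year first, then sort the result by continent
byYear <- ddply(toy, ~year + continent, summarize, meanLifeExp = mean(lifeExp))
sorted <- arrange(byYear, continent)

# Group by continent first; no separate sorting step
byCont <- ddply(toy, ~continent + year, summarize, meanLifeExp = mean(lifeExp))

sorted
byCont
```

The two results list the rows in the same order; only the column order differs, because ddply() puts the grouping variables in the order they appear in the formula.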

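In general, mean life expectancy and mean GDP per capita both rise over time on every continent, though not steadily: Africa's mean GDP per capita falls between 1977 and 1992, and its mean life expectancy stalls at about 53 years from 1987 to 2002. A figure would be a great way of visualizing these trends.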

3.) We want to look at life expectancy and GDP over time, by continent, in a “wide” format

LifeExpByYearAndCont.Wide <- daply(Dat, ~year + continent, summarize, avgLifeExp = mean(lifeExp))
LifeExpByYearAndCont.Wide
##       continent
## year   Africa Americas Asia  Europe Oceania
##   1952 39.14  53.28    46.31 64.41  69.25  
##   1957 41.27  55.96    49.32 66.7   70.3   
##   1962 43.32  58.4     51.56 68.54  71.09  
##   1967 45.33  60.41    54.66 69.74  71.31  
##   1972 47.45  62.39    57.32 70.78  71.91  
##   1977 49.58  64.39    59.61 71.94  72.85  
##   1982 51.59  66.23    62.62 72.81  74.29  
##   1987 53.34  68.09    64.85 73.64  75.32  
##   1992 53.63  69.57    66.54 74.44  76.94  
##   1997 53.6   71.15    68.02 75.51  78.19  
##   2002 53.33  72.42    69.23 76.7   79.74  
##   2007 54.81  73.61    70.73 77.65  80.72
GdpByYearAndCont.Wide <- daply(Dat, ~year + continent, summarize, avgGdp = mean(gdpPercap))
GdpByYearAndCont.Wide
##       continent
## year   Africa Americas Asia  Europe Oceania
##   1952 1253   4079     5195  5661   10298  
##   1957 1385   4616     5788  6963   11599  
##   1962 1598   4902     5729  8365   12696  
##   1967 2050   5668     5971  10144  14495  
##   1972 2340   6491     8187  12480  16417  
##   1977 2586   7352     7791  14284  17284  
##   1982 2482   7507     7434  15618  18555  
##   1987 2283   7793     7608  17214  20448  
##   1992 2282   8045     8640  17062  20894  
##   1997 2379   8889     9834  19077  24024  
##   2002 2599   9288     10174 21712  26939  
##   2007 3089   11003    12473 25054  29810

Presenting the data in a wide format gives a more compact table and makes it easier to compare continents within a year. As the name implies, the table is “wide” (one column per continent), whereas the previous tables were “tall” (one row per year and continent). Note: this is also an example of another output type of the plyr package: daply() returns an array rather than a data frame.
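
One detail worth a sketch: judging from the left-aligned, unpadded values in the printouts above, daply() combined with summarize seems to produce a list-array rather than a plain numeric matrix. A possible alternative, shown here on an invented toy data frame (`toy` and the helper argument name `d` are assumptions), is to pass a function that returns a single number per group, in which case daply() returns an ordinary numeric matrix.

```r
library(plyr)

# Toy data: two countries per continent and year, made-up values
toy <- data.frame(
  continent = rep(c("Africa", "Asia"), each = 4),
  year      = rep(c(1952, 1952, 1957, 1957), times = 2),
  lifeExp   = c(38, 40, 41, 43, 45, 47, 48, 50)
)

# Each group d is a data frame; returning one number per group
# gives a numeric matrix with years as rows and continents as columns
wide <- daply(toy, ~year + continent, function(d) mean(d$lifeExp))

is.numeric(wide)      # TRUE
wide["1957", "Asia"]  # 49
```

A numeric matrix can be indexed by year and continent name and passed straight to functions such as round() or matplot(), which is less convenient with a list-array.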

4.) We want to look at the number and proportion of countries with low life expectancy over time, by continent

4.1) Before doing this for the entire dataset, I first want to try the approach on the year 2007 alone.

Dat2007 <- subset(Dat, year == 2007)

avgLifeExp2007 <- mean(Dat2007$lifeExp)
avgLifeExp2007  #67
## [1] 67.01

We will treat a life expectancy below 67, the rounded mean for 2007, as low.

Dat2007LowLifeExp <- ddply(Dat2007, ~continent + country, summarize, lowLifeExp = lifeExp < 
    67)
Dat2007LowLifeExp
##     continent                  country lowLifeExp
## 1      Africa                  Algeria      FALSE
## 2      Africa                   Angola       TRUE
## 3      Africa                    Benin       TRUE
## 4      Africa                 Botswana       TRUE
## 5      Africa             Burkina Faso       TRUE
## 6      Africa                  Burundi       TRUE
## 7      Africa                 Cameroon       TRUE
## 8      Africa Central African Republic       TRUE
## 9      Africa                     Chad       TRUE
## 10     Africa                  Comoros       TRUE
## 11     Africa         Congo, Dem. Rep.       TRUE
## 12     Africa              Congo, Rep.       TRUE
## 13     Africa            Cote d'Ivoire       TRUE
## 14     Africa                 Djibouti       TRUE
## 15     Africa                    Egypt      FALSE
## 16     Africa        Equatorial Guinea       TRUE
## 17     Africa                  Eritrea       TRUE
## 18     Africa                 Ethiopia       TRUE
## 19     Africa                    Gabon       TRUE
## 20     Africa                   Gambia       TRUE
## 21     Africa                    Ghana       TRUE
## 22     Africa                   Guinea       TRUE
## 23     Africa            Guinea-Bissau       TRUE
## 24     Africa                    Kenya       TRUE
## 25     Africa                  Lesotho       TRUE
## 26     Africa                  Liberia       TRUE
## 27     Africa                    Libya      FALSE
## 28     Africa               Madagascar       TRUE
## 29     Africa                   Malawi       TRUE
## 30     Africa                     Mali       TRUE
## 31     Africa               Mauritania       TRUE
## 32     Africa                Mauritius      FALSE
## 33     Africa                  Morocco      FALSE
## 34     Africa               Mozambique       TRUE
## 35     Africa                  Namibia       TRUE
## 36     Africa                    Niger       TRUE
## 37     Africa                  Nigeria       TRUE
## 38     Africa                  Reunion      FALSE
## 39     Africa                   Rwanda       TRUE
## 40     Africa    Sao Tome and Principe       TRUE
## 41     Africa                  Senegal       TRUE
## 42     Africa             Sierra Leone       TRUE
## 43     Africa                  Somalia       TRUE
## 44     Africa             South Africa       TRUE
## 45     Africa                    Sudan       TRUE
## 46     Africa                Swaziland       TRUE
## 47     Africa                 Tanzania       TRUE
## 48     Africa                     Togo       TRUE
## 49     Africa                  Tunisia      FALSE
## 50     Africa                   Uganda       TRUE
## 51     Africa                   Zambia       TRUE
## 52     Africa                 Zimbabwe       TRUE
## 53   Americas                Argentina      FALSE
## 54   Americas                  Bolivia       TRUE
## 55   Americas                   Brazil      FALSE
## 56   Americas                   Canada      FALSE
## 57   Americas                    Chile      FALSE
## 58   Americas                 Colombia      FALSE
## 59   Americas               Costa Rica      FALSE
## 60   Americas                     Cuba      FALSE
## 61   Americas       Dominican Republic      FALSE
## 62   Americas                  Ecuador      FALSE
## 63   Americas              El Salvador      FALSE
## 64   Americas                Guatemala      FALSE
## 65   Americas                    Haiti       TRUE
## 66   Americas                 Honduras      FALSE
## 67   Americas                  Jamaica      FALSE
## 68   Americas                   Mexico      FALSE
## 69   Americas                Nicaragua      FALSE
## 70   Americas                   Panama      FALSE
## 71   Americas                 Paraguay      FALSE
## 72   Americas                     Peru      FALSE
## 73   Americas              Puerto Rico      FALSE
## 74   Americas      Trinidad and Tobago      FALSE
## 75   Americas            United States      FALSE
## 76   Americas                  Uruguay      FALSE
## 77   Americas                Venezuela      FALSE
## 78       Asia              Afghanistan       TRUE
## 79       Asia                  Bahrain      FALSE
## 80       Asia               Bangladesh       TRUE
## 81       Asia                 Cambodia       TRUE
## 82       Asia                    China      FALSE
## 83       Asia         Hong Kong, China      FALSE
## 84       Asia                    India       TRUE
## 85       Asia                Indonesia      FALSE
## 86       Asia                     Iran      FALSE
## 87       Asia                     Iraq       TRUE
## 88       Asia                   Israel      FALSE
## 89       Asia                    Japan      FALSE
## 90       Asia                   Jordan      FALSE
## 91       Asia         Korea, Dem. Rep.      FALSE
## 92       Asia              Korea, Rep.      FALSE
## 93       Asia                   Kuwait      FALSE
## 94       Asia                  Lebanon      FALSE
## 95       Asia                 Malaysia      FALSE
## 96       Asia                 Mongolia       TRUE
## 97       Asia                  Myanmar       TRUE
## 98       Asia                    Nepal       TRUE
## 99       Asia                     Oman      FALSE
## 100      Asia                 Pakistan       TRUE
## 101      Asia              Philippines      FALSE
## 102      Asia             Saudi Arabia      FALSE
## 103      Asia                Singapore      FALSE
## 104      Asia                Sri Lanka      FALSE
## 105      Asia                    Syria      FALSE
## 106      Asia                   Taiwan      FALSE
## 107      Asia                 Thailand      FALSE
## 108      Asia                  Vietnam      FALSE
## 109      Asia       West Bank and Gaza      FALSE
## 110      Asia              Yemen, Rep.       TRUE
## 111    Europe                  Albania      FALSE
## 112    Europe                  Austria      FALSE
## 113    Europe                  Belgium      FALSE
## 114    Europe   Bosnia and Herzegovina      FALSE
## 115    Europe                 Bulgaria      FALSE
## 116    Europe                  Croatia      FALSE
## 117    Europe           Czech Republic      FALSE
## 118    Europe                  Denmark      FALSE
## 119    Europe                  Finland      FALSE
## 120    Europe                   France      FALSE
## 121    Europe                  Germany      FALSE
## 122    Europe                   Greece      FALSE
## 123    Europe                  Hungary      FALSE
## 124    Europe                  Iceland      FALSE
## 125    Europe                  Ireland      FALSE
## 126    Europe                    Italy      FALSE
## 127    Europe               Montenegro      FALSE
## 128    Europe              Netherlands      FALSE
## 129    Europe                   Norway      FALSE
## 130    Europe                   Poland      FALSE
## 131    Europe                 Portugal      FALSE
## 132    Europe                  Romania      FALSE
## 133    Europe                   Serbia      FALSE
## 134    Europe          Slovak Republic      FALSE
## 135    Europe                 Slovenia      FALSE
## 136    Europe                    Spain      FALSE
## 137    Europe                   Sweden      FALSE
## 138    Europe              Switzerland      FALSE
## 139    Europe                   Turkey      FALSE
## 140    Europe           United Kingdom      FALSE
## 141   Oceania                Australia      FALSE
## 142   Oceania              New Zealand      FALSE

Here, I asked R to flag which countries have a life expectancy below 67. As you can see, the output is a data frame whose lowLifeExp column is a logical vector of TRUEs and FALSEs.

nCountriesLowLifeExpByCont <- ddply(Dat2007LowLifeExp, ~continent, summarize, 
    nCountriesLowLifeExp = length(which(lowLifeExp == TRUE)))
nCountriesLowLifeExpByCont
##   continent nCountriesLowLifeExp
## 1    Africa                   45
## 2  Americas                    2
## 3      Asia                   10
## 4    Europe                    0
## 5   Oceania                    0

To find the number of countries per continent with low life expectancy, I asked R to count the number of times “TRUE” shows up under “lowLifeExp”. In R, this means taking the length() of the positions that which() reports as “TRUE” in the “lowLifeExp” vector. Personal story: Figuring this out turned out to be a source of great pain and frustration, and took an embarrassing amount of time.
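
For what it's worth, the same count can be written more compactly. The sketch below uses a small hand-made logical vector (an assumption for illustration, including the NA) and relies on the fact that R treats TRUE as 1 and FALSE as 0 when summing.

```r
# Hand-made logical vector with one missing value
lowLifeExp <- c(TRUE, FALSE, TRUE, TRUE, NA)

# Approach used above: positions of TRUE, then how many there are
nViaWhich <- length(which(lowLifeExp == TRUE))

# Shorter alternative: TRUE counts as 1, FALSE as 0
nViaSum <- sum(lowLifeExp, na.rm = TRUE)

nViaWhich  # 3
nViaSum    # 3
```

The one difference to keep in mind is missing values: which() silently drops NAs, whereas sum() returns NA unless na.rm = TRUE is set.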

4.2) Now that I have the hang of it, I am moving on to the entire dataset. Instead of the 2007 mean, I will use an arbitrary life expectancy of 50 as the threshold for low life expectancy.

DatLowLifeExp <- ddply(Dat, ~continent + country + year, summarize, lowLifeExp = lifeExp < 
    50)

First, we look at the number of countries with low life expectancy per continent and year.

nCountriesLowLifeExpByContandYear.Wide <- daply(DatLowLifeExp, ~year + continent, 
    summarize, nCountriesLowLifeExp = length(which(lowLifeExp == TRUE)))
nCountriesLowLifeExpByContandYear.Wide
##       continent
## year   Africa Americas Asia Europe Oceania
##   1952 50     9        22   1      0      
##   1957 49     8        18   1      0      
##   1962 47     6        17   0      0      
##   1967 39     2        12   0      0      
##   1972 36     2        6    0      0      
##   1977 28     1        5    0      0      
##   1982 24     0        3    0      0      
##   1987 20     0        1    0      0      
##   1992 20     0        1    0      0      
##   1997 20     0        1    0      0      
##   2002 22     0        1    0      0      
##   2007 18     0        1    0      0

Now, we look at the proportion of countries with low life expectancy per continent and year.

propCountriesLowLifeExpByContAndYear.Wide <- daply(DatLowLifeExp, ~year + continent, 
    summarize, propLowLifeExp = (length(which(lowLifeExp == TRUE))/length(unique(country))))
propCountriesLowLifeExpByContAndYear.Wide
##       continent
## year   Africa Americas Asia    Europe  Oceania
##   1952 0.9615 0.36     0.6667  0.03333 0      
##   1957 0.9423 0.32     0.5455  0.03333 0      
##   1962 0.9038 0.24     0.5152  0       0      
##   1967 0.75   0.08     0.3636  0       0      
##   1972 0.6923 0.08     0.1818  0       0      
##   1977 0.5385 0.04     0.1515  0       0      
##   1982 0.4615 0        0.09091 0       0      
##   1987 0.3846 0        0.0303  0       0      
##   1992 0.3846 0        0.0303  0       0      
##   1997 0.3846 0        0.0303  0       0      
##   2002 0.4231 0        0.0303  0       0      
##   2007 0.3462 0        0.0303  0       0

To generate a data frame with both the number and the proportion of countries with low life expectancy per continent and year, we compute both, along with the total number of countries, in a single ddply() call.

propAndNCountriesLowLifeExpByContAndYear <- ddply(DatLowLifeExp, ~continent + 
    year, summarize, nCountriesLowLifeExp = length(which(lowLifeExp == TRUE)), 
    nCountries = length(unique(country)), propLowLifeExp = (length(which(lowLifeExp == 
        TRUE))/length(unique(country))))
propAndNCountriesLowLifeExpByContAndYear
##    continent year nCountriesLowLifeExp nCountries propLowLifeExp
## 1     Africa 1952                   50         52        0.96154
## 2     Africa 1957                   49         52        0.94231
## 3     Africa 1962                   47         52        0.90385
## 4     Africa 1967                   39         52        0.75000
## 5     Africa 1972                   36         52        0.69231
## 6     Africa 1977                   28         52        0.53846
## 7     Africa 1982                   24         52        0.46154
## 8     Africa 1987                   20         52        0.38462
## 9     Africa 1992                   20         52        0.38462
## 10    Africa 1997                   20         52        0.38462
## 11    Africa 2002                   22         52        0.42308
## 12    Africa 2007                   18         52        0.34615
## 13  Americas 1952                    9         25        0.36000
## 14  Americas 1957                    8         25        0.32000
## 15  Americas 1962                    6         25        0.24000
## 16  Americas 1967                    2         25        0.08000
## 17  Americas 1972                    2         25        0.08000
## 18  Americas 1977                    1         25        0.04000
## 19  Americas 1982                    0         25        0.00000
## 20  Americas 1987                    0         25        0.00000
## 21  Americas 1992                    0         25        0.00000
## 22  Americas 1997                    0         25        0.00000
## 23  Americas 2002                    0         25        0.00000
## 24  Americas 2007                    0         25        0.00000
## 25      Asia 1952                   22         33        0.66667
## 26      Asia 1957                   18         33        0.54545
## 27      Asia 1962                   17         33        0.51515
## 28      Asia 1967                   12         33        0.36364
## 29      Asia 1972                    6         33        0.18182
## 30      Asia 1977                    5         33        0.15152
## 31      Asia 1982                    3         33        0.09091
## 32      Asia 1987                    1         33        0.03030
## 33      Asia 1992                    1         33        0.03030
## 34      Asia 1997                    1         33        0.03030
## 35      Asia 2002                    1         33        0.03030
## 36      Asia 2007                    1         33        0.03030
## 37    Europe 1952                    1         30        0.03333
## 38    Europe 1957                    1         30        0.03333
## 39    Europe 1962                    0         30        0.00000
## 40    Europe 1967                    0         30        0.00000
## 41    Europe 1972                    0         30        0.00000
## 42    Europe 1977                    0         30        0.00000
## 43    Europe 1982                    0         30        0.00000
## 44    Europe 1987                    0         30        0.00000
## 45    Europe 1992                    0         30        0.00000
## 46    Europe 1997                    0         30        0.00000
## 47    Europe 2002                    0         30        0.00000
## 48    Europe 2007                    0         30        0.00000
## 49   Oceania 1952                    0          2        0.00000
## 50   Oceania 1957                    0          2        0.00000
## 51   Oceania 1962                    0          2        0.00000
## 52   Oceania 1967                    0          2        0.00000
## 53   Oceania 1972                    0          2        0.00000
## 54   Oceania 1977                    0          2        0.00000
## 55   Oceania 1982                    0          2        0.00000
## 56   Oceania 1987                    0          2        0.00000
## 57   Oceania 1992                    0          2        0.00000
## 58   Oceania 1997                    0          2        0.00000
## 59   Oceania 2002                    0          2        0.00000
## 60   Oceania 2007                    0          2        0.00000
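
The same kind of summary table can be sketched more compactly with sum() and mean() on the logical condition. This is only an illustration on an invented toy data frame (`toy`, its values, and the variable `threshold` are assumptions). Note that mean() divides by the number of rows in each group, so it matches the count divided by length(unique(country)) only when every country appears exactly once per continent and year, as it does in the Gapminder data.

```r
library(plyr)

# Toy data: two countries per continent, two years, made-up values
toy <- data.frame(
  continent = rep(c("Africa", "Asia"), each = 4),
  country   = rep(c("A", "B", "C", "D"), each = 2),
  year      = rep(c(1952, 2007), times = 4),
  lifeExp   = c(38, 52, 41, 47, 45, 70, 55, 72)
)

threshold <- 50  # same arbitrary cutoff as in the text

summaryTab <- ddply(toy, ~continent + year, summarize,
    nCountriesLowLifeExp = sum(lifeExp < threshold),   # count of TRUEs
    nCountries = length(unique(country)),              # countries in the group
    propLowLifeExp = mean(lifeExp < threshold))        # share of TRUEs
summaryTab
```

Computing the flag inside the call also avoids the intermediate data frame, at the cost of repeating the condition.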

Overall, the number and proportion of countries with low life expectancy fall over time on every continent that starts with any (Oceania has none throughout), apart from a small rise in Africa between 1997 and 2002 (from 20 to 22 countries). However, there are stark differences across continents: in 2007, about 35% of African countries still have low life expectancy, compared with about 3% in Asia and none elsewhere.

In future exercises, I would like to investigate the relationship between GDP and life expectancy over time by continent. I would also like to identify countries that have “interesting stories” in this regard.

To sum up: From this exercise, I have gathered that “plyr” is a very useful package, and I can imagine various handy uses for it. Personal reflection: I've also realized that at my current level of R programming, using R is like trying to write a book in a foreign language without understanding its grammar and with a limited or non-existent vocabulary. Here's to becoming R-literate!