Raven Shan


Introduction

For this assignment, I use a subset of a life expectancy dataset from Gapminder.org, which can be retrieved from https://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt. The full dataset contains 1,704 observations of 6 variables. I explore life expectancy trends between 1952 and 2007. The overall objective is to use figures generated with the ggplot2 package to tell an interesting story.

The assignment is organized around three research questions. First, how has life expectancy changed over the years: has it increased or decreased? Second, how has life expectancy changed across continents: is it increasing steadily for some continents and not others? Finally, what is the relationship between GDP per capita and life expectancy on each continent? I rely primarily on geom_smooth() for the time-based figures because, in my opinion, a smoothed line conveys progression over time more effectively than, for example, a boxplot or bar chart.


Data and Variables

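The analysis uses five of the six variables and excludes Oceania, which contains only 2 countries, leaving 1,680 observations. The final variables used in this analysis are as follows:

lifeExp: Life expectancy at birth, in years. This is the dependent variable in the analysis.
year: The survey year, ranging from 1952 to 2007 in increments of 5.
gdpPercap: Per capita GDP of a country in a particular year.
country: The name of the country.
continent: The continent the country belongs to: Asia, Europe, Africa, or the Americas.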

Observations: 1,680
Variables: 5
$ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, ...
$ continent <chr> "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asi...
$ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 40.8...
$ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134, ...
$ country   <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", ...
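
The rendered report does not show how the five-variable, 1,680-row subset was produced. The following is a minimal sketch of one way it could be done in base R, assuming the subset comes from dropping the population column (named pop in the Gapminder file) and the Oceania rows. The helper name prepare_life_exp and the toy input are hypothetical. The two Afghanistan rows reuse values from the glimpse output above, and the Oceania row carries NA placeholders rather than real figures.

```r
# Hypothetical helper: keep the five analysis columns and drop Oceania.
prepare_life_exp <- function(raw) {
  keep <- c("year", "continent", "lifeExp", "gdpPercap", "country")
  out <- raw[raw$continent != "Oceania", keep]
  rownames(out) <- NULL
  out
}

# Toy input. Afghanistan values come from the glimpse output above;
# the Oceania row and the pop column are NA placeholders, not real data.
raw <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Australia"),
  year      = c(1952L, 1957L, 1952L),
  pop       = NA_real_,
  continent = c("Asia", "Asia", "Oceania"),
  lifeExp   = c(28.801, 30.332, NA),
  gdpPercap = c(779.4453, 820.8530, NA),
  stringsAsFactors = FALSE
)

life_exp_toy <- prepare_life_exp(raw)
str(life_exp_toy)

# With the full tab-separated file, the call might look like this
# (needs network access, so it is left commented out):
# raw <- read.delim("https://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt")
# life_exp <- prepare_life_exp(raw)
```

The row count is consistent with this assumption: Oceania's 2 countries contribute 12 survey years each, and 1,704 minus 24 is 1,680.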

Results

Figure 1. Life Expectancy over Time

Below, I start by examining life expectancy over the years with a simple figure built from geom_point() and geom_smooth(). First, I create the ggplot object containing the data and the aesthetic mapping, which acts as the base of the figure. Then I add the plotting layers, specifying the color and thickness of the smoothed line and the opacity of the points, to generate Figure 1.

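library(ggplot2)
# Base object: the data plus the aesthetic mapping shared by every layer
g <- ggplot(life_exp, aes(x = year, y = lifeExp))
# Smoothed trend line plus semi-transparent points
# (linewidth replaces size for lines in ggplot2 >= 3.4.0)
g1 <- g + geom_smooth(color = "blue", linewidth = 1.5) + geom_point(alpha = 0.3) +
  theme_classic() +
  ggtitle("Overall Life Expectancy over Time")
g1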
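
The trend drawn by the smoother can also be checked numerically. The sketch below averages life expectancy by year and reports the change between consecutive survey years. It is illustrative only: the toy data frame holds the first six Afghanistan rows from the glimpse output, and the names toy, yearly, and increasing are not part of the original analysis. With the real data, life_exp would take the place of toy.

```r
# Toy data: first six Afghanistan rows from the glimpse output (not the full dataset)
toy <- data.frame(
  year    = c(1952L, 1957L, 1962L, 1967L, 1972L, 1977L),
  lifeExp = c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438)
)

# Mean life expectancy per survey year
yearly <- aggregate(lifeExp ~ year, data = toy, FUN = mean)

# Change relative to the previous survey year (NA for the first year)
yearly$change <- c(NA, diff(yearly$lifeExp))
print(yearly)

# TRUE when the yearly mean rises at every step
increasing <- all(diff(yearly$lifeExp) > 0)
print(increasing)
```

A table like this answers the "increased or decreased" question directly, while the figure shows the spread of countries around the trend.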


Figure 2. Life Expectancy over Time in the United States

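The following figure examines life expectancy over time in the United States, again using the non-parametric smoothing line drawn by geom_smooth(). The US subset contains only 12 observations, one per survey year, and with fewer than 1,000 observations geom_smooth() defaults to a loess fit.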

# USdata: the United States rows of life_exp (the filtering step is not shown)
g2 <- ggplot(USdata, aes(x = year, y = lifeExp)) +
  geom_smooth(color = "red", linewidth = 1.5) + geom_point(color = "blue") +
  theme_classic() +
  ggtitle("Life Expectancy in the US over Time")
g2

Comparing Life Expectancy in the US with Overall Life Expectancy

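Next, I used the grid.arrange() function from the gridExtra package to display the two graphs side by side, which makes them easier to compare. Each panel keeps its own y-axis scale, so the axis values should be read before comparing the heights of the two curves.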

library(gridExtra)
# US plot on the left, overall plot on the right
grid.arrange(g2, g1, ncol = 2)
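
If the two panels should be directly comparable, one option is to give them a common y-axis range before arranging them. The sketch below is a hedged illustration rather than part of the original analysis: the toy data are the first six Afghanistan rows from the glimpse output, split in two to stand in for the two datasets, and the names early, late, y_range, and shared_y are hypothetical.

```r
library(ggplot2)
library(gridExtra)

# Toy data: first six Afghanistan rows from the glimpse output
toy <- data.frame(
  year    = c(1952L, 1957L, 1962L, 1967L, 1972L, 1977L),
  lifeExp = c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438)
)
early <- toy[1:3, ]  # stands in for the first dataset
late  <- toy[4:6, ]  # stands in for the second dataset

# One y-range covering both panels, applied through the coordinate system
y_range  <- range(toy$lifeExp)
shared_y <- coord_cartesian(ylim = y_range)

p_early <- ggplot(early, aes(x = year, y = lifeExp)) +
  geom_point() + theme_classic() + shared_y
p_late <- ggplot(late, aes(x = year, y = lifeExp)) +
  geom_point() + theme_classic() + shared_y

grid.arrange(p_early, p_late, ncol = 2)

# With the objects from this report, the same idea might read:
# shared_y <- coord_cartesian(ylim = range(life_exp$lifeExp))
# grid.arrange(g2 + shared_y, g1 + shared_y, ncol = 2)
```

coord_cartesian() zooms the view without discarding observations, so the smoothed lines are still fitted to all of the data.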


Figure 3. Life Expectancy over Time by Continent

The following figure adds continent to the analysis. It shows life expectancy over time for each continent as a scatterplot overlaid with one smoothed line per continent. I set alpha to .3 so that overlapping points remain distinguishable.

# Mapping color to continent gives each continent its own points and smoothed line
g3 <- ggplot(life_exp, aes(x = year, y = lifeExp, color = continent))
g3 + geom_point(alpha = 0.3) + geom_smooth(linewidth = 1.5) +
  theme_classic() +
  ggtitle("Life Expectancy over Time by Continent")

Figure 3b. Life Expectancy Across Continents

This figure uses the same data as the previous one but gives viewers a different visual representation. Here I use geom_density() to display the distribution of life expectancy within each continent, pooled across all years. In addition, I use facet_wrap() with the continent variable to display each continent in its own panel.

# One filled density curve per continent, each in its own facet
g3b <- ggplot(life_exp, aes(x = lifeExp))
g3b + geom_density(aes(fill = continent), linewidth = 1) + facet_wrap(~continent) +
  theme_classic() +
  ggtitle("Life Expectancy Across Continents")


Figure 4. Relationship between GDP per Capita and Life Expectancy by Continent

The last figure introduces a new independent variable, GDP per capita. It explores the relationship between GDP per capita and life expectancy, with points colored by continent rather than by country. After some trial and error, I decided to log transform the gdpPercap variable (base 10). On the raw scale, the graph was difficult to interpret and visually unappealing, most likely because GDP per capita is strongly right-skewed and a few very large values compress the rest of the points.

# log10(gdpPercap) spreads out the low-GDP observations along the x-axis
g.g <- ggplot(life_exp, aes(x = log10(gdpPercap), y = lifeExp, color = continent))
g4 <- g.g + geom_point(size = 2) +
  theme_classic() +
  ggtitle("Life Expectancy and GDP/capita by Continent")
g4
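
An alternative to wrapping the variable in log10() is to leave gdpPercap untransformed in the mapping and put the axis itself on a log scale with scale_x_log10(). The points land in the same positions, but the tick labels stay in the original GDP units, which can be easier to read. The sketch below is illustrative only: toy and g4_alt are hypothetical names, and the six rows are the Afghanistan values from the glimpse output. With the real data, life_exp would take the place of toy.

```r
library(ggplot2)

# Toy data: first six Afghanistan rows from the glimpse output
toy <- data.frame(
  continent = "Asia",
  gdpPercap = c(779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134),
  lifeExp   = c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438)
)

# Same plot as Figure 4, but the log transform is applied by the x scale
g4_alt <- ggplot(toy, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(size = 2) +
  scale_x_log10() +
  theme_classic() +
  ggtitle("Life Expectancy and GDP/capita by Continent")
g4_alt
```

Either form plots the same positions. The choice mainly affects how the x-axis is labeled.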
