Variable assignment
Throughout the exercises in this chapter, you’ll be visualizing a subset of the gapminder data from the year 1952. First, you’ll have to load the ggplot2 package, and create a gapminder_1952 dataset to visualize.
Load the ggplot2 package after the gapminder and dplyr packages. Filter gapminder for observations from the year 1952, and assign it to a new dataset gapminder_1952 using the assignment operator (<-).
HINT To assign a new dataset, start with gapminder_1952 <-, then build the filtered dataset with dplyr’s filter().
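As a minimal sketch of the steps described above, assuming the gapminder, dplyr, and ggplot2 packages are installed, the setup might look like this. The final line is only a sanity check added for illustration and is not part of the exercise.

```r
# Load the packages: gapminder supplies the data, dplyr supplies filter()
library(gapminder)
library(dplyr)
library(ggplot2)

# Keep only the 1952 observations and store them under a new name
gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Sanity check (illustrative): the only year left should be 1952
unique(gapminder_1952$year)
```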
Comparing population and GDP per capita
In the video you learned to create a scatter plot with GDP per capita on the x-axis and life expectancy on the y-axis (the code for that graph is shown here). When you’re exploring data visually, you’ll often need to try different combinations of variables and aesthetics.
Change the scatter plot of gapminder_1952 so that population (pop) is on the x-axis and GDP per capita (gdpPercap) is on the y-axis.
HINT You’ll need to change what’s in aes(): x = gdpPercap becomes x = pop, and y = lifeExp becomes y = gdpPercap.
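One way the change might look is sketched below. The starting plot uses the mapping described in the exercise text (GDP per capita against life expectancy). The object names p_original and p_pop_gdp are hypothetical, chosen only so the two versions can be compared.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Starting point: GDP per capita on the x-axis, life expectancy on the y-axis
p_original <- ggplot(gapminder_1952, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

# Changed version: population on the x-axis, GDP per capita on the y-axis
p_pop_gdp <- ggplot(gapminder_1952, aes(x = pop, y = gdpPercap)) +
  geom_point()

p_pop_gdp
```

Only the aes() call differs between the two plots; the data and the geom_point() layer stay the same.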
Comparing population and life expectancy
In this exercise, you’ll use ggplot2 to create a scatter plot from scratch, to compare each country’s population with its life expectancy in the year 1952.
Create a scatter plot of gapminder_1952 with population (pop) on the x-axis and life expectancy (lifeExp) on the y-axis.
HINT Recall that there are three parts to a ggplot2 call: the data (gapminder_1952), the aesthetic mapping (created with aes(x = …)), and the layer (+ geom_point()).
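The three parts of the call can be sketched as follows. Splitting the plot into base_plot and p_pop_life is an illustrative choice, not something the exercise requires, and both names are hypothetical.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Parts 1 and 2: the data and the aesthetic mapping
base_plot <- ggplot(gapminder_1952, aes(x = pop, y = lifeExp))

# Part 3: the layer that draws one point per country
p_pop_life <- base_plot + geom_point()

p_pop_life
```

Without the geom_point() layer, base_plot would show only empty axes, which is why the layer counts as a separate part of the call.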
Putting the x-axis on a log scale
You previously created a scatter plot with population on the x-axis and life expectancy on the y-axis. Since population is spread over several orders of magnitude, with some countries having a much higher population than others, it’s a good idea to put the x-axis on a log scale.
Change the existing scatter plot (code provided) to put the x-axis (representing population) on a log scale.
HINT Use + to add the scale_x_log10() option to the end of the plot.
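A minimal sketch of this change, assuming the existing plot is the population versus life expectancy scatter plot described above. The names p_linear and p_log are hypothetical, and the range() call is added only to illustrate how widely population varies.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Smallest and largest 1952 populations, to see the spread (illustrative)
range(gapminder_1952$pop)

# Existing plot: population on a linear x-axis
p_linear <- ggplot(gapminder_1952, aes(x = pop, y = lifeExp)) +
  geom_point()

# Same plot with the x-axis on a log10 scale
p_log <- p_linear +
  scale_x_log10()

p_log
```

On the log scale, each step along the x-axis represents a tenfold increase in population, so small and large countries are spread more evenly across the plot.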
Putting the x- and y- axes on a log scale
Suppose you want to create a scatter plot with population on the x-axis and GDP per capita on the y-axis. Both population and GDP per capita are better represented with log scales, since they vary over many orders of magnitude.
Create a scatter plot with population (pop) on the x-axis and GDP per capita (gdpPercap) on the y-axis. Put both the x- and y- axes on a log scale.
HINT After specifying the scatter plot, you’ll need to add not only scale_x_log10(), but also scale_y_log10().
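A possible version with both axes on a log scale is sketched here. The object name p_log_both is hypothetical.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Population against GDP per capita, with both axes on a log10 scale
p_log_both <- ggplot(gapminder_1952, aes(x = pop, y = gdpPercap)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()

p_log_both
```

The two scale functions are independent of each other, so the order in which they are added should not affect the result.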
Adding color to a scatter plot
In this lesson you learned how to use the color aesthetic, which can be used to show which continent each point in a scatter plot represents.
Create a scatter plot with population (pop) on the x-axis, life expectancy (lifeExp) on the y-axis, and with continent (continent) represented by the color of the points. Put the x-axis on a log scale.
HINT In your aes() call, you’ll need to specify x = pop, y = lifeExp, and color = continent. Don’t forget scale_x_log10() at the end.
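The color mapping might be sketched as below. The object name p_color is hypothetical.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Map continent to point color; ggplot2 adds a legend automatically
p_color <- ggplot(gapminder_1952,
                  aes(x = pop, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10()

p_color
```

Because continent is a categorical variable, each continent gets its own discrete color rather than a shade along a gradient.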
Adding size and color to a plot
In the last exercise, you created a scatter plot communicating information about each country’s population, life expectancy, and continent. Now you’ll use the size of the points to communicate even more.
Modify the scatter plot so that the size of the points represents each country’s GDP per capita (gdpPercap).
HINT You need to add size = to the aes() within ggplot().
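Adding the size aesthetic to the plot from the previous exercise could look like this sketch. The object name p_size is hypothetical.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Four variables in one plot: x, y, color, and now point size
p_size <- ggplot(gapminder_1952,
                 aes(x = pop, y = lifeExp,
                     color = continent, size = gdpPercap)) +
  geom_point() +
  scale_x_log10()

p_size
```

Since size = gdpPercap sits inside aes(), point size varies with the data; a size set outside aes() would instead apply one fixed size to every point.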
Creating a subgraph for each continent
You’ve learned to use faceting to divide a graph into subplots based on one of its variables, such as the continent.
Create a scatter plot of gapminder_1952 with the x-axis representing population (pop), the y-axis representing life expectancy (lifeExp), and faceted to have one subplot per continent (continent). Put the x-axis on a log scale.
HINT To facet the graph, add facet_wrap(~ continent) to the end of the code, and don’t forget to add scale_x_log10() as well.
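A faceted version of the plot might be sketched as follows. The object name p_facet is hypothetical.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# One subplot per continent; the ~ means "split by" this variable
p_facet <- ggplot(gapminder_1952, aes(x = pop, y = lifeExp)) +
  geom_point() +
  scale_x_log10() +
  facet_wrap(~ continent)

p_facet
```

By default, facet_wrap() gives all subplots the same axis ranges, which makes it easier to compare continents directly.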
Faceting by year
All of the graphs in this chapter have been visualizing statistics within one year. Now that you’re able to use faceting, however, you can create a graph showing all the country-level data from 1952 to 2007, to understand how global statistics have changed over time.
Create a scatter plot of the gapminder data. Put GDP per capita (gdpPercap) on the x-axis and life expectancy (lifeExp) on the y-axis, with continent (continent) represented by color and population (pop) represented by size. Put the x-axis on a log scale. Facet by the year variable.
HINT You’ll need to add facet_wrap(~ year) to the end of the plot.
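Putting these pieces together, a sketch of the full plot could look like this. It uses the complete gapminder dataset rather than the 1952 subset, and the object name p_by_year is hypothetical.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Use the full dataset so that every year gets its own subplot
p_by_year <- ggplot(gapminder,
                    aes(x = gdpPercap, y = lifeExp,
                        color = continent, size = pop)) +
  geom_point() +
  scale_x_log10() +
  facet_wrap(~ year)

p_by_year
```

Each subplot shows one year of data on shared axes, so changes over time appear as shifts in the point cloud from one panel to the next.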