A line plot is useful for visualizing trends over time. In this exercise, you’ll examine how the median GDP per capita has changed over time.
Use group_by() and summarize() to find the median GDP per capita within each year, calling the output column medianGdpPercap. Use the assignment operator <- to save it to a dataset called by_year. Use the by_year dataset to create a line plot showing the change in median GDP per capita over time. Be sure to use expand_limits(y = 0) to include 0 on the y-axis.
HINT A line plot is created with the geom_line() layer, and has two aesthetics: x and y. You need to set the x-axis to be year and the y-axis to be medianGdpPercap.
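One possible solution is sketched below. It assumes the gapminder data frame comes from the gapminder package, which the exercise text does not state explicitly.

```r
# Assumption: the gapminder data frame is provided by the gapminder package.
library(gapminder)
library(dplyr)
library(ggplot2)

# Median GDP per capita within each year
by_year <- gapminder %>%
  group_by(year) %>%
  summarize(medianGdpPercap = median(gdpPercap))

# Line plot of the yearly medians, with the y-axis extended to include 0
ggplot(by_year, aes(x = year, y = medianGdpPercap)) +
  geom_line() +
  expand_limits(y = 0)
```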
In the last exercise you used a line plot to visualize the increase in median GDP per capita over time. Now you’ll examine the change within each continent.
Use group_by() and summarize() to find the median GDP per capita within each year and continent, calling the output column medianGdpPercap. Use the assignment operator <- to save it to a dataset called by_year_continent. Use the by_year_continent dataset to create a line plot showing the change in median GDP per capita over time, with color representing continent. Be sure to use expand_limits(y = 0) to include 0 on the y-axis.
HINT This is similar to the last exercise, but you’ll need to group_by() both year and continent, and you’ll need to add a third aesthetic of color = continent to the graph.
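A sketch of how this might look, again assuming the data comes from the gapminder package. Grouping by two variables makes summarize() print a short message about the remaining grouping, which is harmless here.

```r
# Assumption: the gapminder data frame is provided by the gapminder package.
library(gapminder)
library(dplyr)
library(ggplot2)

# Median GDP per capita within each year and continent
by_year_continent <- gapminder %>%
  group_by(year, continent) %>%
  summarize(medianGdpPercap = median(gdpPercap))

# One line per continent, distinguished by color
ggplot(by_year_continent, aes(x = year, y = medianGdpPercap, color = continent)) +
  geom_line() +
  expand_limits(y = 0)
```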
A bar plot is useful for visualizing summary statistics, such as the median GDP per capita in each continent.
Use group_by() and summarize() to find the median GDP per capita within each continent in the year 1952, calling the output column medianGdpPercap. Use the assignment operator <- to save it to a dataset called by_continent. Use the by_continent dataset to create a bar plot showing the median GDP per capita in each continent.
HINT A bar plot is created with the geom_col() layer, and it has two aesthetics: x (the category you’re comparing) and y (the statistic controlling the heights of the bars).
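The following is a minimal sketch of one way to do this, assuming the gapminder package supplies the data. It filters to 1952 before grouping so that each continent yields a single row.

```r
# Assumption: the gapminder data frame is provided by the gapminder package.
library(gapminder)
library(dplyr)
library(ggplot2)

# Median GDP per capita within each continent, for 1952 only
by_continent <- gapminder %>%
  filter(year == 1952) %>%
  group_by(continent) %>%
  summarize(medianGdpPercap = median(gdpPercap))

# One bar per continent; bar height is the median
ggplot(by_continent, aes(x = continent, y = medianGdpPercap)) +
  geom_col()
```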
You’ve created a plot where each bar represents one continent, showing the median GDP per capita for each. But the x-axis of the bar plot doesn’t have to be the continent: you can instead create a bar plot where each bar represents a country.
In this exercise, you’ll create a bar plot comparing the GDP per capita between the two countries in the Oceania continent (Australia and New Zealand).
Filter for observations in the Oceania continent in the year 1952. Save this as oceania_1952. Use the oceania_1952 dataset to create a bar plot, with country on the x-axis and gdpPercap on the y-axis.
HINT Remember that when you filter for Oceania, you need quotes around the word: continent == "Oceania".
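A possible solution, sketched under the assumption that the data comes from the gapminder package. Note that the quotes in R code must be straight quotes.

```r
# Assumption: the gapminder data frame is provided by the gapminder package.
library(gapminder)
library(dplyr)
library(ggplot2)

# Keep only the Oceania observations from 1952
oceania_1952 <- gapminder %>%
  filter(continent == "Oceania", year == 1952)

# One bar per country; bar height is GDP per capita
ggplot(oceania_1952, aes(x = country, y = gdpPercap)) +
  geom_col()
```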
A histogram is useful for examining the distribution of a numeric variable. In this exercise, you’ll create a histogram showing the distribution of country populations in the year 1952.
Use the gapminder_1952 dataset (code for generating that dataset is provided) to create a histogram of country population (pop) in the year 1952.
HINT You can specify that you’re creating a histogram by adding the geom_histogram() layer. It needs only one aesthetic: x.
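A minimal sketch follows. The line that builds gapminder_1952 is a guess at the provided code, assuming the data comes from the gapminder package. With no bin settings, geom_histogram() prints a message about its default number of bins.

```r
# Assumption: the gapminder data frame is provided by the gapminder package,
# and gapminder_1952 is simply the 1952 subset of it.
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Histogram of country populations in 1952
ggplot(gapminder_1952, aes(x = pop)) +
  geom_histogram()
```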
In the last exercise, you created a histogram of populations across countries. You might have noticed that a few countries have much higher populations than the rest, which makes the distribution very skewed, with most of it crammed into a small part of the graph. (Consider how hard it is to tell the median or the minimum population from that histogram.)
To make the histogram more informative, you can try putting the x-axis on a log scale.
Use the gapminder_1952 dataset (code is provided) to create a histogram of country population (pop) in the year 1952, putting the x-axis on a log scale with scale_x_log10().
HINT You’ll use the same code that you did in the last exercise, but add scale_x_log10().
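As a sketch, the log-scale version could look like this. As before, the gapminder package and the definition of gapminder_1952 are assumptions about the provided code.

```r
# Assumption: the gapminder data frame is provided by the gapminder package,
# and gapminder_1952 is simply the 1952 subset of it.
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Same histogram, with population on a log10 x-axis
ggplot(gapminder_1952, aes(x = pop)) +
  geom_histogram() +
  scale_x_log10()
```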
A boxplot is useful for comparing a distribution of values across several groups. In this exercise, you’ll examine the distribution of GDP per capita by continent. Since GDP per capita varies across several orders of magnitude, you’ll need to put the y-axis on a log scale.
Use the gapminder_1952 dataset (code is provided) to create a boxplot comparing GDP per capita (gdpPercap) among continents. Put the y-axis on a log scale with scale_y_log10().
HINT A boxplot is created with the geom_boxplot() layer and has two aesthetics: x (the groups that you’re comparing) and y (the value whose distribution you’re considering).
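One way this might be written, assuming the gapminder package and a gapminder_1952 dataset built by filtering for 1952:

```r
# Assumption: the gapminder data frame is provided by the gapminder package,
# and gapminder_1952 is simply the 1952 subset of it.
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# One box per continent, with GDP per capita on a log10 y-axis
ggplot(gapminder_1952, aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  scale_y_log10()
```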
There are many other options for customizing a ggplot2 graph, which you can learn about in other DataCamp courses. You can also learn about them from online resources; looking things up this way is an important skill to develop.
As the final exercise in this course, you’ll practice looking up ggplot2 instructions by completing a task we haven’t shown you how to do.
Add a title to the graph: Comparing GDP per capita across continents. Use a search engine, such as Google or Bing, to learn how to do so.
HINT Try searching add title ggplot2 on your favorite search engine.
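A search like this typically turns up ggtitle(), which is one of several ways to set a title (labs(title = ...) is another). The sketch below applies it to the boxplot from the previous exercise, under the same assumptions about the gapminder package and gapminder_1952.

```r
# Assumption: the gapminder data frame is provided by the gapminder package,
# and gapminder_1952 is simply the 1952 subset of it.
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_1952 <- gapminder %>%
  filter(year == 1952)

# Boxplot from the previous exercise, with a title added via ggtitle()
ggplot(gapminder_1952, aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  scale_y_log10() +
  ggtitle("Comparing GDP per capita across continents")
```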