First, ensure all necessary packages are installed and loaded. We’ll
need ggplot2 for plotting and gapminder for
the dataset.
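A minimal setup sketch follows. It assumes the packages come from CRAN, and it also loads dplyr, which the bar plot section later relies on; the install line is commented out so it only runs when needed.

```r
# Install once per machine (uncomment if any package is missing).
# install.packages(c("ggplot2", "gapminder", "dplyr"))

library(ggplot2)    # plotting
library(gapminder)  # provides the `gapminder` data frame
library(dplyr)      # filtering and summarising (used in later examples)
```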
ggplot2 is a powerful and flexible R package, designed
around the philosophy of tidy data, that makes it easy to create complex
visualizations from data in a data frame. Its syntax differs from that of
traditional plotting functions in R (like plot()), but
ggplot2 follows a consistent set of principles, which makes it
easy to learn even if you are new to it.
Every ggplot2 plot begins with the ggplot() function,
which initializes a plot object. The first argument of
ggplot() is usually a data frame, and the
aes() function is used to define mappings between data
variables and visual properties.
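As a small sketch of this pattern, the snippet below initializes a plot from the gapminder data frame and maps two of its columns to the x and y positions. The object name `p` is only illustrative.

```r
library(ggplot2)
library(gapminder)

# Initialize a plot object: data frame first, then aesthetic mappings.
p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp))

# With no layer added yet, printing p draws only the empty axes.
p

# Adding a geom layer draws the data using the mappings defined above.
p + geom_point()
```

Because the mappings live in ggplot(), every layer added afterwards inherits them unless it overrides them.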
The ggplot grammar, as implemented in the ggplot2 package in R, is a coherent system for describing and building graphs. It is based on the principle that any statistical graphic can be expressed by mapping data to aesthetic attributes (like color, shape, and size) of geometric objects (like points, lines, and bars) in a structured way, allowing for complex and customizable visualizations that are both informative and aesthetically pleasing. This systematized approach enables clear, concise, and consistent representation of data through graphics.
- Facets (facet_wrap, facet_grid).
- Coordinate systems (coord_cartesian, coord_polar, coord_flip).
- Themes (theme, theme_minimal()).

This list gives a concise overview of the essential components of ggplot2’s grammar of graphics.

## Quick dataset exploration
Let’s see the first rows:
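The call that produced the output below is not shown in the text. A likely candidate, assuming the gapminder package is loaded, is head(), which returns the first six rows by default:

```r
library(gapminder)

# Print the first six rows of the dataset.
head(gapminder)
```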
## # A tibble: 6 × 6
##   country     continent  year lifeExp      pop gdpPercap
##   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
## 1 Afghanistan Asia       1952    28.8  8425333      779.
## 2 Afghanistan Asia       1957    30.3  9240934      821.
## 3 Afghanistan Asia       1962    32.0 10267083      853.
## 4 Afghanistan Asia       1967    34.0 11537966      836.
## 5 Afghanistan Asia       1972    36.1 13079460      740.
## 6 Afghanistan Asia       1977    38.4 14880372      786.
We can also check the column data types and dataset dimensions:
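The layout of the output below (row and column counts, then one line per column) matches what dplyr's glimpse() prints, so the sketch here assumes that function was used:

```r
library(gapminder)
library(dplyr)

# Compact overview: dimensions, column names, column types, first values.
glimpse(gapminder)
```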
## Rows: 1,704
## Columns: 6
## $ country   <fct> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", …
## $ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, …
## $ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, …
## $ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 40.8…
## $ pop       <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372, 12…
## $ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134, …
We will create a basic scatter plot of GDP per capita vs life expectancy for the year 2007:
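One way to write this plot is sketched below. The names `gapminder_2007` and `scatter_2007` are illustrative, and dplyr's filter() is assumed for selecting the year.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

# Keep only the observations for 2007 (one row per country).
gapminder_2007 <- filter(gapminder, year == 2007)

# GDP per capita on the x-axis, life expectancy on the y-axis.
scatter_2007 <- ggplot(gapminder_2007, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

scatter_2007
```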
Now let’s plot the relationship between GDP per capita and life expectancy across all years, grouping by continent:
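A possible version, assuming that grouping is expressed by mapping continent to the color aesthetic; the transparency value is an arbitrary choice to reduce overplotting.

```r
library(ggplot2)
library(gapminder)

# Use every year in the dataset and color each point by its continent.
continent_plot <- ggplot(gapminder,
                         aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.6)  # semi-transparent points (arbitrary value)

continent_plot
```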
Change the x-axis to a logarithmic scale to improve the visualization (useful when the range of values is very wide):
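As a sketch, the same kind of scatter plot with a log-scaled x-axis could look like this (the plot name `log_plot` is illustrative):

```r
library(ggplot2)
library(gapminder)

log_plot <- ggplot(gapminder,
                   aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.6) +
  scale_x_log10()  # log10-transform the x-axis (GDP per capita)

log_plot
```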
scale_x_log10(): Transforms the x-axis (GDP per capita)
to a logarithmic scale to better handle wide data ranges and improve
readability.

ggplot2 allows for extensive customization to make your
plots communicate more effectively. Here’s how to customize the
appearance of a plot:
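The original customization code is not shown, so the sketch below is only one plausible set of choices: labels via labs(), a built-in theme, and a legend position set through theme(). All titles and values are illustrative.

```r
library(ggplot2)
library(gapminder)

custom_plot <- ggplot(gapminder,
                      aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.6, size = 2) +
  scale_x_log10() +
  labs(
    title = "Life expectancy vs GDP per capita",
    x = "GDP per capita (log scale)",
    y = "Life expectancy (years)",
    color = "Continent"
  ) +
  theme_minimal() +                    # lighter built-in theme
  theme(legend.position = "bottom")    # adjust a single theme element

custom_plot
```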
Next, we will visualize the change in GDP per capita over time for China.
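A minimal sketch, assuming dplyr's filter() selects the country and geom_line() draws the trend:

```r
library(ggplot2)
library(gapminder)
library(dplyr)

# One row per survey year for China.
china <- filter(gapminder, country == "China")

china_gdp_plot <- ggplot(china, aes(x = year, y = gdpPercap)) +
  geom_line()

china_gdp_plot
```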
Next, we’ll create a line plot to observe the trend of life expectancy over the years for each country in Asia.
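One way to draw one line per country is to map country to color, which also groups the data by country. This is a sketch; with this many countries the legend gets large, so hiding or repositioning it may be worth considering.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

# Keep only the Asian countries.
asia <- filter(gapminder, continent == "Asia")

# Mapping country to color yields one line per country.
asia_plot <- ggplot(asia, aes(x = year, y = lifeExp, color = country)) +
  geom_line()

asia_plot
```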
Bar plots are useful for comparing quantities corresponding to
different groups. Here we plot the average life expectancy per
continent. We first use dplyr to calculate the mean before
passing the data to ggplot().
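The two steps can be sketched as follows. The column name `mean_lifeExp` is illustrative, and geom_col() is used here because the bar heights are already computed.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

# Step 1: average life expectancy per continent (over all years and countries).
mean_life_exp <- gapminder %>%
  group_by(continent) %>%
  summarise(mean_lifeExp = mean(lifeExp))

# Step 2: one bar per continent; geom_col() plots the values as given.
bar_plot <- ggplot(mean_life_exp, aes(x = continent, y = mean_lifeExp)) +
  geom_col()

bar_plot
```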
Finally, we will create a time series plot showing life expectancy over time in India.
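A sketch of such a time series plot; adding geom_point() on top of the line is an optional choice that marks the individual survey years.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

india <- filter(gapminder, country == "India")

india_plot <- ggplot(india, aes(x = year, y = lifeExp)) +
  geom_line() +   # trend over time
  geom_point()    # one marker per observed year

india_plot
```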
Here’s how to visualize the GDP per capita over time for a specific country, say China:
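Since the country is meant to be interchangeable, one option is to wrap the plot in a small function. `plot_gdp_over_time()` is a hypothetical helper written for this sketch, not part of ggplot2 or gapminder.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

# Hypothetical helper: GDP per capita over time for a single country.
plot_gdp_over_time <- function(country_name) {
  country_data <- filter(gapminder, country == country_name)
  ggplot(country_data, aes(x = year, y = gdpPercap)) +
    geom_line() +
    geom_point() +
    labs(
      title = paste("GDP per capita over time:", country_name),
      x = "Year",
      y = "GDP per capita"
    )
}

plot_gdp_over_time("China")
```

Calling the function with another country name from the dataset produces the same plot for that country.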
Facets in ggplot2 are used to split data into multiple small plots based on the values of one or more categorical variables, allowing for easy comparison across groups.
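As an illustration, the sketch below splits the GDP versus life expectancy scatter plot into one panel per continent with facet_wrap(); the choice of continent as the faceting variable is an assumption made for this example.

```r
library(ggplot2)
library(gapminder)

facet_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) +
  scale_x_log10() +
  facet_wrap(~ continent)  # one small plot per continent

facet_plot
```

facet_grid() works similarly but arranges panels in a rows-by-columns layout defined by two variables.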
Let’s recover the plot example3_plot from Example 3 and save it in different formats:
If the file name ends in ‘.pdf’, ggsave() exports the plot as a PDF.
In the same way, we can save it as a PNG image by changing the file extension. The same applies to other formats (jpg, tiff, etc.).
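A sketch using ggsave(), which picks the output device from the file extension. The code for Example 3 is not shown here, so `example3_plot` below is a stand-in built only for illustration; the file names, sizes, and the use of a temporary directory are likewise assumptions.

```r
library(ggplot2)
library(gapminder)

# Stand-in for example3_plot (assumption: the real plot comes from Example 3).
example3_plot <- ggplot(gapminder,
                        aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10()

# Write to a temporary directory so the working directory stays clean.
out_dir  <- tempdir()
pdf_path <- file.path(out_dir, "example3_plot.pdf")
png_path <- file.path(out_dir, "example3_plot.png")

# Width and height are in inches by default.
ggsave(pdf_path, plot = example3_plot, width = 7, height = 5)
ggsave(png_path, plot = example3_plot, width = 7, height = 5, dpi = 300)
```

Passing the plot explicitly through the plot argument avoids relying on whichever plot was displayed last.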
This session provided a brief introduction to the
ggplot2 package and demonstrated how to use it to create
various types of plots. Practice these examples and try modifying the
aesthetics and other parameters.