GGPLOT2 - A Complete Step-by-Step Guide

From Basic Layers to Multi-Dimensional Visuals

Author

Abdullah Al Shamim

Published

February 13, 2026

1. The Grammar of Graphics

ggplot2 builds a graph out of layers. A complete graph draws on the following components; only the first three must be supplied, and the rest fall back to sensible defaults:

  1. Data: Your actual dataset.
  2. Mapping (Aesthetics): Which variables map to which axes (x, y) or colors.
  3. Geometric Objects (Geoms): How the data will be visualized (points, lines, or bars).
  4. Statistics: Statistical calculations based on the data (e.g., Mean, Median).
  5. Facets: Dividing data into smaller panels.
  6. Themes: The style or decoration of the graph.
  7. Labs: Titles and labels.
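
The seven components above can be seen together in a single call. What follows is a minimal sketch rather than part of the original walkthrough: it assumes the mpg dataset that ships with ggplot2, and the chosen columns (displ, hwy, drv) and the use of stat_summary() for the statistics layer are only illustrative.

```r
library(ggplot2)

# Assumption: mpg is the example dataset bundled with ggplot2
p <- ggplot(data = mpg,                           # 1. Data
            mapping = aes(x = displ, y = hwy)) +  # 2. Mapping (aesthetics)
  geom_point(alpha = 0.5) +                       # 3. Geom: one point per car
  stat_summary(fun = mean, geom = "line",
               color = "red") +                   # 4. Statistic: mean hwy at each displ
  facet_wrap(~drv) +                              # 5. Facets: one panel per drive type
  theme_minimal() +                               # 6. Theme
  labs(title = "Highway mileage by engine size",
       x = "Engine displacement (litres)",
       y = "Highway miles per gallon")            # 7. Labs

p  # printing the object draws the plot
```

Removing any of components 4 to 7 still leaves a valid plot, which is why they can be introduced one at a time in the sections that follow.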

Required Libraries

Code
# install.packages("tidyverse")
library(tidyverse)

A Quick Look at the Data

Code
# data()
# ?women 
head(women)
  height weight
1     58    115
2     59    117
3     60    120
4     61    123
5     62    126
6     63    129

2. Line Graphs

Type - 1.A (Basic Mapping)

Here we supply only the data and the axis mapping. This draws the canvas (axes and grid) but no visual marks, because no geom has been added yet.

Code
ggplot(data = women, 
       mapping = aes(x = weight, 
                     y = height))

Now, let’s add lines and points to it:

Code
ggplot(data = women, 
       mapping = aes(x = weight, 
                     y = height)) +
  geom_point() +
  geom_line()

Type - 1.B (Shortened Syntax)

Code
ggplot(women, aes(weight, height)) +
  geom_point() +
  geom_line()

Type - 1.C (Using Pipes %>%)

Code
women %>% 
  ggplot(aes(weight, height)) +
  geom_point() +
  geom_line()

Customization

Line Graphs are ideal for showing changes over a continuous scale.

Code
women %>% 
  ggplot(aes(weight, height)) +
  geom_point(size = 3) +
  geom_line(color = "red")

Code
# Modern method using Piping
women %>% 
  ggplot(aes(x = weight, y = height)) +
  geom_point(size = 3, color = "darkblue") + # Adding points
  geom_line(color = "red", linewidth = 1) +  # Adding lines
  labs(title = "Height vs Weight Relationship",
       x = "Weight", y = "Height") +
  theme_minimal()

Tip: With a small dataset like this one, pair geom_point() with geom_line(); the points show exactly where each observation sits along the line.


3. Box Plots: Data Distribution

Boxplots tell us whether there are outliers in the data and how spread out the data is.
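
To make "spread" and "outliers" concrete, the numbers a boxplot is drawn from can be computed by hand. This is a rough sketch, not the author's method: the object name box_stats and its column names are made up for illustration, and it assumes the common convention (also the ggplot2 default) that values more than 1.5 × IQR beyond the box are drawn as outliers.

```r
library(dplyr)

# chickwts is a built-in R dataset: chick weight by feed type
box_stats <- chickwts %>%
  group_by(feed) %>%
  summarise(
    q1  = quantile(weight, 0.25),  # lower edge of the box
    med = median(weight),          # line inside the box
    q3  = quantile(weight, 0.75),  # upper edge of the box
    iqr = IQR(weight),             # box height, a measure of spread
    # Values beyond 1.5 * IQR from the box would be plotted as outlier points
    n_outliers = sum(weight < q1 - 1.5 * iqr | weight > q3 + 1.5 * iqr)
  )

box_stats
```

Each row of box_stats corresponds to one box in the plots below.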

Code
# View(chickwts)
names(chickwts)
[1] "weight" "feed"  

Step 1: Basic Boxplot

Code
chickwts %>% 
  ggplot(aes(weight, feed)) + # x = weight, y = feed: boxes are drawn horizontally
  geom_boxplot()

Step 2: Adding Color and Transparency

Code
chickwts %>% 
  ggplot(aes(weight, feed, 
             fill = feed)) + 
  geom_boxplot()

Code
chickwts %>% 
  ggplot(aes(weight, feed, 
             fill = feed)) + 
  geom_boxplot(alpha = 0.6) # Increasing color transparency

Step 3: Adding Themes and Labels

Code
chickwts %>% 
  ggplot(aes(weight, feed, fill = feed)) + 
  geom_boxplot(alpha = 0.6) +
  theme_test() +
  labs(x = "Chicken Weight",
       y = "Chicken Feeds")

Code
chickwts %>% 
  ggplot(aes(x = feed, y = weight, fill = feed)) + # Axes swapped: boxes are now vertical
  geom_boxplot(alpha = 0.6) + # alpha < 1 makes the fill semi-transparent
  theme_test() +
  labs(title = "Chicken Weight by Feed Type",
       x = "Feed Type", y = "Weight") +
  theme(legend.position = "none") # Legend is redundant: the x-axis already names each feed


4. Bar Chart: Category Counting

Bar charts are generally used to show the count or frequency of categorical data.

A Quick Look at the Data

Code
# View(starwars)
names(starwars)
 [1] "name"       "height"     "mass"       "hair_color" "skin_color"
 [6] "eye_color"  "birth_year" "sex"        "gender"     "homeworld" 
[11] "species"    "films"      "vehicles"   "starships" 

Step 1: Basic Bar Chart

Code
starwars %>%
  drop_na(eye_color, gender) %>%
  filter(eye_color %in% c("black", "brown", 
                          "blue", "yellow")) %>%
  ggplot(aes(eye_color)) +
  geom_bar()
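
geom_bar() counts the rows in each category for you. As a rough sketch of what happens behind the scenes (the object names eye_counts and p_col are only illustrative), the same bar heights can be reproduced by counting with dplyr first and then plotting the pre-computed values with geom_col():

```r
library(dplyr)
library(ggplot2)

# Same filtering as the bar chart: keep four eye colors, drop missing gender
eye_counts <- starwars %>%
  filter(!is.na(gender),
         eye_color %in% c("black", "brown", "blue", "yellow")) %>%
  count(eye_color, name = "n")  # one row per eye color with its frequency

eye_counts

# geom_col() uses the y values as given; geom_bar() would count rows itself
p_col <- ggplot(eye_counts, aes(x = eye_color, y = n)) +
  geom_col()

p_col
```

Counting first is useful when the frequencies are needed as a table as well as a chart.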

Code
starwars %>%
  drop_na(eye_color, gender) %>%
  filter(eye_color %in% c("black", "brown", 
                          "blue", "yellow")) %>%
  ggplot(aes(eye_color, fill = gender)) +
  geom_bar()

Code
starwars %>%
  drop_na(eye_color, gender) %>%
  filter(eye_color %in% c("black", "brown", 
                          "blue", "yellow")) %>%
  ggplot(aes(eye_color, fill = gender)) +
  geom_bar(alpha = .5) # Increasing color transparency

Code
starwars %>%
  drop_na(eye_color, gender) %>%
  filter(eye_color %in% c("black", "brown", 
                          "blue", "yellow")) %>%
  ggplot(aes(eye_color, fill = gender)) +
  geom_bar(alpha = .5) +
  theme_test() +
  labs(title = "Simple Bar-Chart",
       x = "Eye Colour",
       y = "Count")

Step 2: Stacked Bar-Chart

Code
starwars %>%
  drop_na(eye_color, gender) %>%
  filter(eye_color %in% c("black", "brown", 
                          "blue", "yellow")) %>%
  ggplot(aes(eye_color, fill = gender)) +
  geom_bar(stat = "count", alpha = .5) +
  theme_test() +
  labs(title = "Stacked Bar-Chart",
       x = "Eye Colour",
       y = "Count") +
  theme(legend.position = "top")

Step 3: Grouped Bar-Chart

Code
# Grouped Bar Chart (Dodge Position)
starwars %>%
  drop_na(eye_color, gender) %>%
  filter(eye_color %in% c("black", "brown", 
                          "blue", "yellow")) %>%
  ggplot(aes(eye_color, fill = gender)) +
  geom_bar(stat = "count", alpha = 0.5,
           position = "dodge") + # Legend kept so each gender color can be identified
  theme_test() +
  labs(title = "Grouped Bar-Chart",
       x = "Eye Colour",
       y = "Count")

Difference: position = "stack" is the default for geom_bar(), which is why Step 2 did not need to set it; it places the groups on top of each other. position = "dodge" places them side by side, which makes the groups easier to compare.
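
The two positions can also be written out explicitly and compared on the same base plot. This is a minimal sketch under a few assumptions: the object names (eyes, base, p_stack, p_dodge) are illustrative, and position_dodge(preserve = "single") is an optional refinement, not something the examples above use. It keeps every bar the same width when an eye color is missing one of the genders.

```r
library(dplyr)
library(ggplot2)

# Same subset of starwars as in the bar charts above
eyes <- starwars %>%
  filter(!is.na(gender),
         eye_color %in% c("black", "brown", "blue", "yellow"))

# Shared data and mapping; only the position argument differs below
base <- ggplot(eyes, aes(eye_color, fill = gender))

# Stacked: groups sit on top of each other (the geom_bar() default)
p_stack <- base + geom_bar(position = "stack")

# Dodged: groups sit side by side; preserve = "single" keeps bar widths equal
p_dodge <- base + geom_bar(position = position_dodge(preserve = "single"))

p_stack
p_dodge
```

Stacking makes the total per eye color easy to read, while dodging makes the within-category comparison easy to read.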


5. Scatter Plot: Relationship Analysis

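Scatter plots are where ggplot2's layered grammar pays off most. Using the gapminder dataset, we will build up a plot that displays five variables at once: GDP per capita (x), life expectancy (y), population (size), year (color), and continent (facets).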

Full Grammar of Graphics Example:

Required Library

Code
# install.packages('gapminder')
library(gapminder)

A Quick Look at the Data

Code
# View(gapminder)
head(gapminder, 8)
# A tibble: 8 × 6
  country     continent  year lifeExp      pop gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
1 Afghanistan Asia       1952    28.8  8425333      779.
2 Afghanistan Asia       1957    30.3  9240934      821.
3 Afghanistan Asia       1962    32.0 10267083      853.
4 Afghanistan Asia       1967    34.0 11537966      836.
5 Afghanistan Asia       1972    36.1 13079460      740.
6 Afghanistan Asia       1977    38.4 14880372      786.
7 Afghanistan Asia       1982    39.9 12881816      978.
8 Afghanistan Asia       1987    40.8 13867957      852.

Step 1: Basic Plot

Code
gapminder %>%
  filter(continent %in% c("Asia", "Europe")) %>%
  filter(gdpPercap < 30000) %>%
  ggplot(aes(gdpPercap, lifeExp)) +
  geom_point()

Step 2: Size & Color

Code
gapminder %>%
  filter(continent %in% c("Asia", "Europe")) %>%
  filter(gdpPercap < 30000) %>%
  ggplot(aes(gdpPercap, lifeExp,
             size = pop,
             color = year)) +
  geom_point()

Step 3: Adding Labels

Code
gapminder %>%
  filter(continent %in% c("Asia", "Europe")) %>%
  filter(gdpPercap < 30000) %>%
  ggplot(aes(gdpPercap, lifeExp,
             size = pop,
             color = year)) +
  geom_point() +
  theme_test() +
  labs(title = "Life expectancy explained by GDP per capita",
       x = "GDP per capita",
       y = "Life expectancy")

Step 4: Using Color Palettes

Code
gapminder %>%
  filter(continent %in% c("Asia", "Europe"), gdpPercap < 30000) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, size = pop, color = year)) +
  geom_point(alpha = 0.6) +
  theme_test() +
  scale_color_viridis_c() + # Perceptually uniform color gradient
  labs(title = "Life Expectancy vs GDP per Capita",
       subtitle = "Size = Population | Color = Year",
       x = "GDP per Capita",
       y = "Life Expectancy")

Step 5: Faceting

Code
gapminder %>%
  filter(continent %in% c("Asia", "Europe"), gdpPercap < 30000) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, size = pop, color = year)) +
  geom_point(alpha = 0.6) +
  theme_test() +
  scale_color_viridis_c() + # Beautiful color gradient
  labs(title = "Life Expectancy vs GDP per Capita",
       subtitle = "Faceted by Continent | Size = Population",
       x = "GDP per Capita",
       y = "Life Expectancy") +
  facet_wrap(~continent) # Separate panels for Asia and Europe


Your Toolkit at a Glance (Cheat Sheet):

  • aes(): Creates the skeleton of the graph (Mapping).
  • geom_***(): Creates the muscle or visible parts of the graph (Geometry).
  • facet_wrap(): Divides a large graph into small, meaningful sections (Faceting).
  • theme_***(): Makes the graph visually pleasing or professional (Styling).

You have now worked through the core steps of data visualization with ggplot2, from basic to intermediate level.