Line Charts in ggplot2: The Complete Guide

Visualizing Growth Trends and Grouped Time-Series

Author

Abdullah Al Shamim

Published

February 8, 2026

Introduction

A line chart represents information as a series of data points connected by straight line segments. It is a standard choice for visualizing trends over time, such as growth curves or financial fluctuations.


1. Environment Setup

We will use the tidyverse suite and the built-in Orange dataset. This dataset tracks the growth (trunk circumference, in mm) of five orange trees over roughly four years, with each tree's age recorded in days.

Code
library(tidyverse)

# Loading the built-in dataset
data("Orange")

# Previewing the column names and structure
glimpse(Orange)
Rows: 35
Columns: 3
$ Tree          <ord> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3,…
$ age           <dbl> 118, 484, 664, 1004, 1231, 1372, 1582, 118, 484, 664, 10…
$ circumference <dbl> 30, 58, 87, 115, 120, 142, 145, 33, 69, 111, 156, 172, 2…
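
Before plotting, it can help to confirm how the 35 rows are distributed across the trees. The following is a small sketch using base R only; obs_per_tree is a name introduced here for illustration and is not part of the original walkthrough.

```r
# Count the measurements recorded for each tree
obs_per_tree <- table(Orange$Tree)
obs_per_tree

# List the distinct ages (in days) at which measurements were taken
sort(unique(Orange$age))
```

Every tree should show the same number of measurements, which is why the points line up in vertical columns in the plots that follow.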

2. Building the Line Chart (Step-by-Step)

Step 1: Mapping the Skeleton

Every great line chart begins with the individual observations. We use geom_point() to see where our data sits on the Cartesian plane.

Code
Orange %>%
  ggplot(aes(x = age, y = circumference)) +
  geom_point()

Step 2: Distinguishing Groups (Color)

Because all five trees were measured at the same ages, the ungrouped points stack vertically at each age and cannot be told apart. We map color = Tree to differentiate them.

Code
Orange %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3, alpha = 0.6)

Step 3: Connecting the Dots

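By adding geom_line(), we transform the scatter plot into a trend analysis. Because color is mapped to the discrete variable Tree, ggplot2 draws one line per tree rather than a single line through all 35 points.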

Code
# Combining points and lines for professional clarity
Orange %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3, alpha = 0.6) +
  geom_line(linewidth = 1) +
  theme_minimal()
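
The color mapping is doing two jobs here: it colors the lines and it tells ggplot2 which observations belong together. As a sketch of the grouping on its own, the group aesthetic can keep the five lines separate while drawing them all in one color. The object name p and the color "grey40" are choices made for this illustration.

```r
library(ggplot2)

# One line per tree, all in the same color: `group` separates the series
p <- ggplot(Orange, aes(x = age, y = circumference, group = Tree)) +
  geom_line(colour = "grey40", linewidth = 1) +
  theme_minimal()

p
```

Dropping both color = Tree and group = Tree would instead connect all observations in order of age, producing a zigzag between trees at every measurement date.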


3. Data Refinement: Factors and Labels

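By default, the legend follows the order of the factor levels. In Orange, Tree is an ordered factor whose levels are sorted by maximum circumference (3, 1, 5, 2, 4) rather than numerically. We use mutate() to reset the factor levels and labs() to add professional titles.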

Code
Orange %>%
  # Ensuring the Tree legend is ordered 1, 2, 3, 4, 5
  mutate(Tree = factor(Tree, levels = c("1", "2", "3", "4", "5"))) %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 4, alpha = 0.5) +
  geom_line(linewidth = 1) +
  labs(title = "Orange Tree Growth Analysis",
       subtitle = "Circumference as a Function of Age (Days)",
       x = "Tree Age (Days)",
       y = "Tree Circumference (mm)",
       color = "Tree Number",
       caption = "Data Source: R Datasets | Prepared by Abdullah Al Shamim") +
  theme_bw()
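
The legend order comes straight from the factor levels, which can be inspected directly. This is a minimal sketch, assuming the Orange dataset as shipped with R's datasets package; tree_sorted is a name introduced here for illustration.

```r
# Levels as shipped with the dataset (not in numeric order)
levels(Orange$Tree)
# "3" "1" "5" "2" "4"

# Levels after re-declaring them in numeric order
tree_sorted <- factor(Orange$Tree, levels = c("1", "2", "3", "4", "5"))
levels(tree_sorted)
# "1" "2" "3" "4" "5"
```

Only the level order changes; each row still carries the same tree label as before.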


4. Advanced Data Control: Filtering

Often, you only want to compare specific trees or exclude outliers. We can pipe a filter() function directly into our visualization.

Code
# Focus only on trees 1, 2, and 5
Orange %>%
  mutate(Tree = factor(Tree, levels = c("1", "2", "3", "4", "5"))) %>%
  filter(Tree %in% c("1", "2", "5")) %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3) +
  geom_line() +
  labs(title = "Growth Comparison: Selected Trees") +
  theme_test()

Code
# Plotting all trees except Tree 4
Orange %>%
  mutate(Tree = factor(Tree, levels = c("1", "2", "3", "4", "5"))) %>%
  filter(Tree != "4") %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3) +
  geom_line() +
  labs(title = "Growth Comparison (Excluding Tree 4)") +
  theme_test()
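
One detail worth knowing when filtering on a factor: filter() removes the rows but keeps the unused levels in the data. ggplot2's discrete scales drop unused levels from the legend by default, so the plots above are unaffected, but other summaries may still list the filtered-out trees. The sketch below shows the difference; subset_trees is a name introduced here for illustration.

```r
library(dplyr)

# Keep only trees 1, 2, and 5
subset_trees <- Orange %>%
  mutate(Tree = factor(Tree, levels = c("1", "2", "3", "4", "5"))) %>%
  filter(Tree %in% c("1", "2", "5"))

# The factor still remembers all five levels
nlevels(subset_trees$Tree)

# droplevels() discards the levels that no longer occur in the data
nlevels(droplevels(subset_trees$Tree))
```

The first call should report five levels and the second three.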


Systemic Summary Toolkit

| Function     | Role              | Why use it?                              |
|--------------|-------------------|------------------------------------------|
| geom_point() | Observation Layer | Highlights exact data points.            |
| geom_line()  | Trend Layer       | Connects observations to show growth.    |
| color = Tree | Grouping          | Separates multiple lines in one plot.    |
| mutate()     | Factor Control    | Ensures the legend is correctly ordered. |
| filter()     | Subset Focus      | Zooms in on relevant information.        |