A line chart represents information as a series of data points connected by straight line segments. It is a standard choice for visualizing trends over time, such as growth curves or financial fluctuations.
1. Environment Setup
We will use the tidyverse suite and the built-in Orange dataset. This dataset tracks the growth of five orange trees over several years: circumference holds the trunk circumference in mm, and age holds the tree's age in days.
Code
library(tidyverse)

# Loading the built-in dataset
data("Orange")

# Previewing the column names and structure
glimpse(Orange)
Step 1: Plotting the Individual Observations
Every great line chart begins with the individual observations. We use geom_point() to see where our data sits on the Cartesian plane.
Code
Orange %>%
  ggplot(aes(x = age, y = circumference)) +
  geom_point()
Step 2: Distinguishing Groups (Color)
All five trees were measured at the same ages, so an ungrouped plot stacks their points vertically at each age with no way to tell the trees apart. We map color = Tree to differentiate them.
Code
Orange %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3, alpha = 0.6)
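To see why uncolored points pile up, a quick check can count how many measurements share each age. This is a minimal sketch: the name age_counts is our own, and as_tibble() is added here only to strip the extra classes that the built-in data frame carries.

```r
library(tidyverse)

data("Orange")

# Count how many measurements were taken at each age
age_counts <- Orange %>%
  as_tibble() %>%
  count(age)

# Seven distinct ages, each shared by all five trees
age_counts
```

Each age appears five times, once per tree, which is why the points line up in vertical columns until color separates them.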
Step 3: Connecting the Dots
By adding geom_line(), we transform the scatter plot into a trend analysis. Because color = Tree is set in the shared aes(), geom_line() draws one separate line per tree.
Code
# Combining points and lines for professional clarity
Orange %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3, alpha = 0.6) +
  geom_line(linewidth = 1) +
  theme_minimal()
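If you want one line per tree without using color, the grouping can be stated explicitly with the group aesthetic. The sketch below is an illustration rather than part of the original walkthrough; the object name p_grouped and the grey color are our own choices.

```r
library(tidyverse)

data("Orange")

# One line per tree, all drawn in the same color.
# Without group = Tree, geom_line() would join all 35 points
# in order of age and produce a zigzag.
p_grouped <- Orange %>%
  ggplot(aes(x = age, y = circumference, group = Tree)) +
  geom_line(linewidth = 1, color = "grey40") +
  theme_minimal()

p_grouped
```

Mapping color = Tree, as in the step above, sets this grouping implicitly, so group is only needed when no other discrete aesthetic is mapped.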
3. Data Refinement: Factors and Labels
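In the Orange dataset, Tree is stored as an ordered factor whose levels follow each tree's maximum size rather than its number, so the legend does not appear in numeric order by default. We use mutate() to reset the factor levels and labs() to add professional titles.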
Code
Orange %>%
  # Ensuring the Tree legend is ordered 1, 2, 3, 4, 5
  mutate(Tree = factor(Tree, levels = c("1", "2", "3", "4", "5"))) %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 4, alpha = 0.5) +
  geom_line(linewidth = 1) +
  labs(
    title = "Orange Tree Growth Analysis",
    subtitle = "Circumference as a Function of Age (Days)",
    x = "Tree Age (Days)",
    y = "Tree Circumference (mm)",
    color = "Tree Number",
    caption = "Data Source: R Datasets | Prepared by Abdullah Al Shamim"
  ) +
  theme_bw()
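The effect of the factor() call can be checked on its own, without drawing anything. This short sketch uses only base R; the name tree_numeric is hypothetical.

```r
data("Orange")

# Default level order in the built-in dataset
levels(Orange$Tree)
# "3" "1" "5" "2" "4"

# Re-declare the levels in numeric order; the values themselves are unchanged
tree_numeric <- factor(Orange$Tree, levels = c("1", "2", "3", "4", "5"))
levels(tree_numeric)
# "1" "2" "3" "4" "5"
```

Only the level order changes, which is what ggplot2 uses to arrange the legend entries.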
4. Advanced Data Control: Filtering
Often, you only want to compare specific trees or exclude outliers. We can pipe a filter() function directly into our visualization.
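A minimal sketch of this pattern is shown below. The selection of trees 1, 3, and 5 is arbitrary and chosen only for illustration, and the names selected_trees and p_filtered are our own.

```r
library(tidyverse)

data("Orange")

# Hypothetical selection: keep only three of the five trees
selected_trees <- c("1", "3", "5")

p_filtered <- Orange %>%
  filter(Tree %in% selected_trees) %>%
  ggplot(aes(x = age, y = circumference, color = Tree)) +
  geom_point(size = 3, alpha = 0.6) +
  geom_line(linewidth = 1) +
  theme_minimal()

p_filtered
```

Because filter() runs before ggplot(), the plot only ever receives the rows that pass the condition, so the original Orange data frame is left untouched.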