Program 3

Author

Manoj

Implement an R function to generate a line graph depicting the trend of a time-series dataset, with separate lines for each group, utilizing ggplot2’s group aesthetic.

Introduction

This document demonstrates how to create a time-series line graph using the built-in AirPassengers dataset in R.

The dataset contains monthly airline passenger counts from 1949 to 1960. We will use ggplot2 to visualize trends, with separate lines for each year.

Step 1: Load Necessary Libraries

library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.1.3
library(dplyr)
Warning: package 'dplyr' was built under R version 4.1.3

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(tidyr)
Warning: package 'tidyr' was built under R version 4.1.3

Step 2: Load the Built-in AirPassengers Dataset

The AirPassengers dataset is a time series object in R.
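To see what being a time series object means in practice, the short sketch below inspects the series with base R functions only. It is not needed for the plot; it simply confirms the structure that the conversion relies on.

```r
# Inspect the built-in time series before converting it
class(AirPassengers)      # the object's class, "ts"
start(AirPassengers)      # first observation as c(year, period)
end(AirPassengers)        # last observation as c(year, period)
frequency(AirPassengers)  # observations per year (12 for monthly data)
length(AirPassengers)     # total number of observations
```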

ggplot2 works with data frames, so we first convert the series into a data frame with three columns:

  • Date: Represents the month and year (from January 1949 to December 1960).

  • Passengers: Monthly airline passenger counts.

  • Year: Extracted year from the date column, which will be used to group the data.

# Build the monthly date sequence once and reuse it for both Date and Year
dates <- seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))

# Convert time-series data to a dataframe
data <- data.frame(
  Date = dates,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(dates, "%Y"))
)

# Display the first 20 rows
head(data, n = 20)
         Date Passengers Year
1  1949-01-01        112 1949
2  1949-02-01        118 1949
3  1949-03-01        132 1949
4  1949-04-01        129 1949
5  1949-05-01        121 1949
6  1949-06-01        135 1949
7  1949-07-01        148 1949
8  1949-08-01        148 1949
9  1949-09-01        136 1949
10 1949-10-01        119 1949
11 1949-11-01        104 1949
12 1949-12-01        118 1949
13 1950-01-01        115 1950
14 1950-02-01        126 1950
15 1950-03-01        141 1950
16 1950-04-01        135 1950
17 1950-05-01        125 1950
18 1950-06-01        149 1950
19 1950-07-01        170 1950
20 1950-08-01        170 1950
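Because the dates are built by hand rather than read from the series, a quick consistency check can be worthwhile. The sketch below is one possible check, not part of the original program: it rebuilds the same data frame under the hypothetical name check_df, compares the month of each date with the position reported by cycle(), and counts the observations per year.

```r
# Rebuild the Step 2 data frame so this check runs on its own
# (check_df is a name used only in this sketch)
dates <- seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))
check_df <- data.frame(
  Date = dates,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(dates, "%Y"))
)

# Month position (1 to 12) as recorded by the ts object itself
ts_month <- as.integer(cycle(AirPassengers))

# TRUE if the hand-built dates line up with the series
months_match <- all(as.integer(format(check_df$Date, "%m")) == ts_month)
months_match

# Each year should contribute exactly 12 monthly observations
obs_per_year <- table(check_df$Year)
obs_per_year
```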

Step 3: Define a Function for Time-Series Line Graph

We define a function to create a time-series line graph where:

  • The x-axis represents time (Date).

  • The y-axis represents the number of passengers (Passengers).

  • Each year has a separate line to compare trends.

Function Inputs

  1. data – The dataset containing time-series data.

  2. x_col – The column representing time (Date).

  3. y_col – The column representing values (Passengers).

  4. group_col – The categorical variable for grouping (Year).

  5. title – Custom plot title.

Features of the Line Graph

  • Group-based visualization: each year has a distinct line color, and the group aesthetic ensures that a separate line is drawn for each year.

  • geom_line() with a line width of 1.2: connects each year's monthly observations with straight segments so the trend is easy to follow.

  • geom_point(size = 2): highlights the individual data points.

  • theme_minimal() and theme(legend.position = "top"): give the plot a clean layout and move the legend to the top.
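To isolate what the group aesthetic does, the sketch below builds two plots without any color mapping, one with group = Year and one without, and counts how many separate paths ggplot2 draws in each. A discrete color mapping would also split the lines, so color is left out here on purpose. The names df, p_grouped and p_single are used only in this illustration, and layer_data() is used to read back the computed grouping.

```r
library(ggplot2)

# Same data frame as in Step 2, rebuilt so the example runs on its own
dates <- seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))
df <- data.frame(
  Date = dates,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(dates, "%Y"))
)

# With group = Year, each year is drawn as its own path
p_grouped <- ggplot(df, aes(x = Date, y = Passengers, group = Year)) +
  geom_line()

# Without a grouping variable, all observations form a single path
p_single <- ggplot(df, aes(x = Date, y = Passengers)) +
  geom_line()

# Count the distinct groups ggplot2 computed for the line layer
n_grouped <- length(unique(layer_data(p_grouped)$group))
n_single  <- length(unique(layer_data(p_single)$group))
c(grouped = n_grouped, single = n_single)
```

In the grouped plot the line breaks at each year boundary, while the ungrouped plot joins December to the following January.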

# Function to plot time-series trend
# x_col, y_col and group_col are column names passed as strings
plot_time_series <- function(data, x_col, y_col, group_col, title = "Air Passenger Trends") {
  # aes_string() is deprecated; the .data pronoun looks up columns by name inside aes()
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]],
                   color = .data[[group_col]], group = .data[[group_col]])) +
    geom_line(linewidth = 1.2) +  # Line graph (linewidth replaces size in ggplot2 3.4.0+)
    geom_point(size = 2) +        # Add points for clarity
    labs(title = title,
         x = "Year",
         y = "Number of Passengers",
         color = "Year") +  # Legend title
    theme_minimal() +
    theme(legend.position = "top")
}

# Call the function
plot_time_series(data, "Date", "Passengers", "Year", "Trend of Airline Passengers Over Time")
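With Date on the x-axis, each year's line covers its own stretch of the axis, so the lines sit end to end. If the aim is to compare years against each other, one possible variation is to put the calendar month on the x-axis so that the yearly lines overlap. The sketch below is an illustration under that assumption, not part of the original program; the Month column and the names df_month and p_month are introduced here only for the example, and linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# Rebuild the data with an extra Month column (hypothetical, for this sketch only)
dates <- seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))
month_num <- as.integer(format(dates, "%m"))
df_month <- data.frame(
  Month = factor(month.abb[month_num], levels = month.abb),  # Jan to Dec, in calendar order
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(dates, "%Y"))
)

# Month is discrete, so group = Year is what tells geom_line()
# to connect the twelve points that belong to the same year
p_month <- ggplot(df_month, aes(x = Month, y = Passengers, color = Year, group = Year)) +
  geom_line(linewidth = 1) +
  labs(title = "Monthly Airline Passengers by Year",
       x = "Month",
       y = "Number of Passengers",
       color = "Year") +
  theme_minimal() +
  theme(legend.position = "top")

p_month
```

Here the group aesthetic is essential: with a discrete x-axis, ggplot2 would otherwise treat every month and year combination as its own group and would have no line to draw between them.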