Implement an R function that generates a line graph showing the trend of a time-series dataset, with a separate line for each group, using the ggplot2 group aesthetic.
Steps:
Step 1: Loading the necessary Libraries
Step 2: Load the built-in dataset AirPassengers and convert it into a data frame
Step 3: Understand the data structure
Step 4: Define a function to create a grouped Time-Series Line Graph
Step 5: Call the function to generate the plot
Discussion and some questions to ponder
Step 1: Loading the necessary Libraries
We load:
ggplot2: used for creating the line plot
dplyr: for optional data handling (filtering and summarization)
tidyr: for optional reshaping
library(ggplot2)
library(dplyr)
Warning: package 'dplyr' was built under R version 4.5.2
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(tidyr)
Step 2: Load the built-in dataset AirPassengers and convert it into a data frame
AirPassengers is a time-series object, not a data frame, whereas ggplot2 expects the data in tabular format:
each row should be an observation
each column should be one variable
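Before converting anything, it can help to look at what the time-series object actually contains. The following is a minimal sketch that uses only base R functions; AirPassengers ships with the datasets package, so nothing extra needs to be installed.

```r
# Inspect the built-in time-series object before converting it
class(AirPassengers)           # "ts": a time-series object, not a data frame
start(AirPassengers)           # first observation: year 1949, period 1 (January)
end(AirPassengers)             # last observation: year 1960, period 12 (December)
frequency(AirPassengers)       # 12 observations per year, i.e. monthly data
length(AirPassengers)          # 144 monthly values in total

# ggplot() works on data frames, and a ts object is not one
is.data.frame(AirPassengers)   # FALSE
```

The start, end, and frequency values are what make it possible to rebuild a matching date column in the conversion step.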
So we build a data frame with:
Date: a sequence of monthly dates (the first day of each month) from 01 Jan 1949 to 01 Dec 1960
Passengers: the numeric values from the time series
Year: extracted from the date and used as the grouping variable
# Create a monthly date sequence that matches the length of AirPassengers
date_seq = seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))

# Convert the time-series object into a data frame for ggplot2:
# one row per month with the number of passengers, from 1949 to 1960
data = data.frame(
  Date = date_seq,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(date_seq, "%Y"))
)

head(data, n = 15)
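One way to confirm that the converted data frame has the expected shape is sketched below. It rebuilds the same data frame so that the block runs on its own, then uses base R helpers to check the column types, the date range, and the number of rows per year. The choice of checks is an assumption about what is worth verifying, not part of the original steps.

```r
# Rebuild the data frame exactly as in Step 2 so this block is self-contained
date_seq = seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))
data = data.frame(
  Date = date_seq,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(date_seq, "%Y"))
)

str(data)            # 144 observations of 3 variables: Date, numeric, factor
range(data$Date)     # "1949-01-01" "1960-12-01"
nlevels(data$Year)   # 12 distinct years, so 12 groups for the plot
table(data$Year)     # 12 rows (one per month) for every year
```

Because Year is a factor with 12 levels, mapping it to the group aesthetic should yield one line per year.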
aes_string() was deprecated in ggplot2 3.0.0, so the function below passes the column names to aes() through the .data pronoun instead.
# Column names are given as strings and looked up with .data[[...]]
plot_time_series = function(data, x_col, y_col, group_col, title = "Air passengers trend analysis from 1949 to 1960")
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]], color = .data[[group_col]], group = .data[[group_col]])) +
  geom_line(linewidth = 3)
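To see the function in use and to check that the group aesthetic really produces one line per year, a sketch along the following lines may help. It is self-contained, so it rebuilds the data frame and defines a compact stand-in for plot_time_series(). The stand-in maps columns with aes() and the .data pronoun, omits the title argument, and leaves the line width at its default; these simplifications are assumptions made for the example.

```r
library(ggplot2)

# Same data frame as in Step 2
date_seq = seq(as.Date("1949-01-01"), by = "month", length.out = length(AirPassengers))
data = data.frame(
  Date = date_seq,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(date_seq, "%Y"))
)

# Compact stand-in for plot_time_series(): column names arrive as strings
# and are looked up through the .data pronoun inside aes()
plot_time_series = function(data, x_col, y_col, group_col) {
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]],
                   color = .data[[group_col]], group = .data[[group_col]])) +
    geom_line()
}

p = plot_time_series(data, "Date", "Passengers", "Year")

# layer_data() returns the computed data for a layer without drawing the plot
built = layer_data(p, 1)
n_lines = length(unique(built$group))
n_lines   # 12: one line per year

print(p)  # draw the plot
```

Each year becomes its own line segment along the date axis, coloured by year, because both color and group are mapped to the Year column.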