Implement an R function to generate a line graph depicting the trend of a time-series data set, with separate lines for each group, utilizing ggplot2’s group aesthetic.
This document demonstrates how to create a time-series line graph using the built-in AirPassengers dataset in R.
- The dataset contains monthly airline passenger counts from 1949 to 1960.
- We will convert the time-series object into a data frame (because ggplot2 works best with data frames).
- We will visualize passenger trends over time using ggplot2.
- We will draw separate lines for each year using the group aesthetic (and color for easy comparison).
We load:
- ggplot2 to create the line plot.
- dplyr for optional data handling (filtering, summarising, etc.).
- tidyr for optional reshaping (not strictly required here, but commonly used in tidy workflows).
library(ggplot2)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(tidyr)
Load the AirPassengers dataset and convert it to a data frame. AirPassengers is a time-series (ts) object, not a data frame.
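To make the point about the object type concrete, here is a small sketch that inspects AirPassengers with base R accessors. It only assumes the copy of the dataset that ships with R (package datasets); the comments show the values expected for that copy.

```r
# AirPassengers is bundled with R, so nothing needs to be downloaded.
class(AirPassengers)          # "ts"
is.data.frame(AirPassengers)  # FALSE
start(AirPassengers)          # 1949 1   (first year, first month)
end(AirPassengers)            # 1960 12  (last year, last month)
frequency(AirPassengers)      # 12 observations per year
length(AirPassengers)         # 144 monthly values
```

These accessors describe the time axis that the conversion to a data frame has to reproduce.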
ggplot2 expects data in a tabular structure, where:
- each row is one observation
- each column is one variable
So we build a data frame with:
- Date: a sequence of monthly dates from 1949-01 to 1960-12.
- Passengers: the numeric values from the time series.
- Year: extracted from the date, used as the grouping variable (factor).
# Create a monthly date sequence that matches the length of AirPassengers
date_seq <- seq(
  as.Date("1949-01-01"),
  by = "month",
  length.out = length(AirPassengers)
)
# Convert the time-series object into a data frame for ggplot2
data <- data.frame(
  Date = date_seq,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(date_seq, "%Y"))
)
# Display the first few rows
head(data, n = 20)
Date Passengers Year
1 1949-01-01 112 1949
2 1949-02-01 118 1949
3 1949-03-01 132 1949
4 1949-04-01 129 1949
5 1949-05-01 121 1949
6 1949-06-01 135 1949
7 1949-07-01 148 1949
8 1949-08-01 148 1949
9 1949-09-01 136 1949
10 1949-10-01 119 1949
11 1949-11-01 104 1949
12 1949-12-01 118 1949
13 1950-01-01 115 1950
14 1950-02-01 126 1950
15 1950-03-01 141 1950
16 1950-04-01 135 1950
17 1950-05-01 125 1950
18 1950-06-01 149 1950
19 1950-07-01 170 1950
20 1950-08-01 170 1950
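The conversion above hard-codes the start date 1949-01-01. As an alternative sketch, the same columns can be derived from the metadata stored in the ts object, which may help when reusing the approach on another monthly series. The names x, first, freq, idx and data_alt are illustrative and not part of the original code, and the date construction assumes monthly data (frequency 12).

```r
x <- AirPassengers
first <- start(x)         # c(1949, 1): first year and first month
freq <- frequency(x)      # 12 for monthly data
idx <- seq_along(x) - 1   # 0-based position of each observation

# cycle() gives the position within the year (1..12 for monthly data).
month <- as.integer(cycle(x))
# Whole-number arithmetic on positions, instead of flooring the
# fractional years returned by time(x).
year <- as.integer(first[1] + (first[2] - 1 + idx) %/% freq)

data_alt <- data.frame(
  Date = as.Date(sprintf("%d-%02d-01", year, month)),
  Passengers = as.numeric(x),
  Year = factor(year)
)
head(data_alt, n = 3)
```

For AirPassengers this should produce the same Date, Passengers and Year columns as the hard-coded version.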
Before plotting, it helps to confirm:
- the types of the columns (Date, numeric, factor)
- the range of dates
- how many months per year we have
str(data)
'data.frame': 144 obs. of 3 variables:
$ Date : Date, format: "1949-01-01" "1949-02-01" ...
$ Passengers: num 112 118 132 129 121 135 148 148 136 119 ...
$ Year : Factor w/ 12 levels "1949","1950",..: 1 1 1 1 1 1 1 1 1 1 ...
# Check the earliest and latest dates
range(data$Date)
[1] "1949-01-01" "1960-12-01"
# How many months per year? (Should be 12 for each year)
table(data$Year)
1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
12 12 12 12 12 12 12 12 12 12 12 12
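The checks above are read by eye. As a hedged sketch, the same expectations can also be written as assertions with stopifnot(), so the script stops early if a column has the wrong type or a year is incomplete. The block rebuilds the data frame so that it runs on its own; the particular set of checks is an illustrative choice.

```r
# Rebuild the data frame exactly as in the conversion step
date_seq <- seq(as.Date("1949-01-01"), by = "month",
                length.out = length(AirPassengers))
data <- data.frame(
  Date = date_seq,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(date_seq, "%Y"))
)

# stopifnot() raises an error at the first condition that is not TRUE
stopifnot(
  inherits(data$Date, "Date"),   # Date column really is a Date
  is.numeric(data$Passengers),   # values are numeric
  is.factor(data$Year),          # grouping variable is a factor
  !anyNA(data$Passengers),       # no missing passenger counts
  all(diff(data$Date) > 0),      # dates strictly increasing
  all(table(data$Year) == 12)    # every year has 12 months
)
```

If all conditions hold, the call returns silently and the script continues.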
A function helps us reuse the same plotting logic for other time-series datasets later. Instead of rewriting the plot code again and again, we write it once and call it with different inputs.
1. data: the data frame with time-series values.
2. x_col: name of the time column (e.g. "Date").
3. y_col: name of the numeric column (e.g. "Passengers").
4. group_col: name of the grouping column (e.g. "Year").
5. title: custom plot title.
plot_time_series <- function(data, x_col, y_col, group_col, title = "Air Passenger Trends") {
  ggplot(
    data,
    aes_string(
      x = x_col,
      y = y_col,
      color = group_col,
      group = group_col
    )
  ) +
    geom_line(size = 1.2) +
    geom_point(size = 2) +
    labs(
      title = title,
      x = "Date",
      y = "Number of Passengers",
      color = "Year"
    ) +
    theme_minimal() +
    theme(legend.position = "top")
}
Here we use the function we created.
We pass:
- "Date" as the time variable
- "Passengers" as the values to plot
- "Year" as the grouping variable
plot_time_series(
  data,
  "Date",
  "Passengers",
  "Year",
  "Trend of Airline Passengers Over Time"
)
Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
ℹ Please use tidy evaluation idioms with `aes()`.
ℹ See also `vignette("ggplot2-in-packages")` for more information.
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
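Both warnings point at replacements that ggplot2 itself suggests. The following is a minimal sketch of how the function could be updated, assuming ggplot2 3.4.0 or later: aes() with the .data pronoun replaces aes_string(), and linewidth replaces size for lines. The name plot_time_series_v2 is hypothetical. Using the column names as axis and legend labels, instead of the fixed "Date", "Number of Passengers" and "Year", is also a change of mine to make the function less tied to this dataset.

```r
library(ggplot2)

# Hypothetical updated version; not the function from the original document.
plot_time_series_v2 <- function(data, x_col, y_col, group_col,
                                title = "Air Passenger Trends") {
  ggplot(
    data,
    aes(
      x = .data[[x_col]],          # look up columns by name (string)
      y = .data[[y_col]],
      color = .data[[group_col]],
      group = .data[[group_col]]
    )
  ) +
    geom_line(linewidth = 1.2) +   # linewidth replaces size for lines
    geom_point(size = 2) +         # size is still correct for points
    labs(title = title, x = x_col, y = y_col, color = group_col) +
    theme_minimal() +
    theme(legend.position = "top")
}

# Rebuild the example data so this block runs on its own
date_seq <- seq(as.Date("1949-01-01"), by = "month",
                length.out = length(AirPassengers))
data <- data.frame(
  Date = date_seq,
  Passengers = as.numeric(AirPassengers),
  Year = as.factor(format(date_seq, "%Y"))
)

p <- plot_time_series_v2(data, "Date", "Passengers", "Year",
                         "Trend of Airline Passengers Over Time")
p
```

With these two changes the plot should render without either deprecation warning, while the grouping and coloring by Year stay the same.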