Implement an R function to generate a time-series line graph depicting the trend of air pollution (PM2.5 levels) over time for each city group, using the ggplot2 group aesthetic.
Introduction
This document demonstrates how to create a time-series line graph using air quality data.
The dataset is obtained from IQAir.
It contains PM2.5 values for multiple cities across the years 2017–2025.
We convert the dataset into a long-format data frame suitable for visualization.
We visualize pollution trends over time using ggplot2.
We draw a separate line for each city using the group aesthetic.
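To illustrate what the group aesthetic does before working with the real data, here is a minimal sketch on a made-up table. The `toy` data frame, its city names, and its values are hypothetical and are not taken from the IQAir dataset.

```r
library(ggplot2)

# Hypothetical toy data: two made-up cities, three years each
toy <- data.frame(
  City = rep(c("City A", "City B"), each = 3),
  Year = rep(2017:2019, times = 2),
  PM25 = c(50, 55, 60, 80, 75, 70)
)

# group = City tells geom_line() to connect points within each city only,
# so the plot gets one line per city instead of a single zigzag line
p <- ggplot(toy, aes(x = Year, y = PM25, group = City, colour = City)) +
  geom_line() +
  geom_point()

print(p)
```

Without `group = City` (or another aesthetic such as `colour` that implies grouping), `geom_line()` would join all points in order of `Year` as one series.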
Step 1: Load Necessary Libraries
We load:
ggplot2 → for visualization
dplyr → for data manipulation
tidyr → for reshaping data
readxl → to import Excel dataset
library(ggplot2)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(tidyr)
library(readxl)
Step 2: Load Dataset and Convert to DataFrame
We import the Excel dataset and prepare it for analysis.
data <- read_excel("C:/Users/Yalaguresh/Downloads/air_quality_100.xlsx")
head(data)
The dataset is in wide format (years as columns). We convert it into long format for ggplot.
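To make the wide-to-long reshape concrete, the following sketch applies `pivot_longer()` to a tiny made-up table. The `wide` data frame and its values are hypothetical. It also uses the `names_transform` argument (available in tidyr 1.1.0 and later) as one possible way to convert `Year` during the reshape rather than in a separate step.

```r
library(tidyr)

# Hypothetical wide table: one row per city, one column per year (made-up values)
wide <- data.frame(
  City = c("City A", "City B"),
  `2017` = c(50, 80),
  `2018` = c(55, 75),
  check.names = FALSE  # keep the column names as "2017" and "2018"
)

# Each year column becomes a (Year, PM25) pair of rows
long <- pivot_longer(
  wide,
  cols = `2017`:`2018`,
  names_to = "Year",
  values_to = "PM25",
  names_transform = list(Year = as.integer)  # convert Year while reshaping
)

print(long)
```

The result has one row per city-year combination, which is the shape ggplot2 expects when mapping `Year` to the x-axis and `PM25` to the y-axis.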
data_long <- data %>%
  pivot_longer(
    cols = `2017`:`2025`,
    names_to = "Year",
    values_to = "PM25"
  )

# Convert Year to numeric
data_long$Year <- as.numeric(data_long$Year)
head(data_long)
# A tibble: 6 × 5
   Rank City  Country  Year  PM25
  <dbl> <chr> <chr>   <dbl> <dbl>
1     1 Loni  India    2017  41.4
2     1 Loni  India    2018 108.
3     1 Loni  India    2019  43.2
4     1 Loni  India    2020  61.1
5     1 Loni  India    2021  84.2
6     1 Loni  India    2022 116.
# Summary statistics for each column
summary(data_long)
      Rank            City             Country               Year
 Min.   :  1.00   Length:900         Length:900         Min.   :2017
 1st Qu.: 25.75   Class :character   Class :character   1st Qu.:2019
 Median : 50.50   Mode  :character   Mode  :character   Median :2021
 Mean   : 50.50                                         Mean   :2021
 3rd Qu.: 75.25                                         3rd Qu.:2023
 Max.   :100.00                                         Max.   :2025
      PM25
 Min.   : 40.00
 1st Qu.: 58.48
 Median : 78.40
 Mean   : 78.98
 3rd Qu.: 99.33
 Max.   :119.90
# Check year range
range(data_long$Year, na.rm = TRUE)
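Beyond the overall year range, it can be useful to confirm that every city has a value for every year before plotting. The sketch below shows one possible per-city check. To keep it runnable on its own, it builds a small hypothetical stand-in for `data_long` with made-up values (including one deliberate `NA`); with the real data, that stand-in would be skipped.

```r
library(dplyr)

# Hypothetical stand-in for data_long (made-up values) so the sketch runs alone
data_long <- data.frame(
  City = rep(c("City A", "City B"), each = 3),
  Year = rep(2017:2019, times = 2),
  PM25 = c(50, NA, 60, 80, 75, 70)
)

# Count distinct years and missing PM2.5 values for each city
coverage <- data_long %>%
  group_by(City) %>%
  summarise(
    n_years = n_distinct(Year),
    n_missing = sum(is.na(PM25)),
    .groups = "drop"
  )

print(coverage)
```

Cities with fewer years than expected or with missing values are worth inspecting, since they may produce incomplete lines in the time-series plot.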