Air Quality Presentation

Author

Yalaguresh G and Bangali Vikas

Introduction

  • Air quality analysis using IQAir dataset
  • 100 cities, yearly values from 2017 to 2025
  • Focus on PM2.5 pollution

Dataset

  • Source: IQAir
  • Variables:
    • City
    • Country
    • PM2.5 values
  • Multi-year dataset

Objective

  • Analyze pollution trends
  • Compare cities
  • Visualize using ggplot2

Load Data

library(readxl)
library(dplyr)

library(tidyr)
library(ggplot2)

# Load dataset
data <- read_excel("C:/Users/Yalaguresh/Downloads/air_quality_100.xlsx")

# Convert to long format
data_long <- data %>%
  pivot_longer(
    cols = `2017`:`2025`,
    names_to = "Year",
    values_to = "PM25"
  )

data_long$Year <- as.numeric(data_long$Year)
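
After a wide-to-long reshape it can help to confirm that every city has one row per year and to see where readings are missing. The sketch below assumes the same City / Country / year-column layout as the spreadsheet, but runs on a hypothetical two-city table (`toy`) with placeholder values, not on the IQAir data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the spreadsheet: placeholder values, not IQAir data
toy <- tibble(
  City    = c("City A", "City B"),
  Country = c("Country X", "Country Y"),
  `2017`  = c(50, NA),
  `2018`  = c(45, 30)
)

# Same reshape as above, applied to the two year columns of the toy table
toy_long <- toy %>%
  pivot_longer(cols = `2017`:`2018`, names_to = "Year", values_to = "PM25") %>%
  mutate(Year = as.integer(Year))

# Expect one row per city-year: 2 cities x 2 years
rows_ok <- nrow(toy_long) == nrow(toy) * 2

# Count missing PM2.5 readings per city
missing_by_city <- toy_long %>%
  group_by(City) %>%
  summarise(n_missing = sum(is.na(PM25)), .groups = "drop")
```

On the real data, the same two checks would use `data` and `data_long`, with nine year columns instead of two.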

PM2.5 Distribution

ggplot(data, aes(x=`2025`)) +
  geom_histogram(binwidth=10, fill="red") +
  labs(title="PM2.5 Distribution (2025)",
       x="PM2.5 (µg/m³)", y="Number of cities")

Top Polluted Cities

top10 <- data %>%
  arrange(desc(`2025`)) %>%
  slice(1:10)

ggplot(top10, aes(x=reorder(City, `2025`), y=`2025`)) +
  geom_col(fill="blue") +   # geom_col() is equivalent to geom_bar(stat="identity")
  coord_flip() +
  labs(title="Top 10 Polluted Cities (2025)",
       x="City", y="PM2.5 (µg/m³)")

Time-Series Trend

selected <- data_long %>%
  filter(City %in% c("Delhi", "Lahore", "Hotan", "Loni"))

ggplot(selected, aes(x=Year, y=PM25, color=City, group=City)) +
  geom_line(linewidth=1.2) +   # linewidth replaces size for lines since ggplot2 3.4.0
  geom_point()
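
The line chart can be backed by a number: the change in PM2.5 between each city's first and last available year. This is a minimal sketch on a hypothetical long-format table (`toy_long`, placeholder values) with the same City / Year / PM25 columns as `data_long`.

```r
library(dplyr)

# Hypothetical long-format table: placeholder values, not IQAir data
toy_long <- tibble(
  City = rep(c("City A", "City B"), each = 3),
  Year = rep(2017:2019, times = 2),
  PM25 = c(60, 55, 50, 20, 25, 30)
)

# Change in PM2.5 from the first to the last year, per city
change <- toy_long %>%
  group_by(City) %>%
  arrange(Year, .by_group = TRUE) %>%
  summarise(
    first_year = first(Year),
    last_year  = last(Year),
    change     = last(PM25) - first(PM25),
    .groups    = "drop"
  )
```

A negative `change` means the city's PM2.5 fell over the period; a positive one means it rose.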

Key Findings

  • Cities in India, Pakistan, and China show the highest PM2.5 levels
  • PM2.5 levels exceed the WHO annual guideline of 5 µg/m³
  • PM2.5 levels remain high across 2017–2025
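
The comparison with WHO limits can be made explicit by computing, for each year, the share of cities above the guideline. The sketch below assumes the WHO 2021 annual PM2.5 guideline of 5 µg/m³ as the threshold and uses a hypothetical table (`toy_long`, placeholder values) in place of `data_long`.

```r
library(dplyr)

# Assumed threshold: WHO 2021 annual PM2.5 guideline, in µg/m³
who_guideline <- 5

# Hypothetical long-format table: placeholder values, not IQAir data
toy_long <- tibble(
  City = rep(c("City A", "City B"), each = 2),
  Year = rep(c(2024, 2025), times = 2),
  PM25 = c(80, 90, 4, 6)
)

# Share of cities above the guideline in each year (missing readings ignored)
exceed <- toy_long %>%
  group_by(Year) %>%
  summarise(
    n_cities    = sum(!is.na(PM25)),
    share_above = mean(PM25 > who_guideline, na.rm = TRUE),
    .groups     = "drop"
  )
```

Swapping `toy_long` for `data_long` would give the yearly exceedance share for the 100 cities.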

Conclusion

  • Air pollution is a serious issue
  • Reducing it requires strict environmental policies
  • Data visualization helps in understanding trends