Air Quality Data Visualization

Author

Flioria Akesse

# Load dataset
data("airquality")

Plot 1 – Histogram of Temperature

hist(airquality$Temp,
     main="Histogram of Daily Temperature",
     xlab="Temperature (°F)",
     col="skyblue")

Caption: Data source – R built-in airquality dataset.

Plot 2 – Scatterplot of Temperature vs Ozone

# Days with a missing Ozone reading (NA) are not drawn
plot(airquality$Temp, airquality$Ozone,
     main="Temperature vs Ozone Levels",
     xlab="Temperature (°F)",
     ylab="Ozone (ppb)",
     col="darkgreen",
     pch=19)

Caption: Data source – R built-in airquality dataset.

Plot 3 – Scatterplot of Wind vs Ozone

# As in Plot 2, days with a missing Ozone reading (NA) are not drawn
plot(airquality$Wind, airquality$Ozone,
     main="Wind Speed vs Ozone",
     xlab="Wind Speed (mph)",
     ylab="Ozone (ppb)",
     col="purple",
     pch=19)

Caption: Data source – R built-in airquality dataset.
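
Because Ozone contains missing values, the two scatterplots above show fewer points than the dataset has rows. The following is a minimal sketch of how that could be checked, assuming base R only. The names n_total, n_temp_ozone, and n_wind_ozone are illustrative and not part of the original analysis.

```r
# Count the complete (x, Ozone) pairs that each scatterplot can actually draw
data("airquality")

n_total      <- nrow(airquality)
n_temp_ozone <- sum(complete.cases(airquality[, c("Temp", "Ozone")]))
n_wind_ozone <- sum(complete.cases(airquality[, c("Wind", "Ozone")]))

cat("Rows in dataset:       ", n_total, "\n")
cat("Temp/Ozone pairs drawn:", n_temp_ozone, "\n")
cat("Wind/Ozone pairs drawn:", n_wind_ozone, "\n")
```

If the counts are lower than the number of rows, the difference is the number of days without an Ozone reading.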

Plot 4 – Boxplot of Temperature by Month

# Month is stored as a number, so the x-axis shows 5 through 9
boxplot(Temp ~ Month,
        data=airquality,
        main="Temperature by Month",
        xlab="Month (5 = May to 9 = September)",
        ylab="Temperature (°F)",
        col="orange")

Caption: Data source – R built-in airquality dataset.

Plot 5 – Histogram of Solar Radiation

# hist() ignores the missing (NA) Solar.R values
hist(airquality$Solar.R,
     main="Distribution of Solar Radiation",
     xlab="Solar Radiation (Langleys)",
     col="gold")

Caption: Data source – R built-in airquality dataset.

Brief Essay

The fifth plot I created is a histogram of Solar.R, which records solar radiation (in Langleys) in the airquality dataset. The plot shows how these values are distributed across the observations that have a reading. Most values fall in the middle-to-upper part of the range, with fewer observations at the very low and very high ends. The mean (about 186) lies below the median (205), which suggests a longer tail toward low values. This gives a general picture of solar radiation during the recorded period.
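
The shape described above can also be read off as numbers rather than judged by eye. The following is a rough sketch, assuming base R's default binning. The names solar, h, and bin_counts are illustrative and were not in the original analysis.

```r
# Compute the histogram bins without drawing, then tabulate the counts
data("airquality")

solar     <- airquality$Solar.R
n_missing <- sum(is.na(solar))      # readings that hist() leaves out

h <- hist(solar, plot = FALSE)      # same default bins as the drawn plot

bin_counts <- data.frame(
  lower = head(h$breaks, -1),       # left edge of each bin
  upper = tail(h$breaks, -1),       # right edge of each bin
  count = h$counts                  # observations falling in the bin
)

print(bin_counts)
cat("Missing Solar.R values:", n_missing, "\n")
```

The tallest bars in the plot correspond to the rows with the largest count.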

This plot is useful because it shows the spread and concentration of the Solar.R variable at a glance. Unlike the scatterplots, which compare two variables, the histogram summarizes the distribution of a single variable. It shows whether the values are clustered, spread out, or skewed toward one end.
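
Spread and skew can be summarized with a few standard statistics alongside the histogram. This is a small sketch under the assumption that missing values are simply removed first. The names solar and spread are illustrative.

```r
# Summarize centre and spread of Solar.R after removing missing values
data("airquality")

solar <- airquality$Solar.R[!is.na(airquality$Solar.R)]

spread <- c(
  mean   = mean(solar),
  median = median(solar),
  sd     = sd(solar),
  IQR    = IQR(solar)
)

print(round(spread, 1))

# A mean below the median hints at a longer tail toward low values
cat("Mean below median:", spread[["mean"]] < spread[["median"]], "\n")
```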

To make this plot, I used the hist() function in R and passed it the Solar.R variable from the airquality dataset. I added a title with the main argument, an x-axis label with the xlab argument, and a fill color with the col argument. This fifth plot let me explore a variable that the earlier plots did not cover and better understand how the solar radiation values are distributed.