Motivation

Air quality has become a prominent topic in recent years. This project analyzes the correlation between two of the most familiar hazardous gases, SO2 and Ozone. Since vehicle exhaust is among the sources of both gases, we want to understand how their concentrations are correlated, so that we can learn whether they come from the same component of vehicle exhaust. We also want to analyze the correlation between SO2 concentration and the number of car accidents, to learn whether hazardous gases could impair driver performance and thereby cause accidents.

Data description

The air quality data were collected at outdoor monitors across the United States, Puerto Rico, and the U.S. Virgin Islands. The data come primarily from the AQS (Air Quality System) database, from which we chose the Ozone and SO2 data. The downloaded data are stored in Air_Quality and Car_Accident.

Libraries used

packs <- c("png","ggplot2","dplyr","readr","lubridate","ggmap",
           "plotly","shiny", "stringr","readxl","grid","gridExtra")
lapply(packs, library, character.only = TRUE)

Load data

Create small subset for debugging

Load data

## [1] 10000     5

Understand the Data

## # A tibble: 8 × 2
##                                        Method_Name `n()`
##                                              <chr> <int>
## 1        Chemiluminescence API Model 265E and T265    37
## 2                               Ecotech Serinus 10    35
## 3                                     ULTRA VIOLET  4768
## 4                        Ultra Violet 2B Model 202     7
## 5                          ULTRA VIOLET ABSORPTION  4896
## 6                           ULTRAVIOLET ABSORPTION    52
## 7                   ULTRAVIOLET RADIATION ABSORBTN    68
## 8 UV absorption photometry/UV 2B model 202 and 205   137
## # A tibble: 8 × 2
##                                                     `Method Name`
##                                                             <chr>
## 1        Instrumental - Chemiluminescence API Model 265E and T265
## 2                               Instrumental - Ecotech Serinus 10
## 3                                     INSTRUMENTAL - ULTRA VIOLET
## 4                        Instrumental - Ultra Violet 2B Model 202
## 5                          INSTRUMENTAL - ULTRA VIOLET ABSORPTION
## 6                           INSTRUMENTAL - ULTRAVIOLET ABSORPTION
## 7                   INSTRUMENTAL - ULTRAVIOLET RADIATION ABSORBTN
## 8 Instrumental - UV absorption photometry/UV 2B model 202 and 205
## # ... with 1 more variables: mean_measure <dbl>

The mean sample measurement differs little across the collection methods.
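
A per-method summary like the one printed above could be produced along the following lines. This is only a minimal sketch, not the report's original code: the toy table, its values, and the column name `Sample_Measurement` are assumptions made for illustration.

```r
library(dplyr)

# Toy stand-in for the Ozone table; column names and values are hypothetical.
ozone <- tibble(
  Method_Name = c("ULTRA VIOLET", "ULTRA VIOLET",
                  "ULTRA VIOLET ABSORPTION", "ULTRA VIOLET ABSORPTION",
                  "Ecotech Serinus 10"),
  Sample_Measurement = c(0.030, 0.034, 0.031, 0.033, 0.032)
)

# Count the readings and average the measurement for each method.
method_summary <- ozone %>%
  group_by(Method_Name) %>%
  summarise(n = n(),
            mean_measure = mean(Sample_Measurement),
            .groups = "drop")

print(method_summary)
```

Comparing `mean_measure` across rows is one way to check whether the method used has a noticeable effect on the readings.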

Data analysis

Time vs. Sample Measurement

The plot shows that Ozone concentrations are generally higher from late morning through the afternoon (hours 10-17).
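
An hourly profile of this kind can be sketched by averaging the measurements per hour of day and plotting the result. The example below is an illustration under stated assumptions: the column names `Time_Local` and `Sample_Measurement`, the "HH:MM" time format, and all values are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical readings: local time as "HH:MM" strings plus a measurement.
ozone <- tibble(
  Time_Local = c("03:00", "03:00", "12:00", "12:00", "15:00", "22:00"),
  Sample_Measurement = c(0.010, 0.014, 0.040, 0.044, 0.050, 0.018)
)

# Extract the hour of day and average the measurement within each hour.
hourly <- ozone %>%
  mutate(hour = as.integer(substr(Time_Local, 1, 2))) %>%
  group_by(hour) %>%
  summarise(mean_measure = mean(Sample_Measurement), .groups = "drop")

# Line plot of the mean measurement against the hour of day.
p <- ggplot(hourly, aes(x = hour, y = mean_measure)) +
  geom_line() +
  geom_point() +
  labs(x = "Hour of day (local)", y = "Mean sample measurement")
```

Calling `print(p)` draws the plot; a peak in the daytime hours would match the pattern described above.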

Geographical Distribution of the hazardous gases

The map shows that Ozone concentrations are usually higher along the east and west coasts of the United States. Places covered with vegetation also show more concentrated and wider coverage of Ozone, especially around lake and coastal areas. The map also shows that high SO2 concentrations are found in the Northeast and Southwest of the United States.

SO2 vs. Ozone

From this plot we can see that the Ozone and SO2 measurements are scattered without any visible pattern, which suggests that the two variables are not correlated.
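
The visual impression from a scatter plot can be backed up with a correlation coefficient. The sketch below uses base R only; the paired values are made up for illustration, and it assumes that the Ozone and SO2 readings have already been matched by site and time.

```r
# Hypothetical paired readings from the same site and hour; values are made up.
paired <- data.frame(
  ozone = c(0.031, 0.045, 0.022, 0.038, 0.050, 0.027),
  so2   = c(1.2, 0.4, 0.9, 1.5, 0.7, 1.1)
)

# Pearson correlation coefficient, and a test of the null hypothesis
# that the true correlation is zero.
r <- cor(paired$ozone, paired$so2)
test <- cor.test(paired$ozone, paired$so2)

cat("r =", round(r, 3), " p-value =", round(test$p.value, 3), "\n")
```

A coefficient near zero with a large p-value would be consistent with the absence of a linear relationship, though it says nothing about causation on its own.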

Join with the car accident data in Atlantic City
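
One plausible way to combine the two sources is to aggregate both to daily values and join on the date. The sketch below is an assumption about the approach rather than the report's actual code; the table names, column names, and values are all hypothetical.

```r
library(dplyr)

# Hypothetical daily SO2 means for one monitoring site.
so2_daily <- tibble(
  date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03")),
  mean_so2 = c(1.1, 0.8, 1.4)
)

# Hypothetical accident records, one row per accident.
accidents <- tibble(
  date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-03")),
  accident_id = c(101, 102, 103)
)

# Number of accidents per day.
accidents_daily <- accidents %>%
  count(date, name = "n_accidents")

# Keep every day that has an SO2 value; days without accidents get a count of 0.
joined <- so2_daily %>%
  left_join(accidents_daily, by = "date") %>%
  mutate(n_accidents = coalesce(n_accidents, 0L))

print(joined)
```

A left join from the air quality table keeps days with no recorded accidents, which matters if the daily accident count is later correlated with SO2.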

Conclusion

In this report, relationships among the Ozone and SO2 measurements, method type, method name, date, time, and car accidents are discussed with plots. The distribution of Ozone is consistent with the population distribution: densely populated areas tend to have high Ozone concentrations. The SO2 concentration, however, depends relatively little on the population distribution.