Air Quality Data Analysis: NO2

Data from 4 Air Quality Monitoring Stations in Glasgow

Hugh Murphy

Station Location

Clean and Transform Data Methods

  • Load the Excel file (RStudio & Power BI)
  • Load the libraries required for analysis (if any)
  • Remove redundant columns
  • Ensure columns are in the correct format
  • Deal with missing data (interpolation)
  • Create new data frames for further analysis
  • Summarise
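The steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the code used for the analysis: the toy data frame stands in for the Excel sheet, and the column names (Anderston, Status) and values are invented for the example. Linear interpolation is done here with approx(); a package function such as zoo::na.approx() would be a common alternative.

```r
# Toy stand-in for the spreadsheet; column names and values are invented.
raw <- data.frame(
  Date = c("2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"),
  Anderston = c(10, NA, 14, 18),
  Status = c("V", "V", "V", "V"),  # hypothetical redundant column
  stringsAsFactors = FALSE
)

# Remove redundant columns.
clean <- raw[, c("Date", "Anderston")]

# Ensure columns are in the correct format.
clean$Date <- as.Date(clean$Date)

# Deal with missing data: linear interpolation over the row index.
# Gaps at the very start or end of the series would stay NA (rule = 1).
idx <- seq_len(nrow(clean))
ok <- !is.na(clean$Anderston)
clean$Anderston <- approx(idx[ok], clean$Anderston[ok], xout = idx)$y

# Create grouping columns for further analysis.
clean$year <- as.integer(format(clean$Date, "%Y"))
clean$quarter <- quarters(clean$Date)
clean$day <- weekdays(clean$Date)  # day names depend on the locale

# Summarise.
summary(clean)
```

Interpolating over the row index assumes one row per day with no missing dates; if dates can be missing, interpolating against the Date column itself would be the safer choice.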

Missing Data and Interpolation

      Date            Glasgow Anderston Glasgow High Street Glasgow Kerbside
 Min.   :2020-01-01   Min.   : 1.00     Min.   : 3.0        Min.   : 5.00   
 1st Qu.:2021-04-01   1st Qu.:10.00     1st Qu.:12.0        1st Qu.:27.00   
 Median :2022-07-02   Median :19.00     Median :17.0        Median :39.00   
 Mean   :2022-07-02   Mean   :20.72     Mean   :20.3        Mean   :39.13   
 3rd Qu.:2023-10-01   3rd Qu.:28.00     3rd Qu.:26.0        3rd Qu.:50.00   
 Max.   :2024-12-31   Max.   :77.00     Max.   :71.0        Max.   :90.00   
                                                                            
 Glasgow Townhead      year        quarter              month    
 Min.   : 3.00    Min.   :2020   Length:1827        Jan    :155  
 1st Qu.:10.00    1st Qu.:2021   Class :character   Mar    :155  
 Median :13.00    Median :2022   Mode  :character   May    :155  
 Mean   :16.44    Mean   :2022                      Jul    :155  
 3rd Qu.:20.00    3rd Qu.:2023                      Aug    :155  
 Max.   :63.00    Max.   :2024                      Oct    :155  
                                                    (Other):897  
     day           
 Length:1827       
 Class :character  
 Mode  :character  
                   
                   
                   
                   

Examining the Data - Daily

Examining the Data - Quarterly

Examining the Data - Monthly

Examining the Data - Weekday

Measures Taken

Boxplot Pre and Post introduction of the Low Emission Zone (LEZ)

Statistical Significance of the introduction of the Low Emission Zone (LEZ)

    F test to compare two variances

data:  NO2_postLez$Value and NO2_preLez$Value
F = 0.88711, num df = 2319, denom df = 4987, p-value = 0.0008376
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.8277735 0.9515602
sample estimates:
ratio of variances 
         0.8871126 
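When reading this output, note that the F test compares the spread (variance) of the two samples, not their average level. The degrees of freedom are n - 1 for each sample, so the output above corresponds to 2,320 post-LEZ and 4,988 pre-LEZ observations. The sketch below shows how such output is produced with var.test() and how the F statistic relates to the sample variances. The data are simulated and the vectors pre and post are hypothetical stand-ins for NO2_preLez$Value and NO2_postLez$Value; the closing t.test() call is only an assumption about what a follow-up comparison of means might look like.

```r
set.seed(1)  # reproducible toy data; not the Glasgow measurements
pre  <- rnorm(200, mean = 30, sd = 12)  # hypothetical pre-LEZ values
post <- rnorm(100, mean = 25, sd = 11)  # hypothetical post-LEZ values

# F test to compare two variances: ratio var(post) / var(pre).
res <- var.test(post, pre)

# The reported F statistic is the ratio of the two sample variances.
f_manual <- var(post) / var(pre)
all.equal(unname(res$statistic), f_manual)

res$parameter  # degrees of freedom: 99 (numerator) and 199 (denominator)
res$p.value    # small values suggest the variances differ
res$conf.int   # confidence interval for the ratio of variances

# A change in level, rather than spread, needs a different test,
# for example Welch's t-test (the default for t.test).
t.test(post, pre)
```

A ratio below 1 with a confidence interval that excludes 1, as in the output above, indicates lower variability after the LEZ was introduced; whether mean concentrations fell is a separate question.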

Power BI Dashboard Example

Power BI Dashboard Example 2

Thanks for listening

Any questions?