Loading packages
library(readr)
library(dplyr)
library(tidyr)
library(openair) # http://davidcarslaw.github.io/openair/
library(purrr)
library(lubridate)
library(ggplot2)
library(stringr)
library(knitr)
library(xts)
library(zoo)
library(gridExtra)
library(astsa)
library(rvest)
library(fpp2)
First of all, I have to check that the basic data for the analysis is available. I need air pollution and weather data for the Gijón area. The town hall of Gijón has an open data web portal at https://transparencia.gijon.es/, where air pollution data can be downloaded in CSV format for the years 2000 to 2017.
I downloaded 18 CSV files with air pollution and weather data for Gijón covering the years 2000 to 2017 and saved them in the “data” folder. I also downloaded two more files from the same portal: a CSV file with the description of the variables and another CSV file with information about the measurement stations.
We take a look at the information in the stations_info.csv file. It contains the station addresses, longitudes, latitudes, IDs and names. As we will see, all of this information is also included in the CSV files with the pollution and weather data, so we will not use this file again.
stations <- read_delim('data/stations_info.csv',
                       delim = ';',
                       escape_double = FALSE,
                       trim_ws = TRUE,
                       locale = locale(encoding = "ISO-8859-1"),
                       col_types = cols(.default = "c"))
stations
## # A tibble: 6 x 6
## `"ID;""Título""` `""Dirección""` `""Población""` `""Provincia""`
## <chr> <chr> <chr> <chr>
## 1 "\"1;\"\"Constituci~ "\"\"Avda. Constit~ "\"\"Gijón\"\"" "\"\"Asturias\~
## 2 "\"2;\"\"Argentina\~ "\"\"Avda. Argenti~ "\"\"Gijón\"\"" "\"\"Asturias\~
## 3 "\"3;\"\"H. Felguer~ "\"\"H. Felgueroso~ "\"\"Gijón\"\"" "\"\"Asturias\~
## 4 "\"4;\"\"Castilla\"~ "\"\"Plaza Castill~ "\"\"Gijón\"\"" "\"\"Asturias\~
## 5 "\"10;\"\"Montevil\~ "\"\"Montevil\"\"" "\"\"Gijón\"\"" "\"\"Asturias\~
## 6 "\"11;\"\"Santa Bár~ "\"\"Santa Bárbara~ "\"\"Gijón\"\"" "\"\"Asturias\~
## # ... with 2 more variables: `""latitud""` <chr>, `""longitud""",,` <chr>
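The stray quotes and trailing commas in the column names above suggest that every line of the file is wrapped in an extra pair of quotes. The file is not used again, so this is only a minimal sketch of how such lines could be cleaned before parsing. The sample lines, the reduced set of columns and the second row's coordinates are made up for illustration; the real file layout is an assumption.

```r
# Toy lines imitating the suspected layout of stations_info.csv.
# Column set and the second row's coordinates are made up for illustration.
raw_lines <- c(
  '"ID;""Titulo"";""latitud"";""longitud""",,',
  '"1;""Constitucion"";""43.529806"";""-5.673428""",,',
  '"2;""Argentina"";""43.5"";""-5.7""",,'
)

# Drop every double quote, then the trailing commas at the end of each line.
clean_lines <- gsub('"', "", raw_lines, fixed = TRUE)
clean_lines <- sub(",+$", "", clean_lines)

# Parse the cleaned text as a semicolon-delimited table, keeping all columns as character.
stations_clean <- read.delim(text = clean_lines, sep = ";",
                             header = TRUE, colClasses = "character")
stations_clean
```

With a real file, `raw_lines` would come from `readLines()` with the appropriate encoding instead of a hand-written vector.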
The following image shows the location of each station: http://movil.asturias.es/medioambiente/articulos/ficheros/Informe%20de%20calidad%20del%20aire%20en%20Asturias%202016.pdf
Image source: “Informe de calidad del aire del Principado de Asturias (2016)”.
The air_data_descriptors.csv file describes the parameters monitored by the stations: their names, descriptions and units.
variables <- read_csv('data/air_data_descriptors.csv', locale = locale(encoding = "ISO-8859-1"))
variables
## # A tibble: 17 x 4
## Parametro `Descripción Parámetro` TAG Unidad
## <chr> <chr> <chr> <chr>
## 1 BEN Benceno BEN µg/m³
## 2 CO Concentracion de CO CO mg/m³
## 3 DD Direccion del viento DD Grados
## 4 HR Humedad relativa HR %hr
## 5 LL Precipitacion LL l/m²
## 6 MXIL MXileno MXIL µg/m³
## 7 NO Concentracion de NO NO µg/m³
## 8 NO2 Concentracion de NO2 NO2 µg/m³
## 9 O3 Concentracion de Ozono O3 µg/m³
## 10 PM10 Particulas en suspension <10 µg/m³ PM10 µg/m³
## 11 PM25 Particulas en Suspension PM 2,5 PM25 µg/m³
## 12 PRB Presion Atmosferica PRB mb
## 13 RS Radiacion Solar RS W/m²
## 14 SO2 Concentracion de SO2 SO2 µg/m³
## 15 TMP Temperatura Seca TMP ºC
## 16 TOL Tolueno TOL µg/m³
## 17 VV Velocidad del viento VV m/s
To import the data from the 18 CSV files, we first list all the files in the object data_files.
# `pattern` is a regular expression, not a shell glob, so we anchor it on the file name prefix
data_files <- list.files(path = "data", pattern = "^air_data_20")
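The `pattern` argument of `list.files()` is a regular expression rather than a shell glob, which matters for patterns that end in `*`. The sketch below compares a glob-looking pattern with an anchored one on a few hypothetical file names (the names are assumptions, not the real contents of the data folder).

```r
# Hypothetical file names, for illustration only.
files <- c("air_data_2000.csv", "air_data_2017.csv",
           "air_data_descriptors.csv", "air_data_2.txt")

# As a regex, "20*" means "2" followed by zero or more "0".
loose <- grepl("air_data_20*", files)

# Anchored prefix: the name must start with "air_data_20".
strict <- grepl("^air_data_20", files)

# glob2rx() (utils package) translates a shell-style glob into a regex.
from_glob <- grepl(glob2rx("air_data_20*"), files)

data.frame(files, loose, strict, from_glob)
```

The loose pattern also accepts `air_data_2.txt`, while the anchored pattern and the translated glob only accept names that start with `air_data_20`.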
Then we map read_csv over this list to import every file, and finally merge the results into a single data frame (air_data_0) with reduce(rbind).
air_data_0 <- data_files %>%
  map(function(x) {
    read_csv(paste0("./data/", x),
             locale = locale(encoding = "ISO-8859-1"),
             col_types = cols(.default = "c"))
  }) %>%
  reduce(rbind)
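Combining with `rbind` requires every file to have exactly the same columns. As a possible alternative, `dplyr::bind_rows()` stacks a whole list in one call and fills columns that are missing in some tables with NA. The sketch below uses two made-up yearly tables, not the real files.

```r
library(dplyr)

# Two toy yearly tables with made-up values; the second has an extra column.
year_2000 <- data.frame(station = c("1", "2"), SO2 = c("23", "29"),
                        stringsAsFactors = FALSE)
year_2010 <- data.frame(station = "1", SO2 = "6", PM25 = "9",
                        stringsAsFactors = FALSE)

# Stack the list in one call; `.id` records which list element each row came from.
combined <- bind_rows(list(y2000 = year_2000, y2010 = year_2010), .id = "source")
combined
```

The `source` column is optional, but it makes it easy to trace a suspicious row back to the file it came from.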
We take a look at the data set.
glimpse(air_data_0)
## Observations: 722,774
## Variables: 22
## $ Estación <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1...
## $ Título <chr> "Estación Avenida Constitución", "Estación...
## $ latitud <chr> "43.529806", "43.529806", "43.529806", "43...
## $ longitud <chr> "-5.673428", "-5.673428", "-5.673428", "-5...
## $ `Fecha Solar (UTC)` <chr> "2000-01-01T00:00:00", "2000-01-01T01:00:0...
## $ SO2 <chr> "23", "29", "40", "50", "39", "39", "40", ...
## $ NO <chr> "89", "73", "53", "46", "35", "26", "27", ...
## $ NO2 <chr> "65", "60", "57", "53", "50", "49", "51", ...
## $ CO <chr> "1.97", "1.61", "1.13", "1.06", "0.95", "0...
## $ PM10 <chr> "53", "63", "56", "58", "50", "50", "57", ...
## $ O3 <chr> "9", "8", "7", "5", "6", "7", "7", "4", "5...
## $ dd <chr> "245", "222", "228", "239", "244", "218", ...
## $ vv <chr> "0.34", "1.06", "0.71", "0.84", "0.89", "0...
## $ TMP <chr> "5.7", "5.4", "5.3", "5.1", "4.6", "4.6", ...
## $ HR <chr> "76", "73", "72", "71", "72", "69", "68", ...
## $ PRB <chr> "1026", "1025", "1025", "1025", "1024", "1...
## $ RS <chr> "33", "33", "33", "33", "33", "33", "33", ...
## $ LL <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0...
## $ BEN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ TOL <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ MXIL <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ PM25 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
We rename some variables.
# Rename variables
air_data_1 <- air_data_0 %>%
  rename(station = `Estación`,
         station_name = `Título`,
         date_time_utc = `Fecha Solar (UTC)`,
         latitude = latitud,
         longitude = longitud)
We imported all the columns as characters to avoid problems with automatic type guessing, so we now have to convert some variables to their proper types.
We convert date_time_utc from character to date-time.
air_data_1$date_time_utc <- ymd_hms(air_data_1$date_time_utc)
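As a small illustration of this conversion, the sketch below parses two timestamps in the same format as the data, compares the result with a base R equivalent, and shows that a string that cannot be parsed becomes NA. The invalid string is a made-up example.

```r
library(lubridate)

stamps <- c("2000-01-01T00:00:00", "2000-01-01T01:00:00")

# ymd_hms() returns POSIXct values in the UTC time zone by default.
parsed <- ymd_hms(stamps)

# Base R equivalent with an explicit format string.
base_parsed <- as.POSIXct(stamps, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# A string that cannot be parsed becomes NA (quiet = TRUE silences the warning).
bad <- ymd_hms("not a timestamp", quiet = TRUE)

parsed
is.na(bad)
```

Counting the NAs before and after the conversion is a cheap way to check that no timestamp was lost in the process.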
We convert station and station_name from character to factor.
air_data_1$station <- as.factor(air_data_1$station)
air_data_1$station_name <- as.factor(air_data_1$station_name)
We create a vector with the names of all the variables we want to be numeric.
num <- colnames(air_data_1)[c(3, 4, 6:22)]
We convert this set of variables to numeric.
air_data_1 <- air_data_1 %>% mutate_at(num, as.numeric)
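Selecting columns by position breaks silently if the column order ever changes. A hedged alternative, assuming dplyr 1.0 or later, is to select the numeric columns by name with `across()`. The toy data frame below is made up and only mimics the all-character import.

```r
library(dplyr)

# Toy all-character table with made-up values.
toy <- data.frame(
  station = c("1", "2"),
  date_time_utc = c("2000-01-01T00:00:00", "2000-01-01T01:00:00"),
  SO2 = c("23", "29"),
  CO = c("1.97", NA),
  stringsAsFactors = FALSE
)

# Everything except the identifier and timestamp columns should become numeric.
keep_as_is <- c("station", "date_time_utc")
num_cols <- setdiff(names(toy), keep_as_is)

toy_num <- toy %>% mutate(across(all_of(num_cols), as.numeric))
str(toy_num)
```

Listing the columns to keep as they are, rather than the ones to convert, means new measurement columns are converted without further changes.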
We create a dictionary with an alias for each station in order to add a new variable with more convenient station names.
alias_dict <- data.frame(
  station = c("1", "2", "3", "4", "10", "11"),
  station_alias = c("Constitución", "Argentina", "H. Felgueroso", "Castilla", "Montevil", "Santa Bárbara"),
  stringsAsFactors = TRUE  # the default before R 4.0; keeps both columns as factors
)
We join the alias dictionary to the air_data_1 data frame to add the new variable to the data set.
air_data_1 <- air_data_1 %>% left_join(alias_dict, by = 'station')
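When the join key is a factor on one side and a character vector on the other, the result depends on the R and dplyr versions in use. One way to sidestep this, sketched below on made-up readings, is to make the key a character vector on both sides and then check that every row received an alias.

```r
library(dplyr)

# Toy readings with a factor key (made-up values).
readings <- data.frame(station = factor(c("2", "10", "4", "2")),
                       SO2 = c(23, 4, 6, 29))

# Alias dictionary with a character key.
aliases <- data.frame(station = c("2", "4", "10"),
                      station_alias = c("Argentina", "Castilla", "Montevil"),
                      stringsAsFactors = FALSE)

# Align the key types, then join.
joined <- readings %>%
  mutate(station = as.character(station)) %>%
  left_join(aliases, by = "station")

# Rows without an alias would point to a station missing from the dictionary.
unmatched <- joined %>% filter(is.na(station_alias))
nrow(unmatched)
```

The alias can be converted back to a factor afterwards if per-station counts from `summary()` are wanted.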
We call the summary function to inspect the main summary statistics of the data.
summary(air_data_1)
## station station_name latitude
## 1 :157727 Estación Avenida Argentina :157798 Min. :43.52
## 10: 74630 Estación Avenida Castilla :157409 1st Qu.:43.53
## 11: 17544 Estación Avenida Constitución :157727 Median :43.54
## 2 :157798 Estación Avenida Hermanos Felgueroso:157666 Mean :43.53
## 3 :157666 Estación de Montevil : 74630 3rd Qu.:43.54
## 4 :157409 Estación Santa Bárbara : 17544 Max. :43.54
##
## longitude date_time_utc SO2
## Min. :-5.699 Min. :2000-01-01 00:00:00 Min. :-9999.00
## 1st Qu.:-5.673 1st Qu.:2005-02-25 05:00:00 1st Qu.: 4.00
## Median :-5.672 Median :2010-02-23 11:00:00 Median : 6.00
## Mean :-5.670 Mean :2009-09-06 07:33:13 Mean : 9.77
## 3rd Qu.:-5.658 3rd Qu.:2014-04-09 06:00:00 3rd Qu.: 11.00
## Max. :-5.646 Max. :2018-01-01 00:00:00 Max. : 2662.00
## NA's :33742
## NO NO2 CO PM10
## Min. :-9999.00 Min. :-9999.00 Min. : 0.00 Min. :-9999.00
## 1st Qu.: 4.40 1st Qu.: 16.00 1st Qu.: 0.22 1st Qu.: 19.00
## Median : 10.00 Median : 28.00 Median : 0.36 Median : 30.00
## Mean : 21.37 Mean : 32.04 Mean : 0.49 Mean : 35.88
## 3rd Qu.: 23.00 3rd Qu.: 45.00 3rd Qu.: 0.59 3rd Qu.: 46.00
## Max. : 1248.00 Max. : 1003.20 Max. :58.20 Max. : 1000.00
## NA's :16989 NA's :16446 NA's :90390 NA's :88598
## O3 dd vv TMP
## Min. :-9999.00 Min. : 0.0 Min. : 0.0 Min. :-40.0
## 1st Qu.: 17.00 1st Qu.: 96.0 1st Qu.: 0.2 1st Qu.: 10.9
## Median : 37.00 Median :159.0 Median : 0.7 Median : 14.7
## Mean : 38.97 Mean :161.8 Mean : 1.0 Mean : 14.6
## 3rd Qu.: 57.00 3rd Qu.:228.0 3rd Qu.: 1.5 3rd Qu.: 18.4
## Max. : 998.00 Max. :360.0 Max. :29.8 Max. : 47.4
## NA's :31417 NA's :494134 NA's :493893 NA's :494151
## HR PRB RS LL
## Min. : 0.0 Min. : 800 Min. : -1.0 Min. : 0.0
## 1st Qu.: 69.0 1st Qu.:1007 1st Qu.: 17.0 1st Qu.: 0.0
## Median : 80.0 Median :1013 Median : 46.0 Median : 0.0
## Mean : 78.3 Mean :1012 Mean : 125.2 Mean : 0.1
## 3rd Qu.: 89.0 3rd Qu.:1018 3rd Qu.: 149.0 3rd Qu.: 0.0
## Max. :123.0 Max. :1282 Max. :1470.0 Max. :24.6
## NA's :494176 NA's :494019 NA's :494273 NA's :494124
## BEN TOL MXIL PM25
## Min. : 0.0 Min. : -0.2 Min. : -0.3 Min. : 0.0
## 1st Qu.: 0.1 1st Qu.: 0.4 1st Qu.: 0.2 1st Qu.: 5.0
## Median : 0.3 Median : 1.0 Median : 0.3 Median : 9.0
## Mean : 0.5 Mean : 2.5 Mean : 1.3 Mean : 11.3
## 3rd Qu.: 0.5 3rd Qu.: 2.5 3rd Qu.: 0.9 3rd Qu.: 15.0
## Max. :22.5 Max. :196.0 Max. :220.0 Max. :947.0
## NA's :629358 NA's :629380 NA's :635123 NA's :554185
## station_alias
## Argentina :157798
## Castilla :157409
## Constitución :157727
## H. Felgueroso:157666
## Montevil : 74630
## Santa Bárbara: 17544
##
Several variables have a minimum value of -9999, which is not a plausible measurement. We inspect the rows where SO2 takes this value.
kable(air_data_1 %>% filter(SO2 == -9999))
| station | station_name | latitude | longitude | date_time_utc | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 | station_alias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 00:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 01:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 02:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 03:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 04:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 05:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 06:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 07:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 08:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 09:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 10:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 11:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 12:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 13:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 14:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 15:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 16:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 17:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 18:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 19:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 20:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 21:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 22:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
| 3 | Estación Avenida Hermanos Felgueroso | 43.53506 | -5.658123 | 2000-01-27 23:00:00 | -9999 | -9999 | -9999 | 0 | -9999 | -9999 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | H. Felgueroso |
They are all from the same day (2000-01-27) and from the same station (‘H. Felgueroso’). All the pollutant readings from that day, except the ‘CO’ indicator, are equal to -9999.
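Rather than reading the table row by row, the same conclusion can be checked by counting the -9999 rows per station and day. The following is a minimal sketch on a made-up four-row data frame, not the real data.

```r
library(dplyr)

# Toy data with made-up values: three sentinel rows on one day, one normal row.
toy <- data.frame(
  station_alias = c("H. Felgueroso", "H. Felgueroso", "H. Felgueroso", "Argentina"),
  date_time_utc = as.POSIXct(c("2000-01-27 00:00:00", "2000-01-27 01:00:00",
                               "2000-01-27 23:00:00", "2000-01-28 00:00:00"),
                             tz = "UTC"),
  SO2 = c(-9999, -9999, -9999, 12),
  stringsAsFactors = FALSE
)

# One output row per station and day that contains the sentinel value.
sentinel_days <- toy %>%
  filter(SO2 == -9999) %>%
  count(station_alias, day = as.Date(date_time_utc))
sentinel_days
```

A single output row confirms that the sentinel values are confined to one station and one day.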
We replace these -9999 values with NAs.
air_data_2 <- air_data_1 %>%
  mutate(SO2 = replace(SO2, SO2 == -9999, NA),
         NO = replace(NO, NO == -9999, NA),
         NO2 = replace(NO2, NO2 == -9999, NA),
         PM10 = replace(PM10, PM10 == -9999, NA),
         O3 = replace(O3, O3 == -9999, NA))
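The same replacement can be written once for a whole set of columns. The sketch below assumes dplyr 1.0 or later and uses `across()` with `na_if()` on a made-up data frame; the list of pollutant columns is chosen by hand.

```r
library(dplyr)

# Toy data with made-up values; -9999 marks a missing reading.
toy <- data.frame(
  SO2 = c(-9999, 4, NA),
  NO = c(-9999, 10, 23),
  CO = c(0, 0.22, 0.36),
  longitude = c(-5.658, -5.658, -5.673)
)

# Columns in which -9999 should be treated as missing.
pollutants <- c("SO2", "NO", "CO")

cleaned <- toy %>%
  mutate(across(all_of(pollutants), ~ na_if(.x, -9999)))
cleaned
```

Naming the columns explicitly keeps legitimately negative variables, such as longitude, out of the replacement. `read_csv()` also has an `na` argument listing the strings to treat as missing, which could handle the sentinel at import time.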
We check again the output of summary.
summary(air_data_2)
## station station_name latitude
## 1 :157727 Estación Avenida Argentina :157798 Min. :43.52
## 10: 74630 Estación Avenida Castilla :157409 1st Qu.:43.53
## 11: 17544 Estación Avenida Constitución :157727 Median :43.54
## 2 :157798 Estación Avenida Hermanos Felgueroso:157666 Mean :43.53
## 3 :157666 Estación de Montevil : 74630 3rd Qu.:43.54
## 4 :157409 Estación Santa Bárbara : 17544 Max. :43.54
##
## longitude date_time_utc SO2
## Min. :-5.699 Min. :2000-01-01 00:00:00 Min. : -2.00
## 1st Qu.:-5.673 1st Qu.:2005-02-25 05:00:00 1st Qu.: 4.00
## Median :-5.672 Median :2010-02-23 11:00:00 Median : 6.00
## Mean :-5.670 Mean :2009-09-06 07:33:13 Mean : 10.12
## 3rd Qu.:-5.658 3rd Qu.:2014-04-09 06:00:00 3rd Qu.: 11.00
## Max. :-5.646 Max. :2018-01-01 00:00:00 Max. :2662.00
## NA's :33766
## NO NO2 CO PM10
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 4.40 1st Qu.: 16.00 1st Qu.: 0.22 1st Qu.: 19.00
## Median : 10.00 Median : 28.00 Median : 0.36 Median : 30.00
## Mean : 21.71 Mean : 32.38 Mean : 0.49 Mean : 36.26
## 3rd Qu.: 23.00 3rd Qu.: 45.00 3rd Qu.: 0.59 3rd Qu.: 46.00
## Max. :1248.00 Max. :1003.20 Max. :58.20 Max. :1000.00
## NA's :17013 NA's :16470 NA's :90390 NA's :88622
## O3 dd vv TMP
## Min. : 0.00 Min. : 0.0 Min. : 0.0 Min. :-40.0
## 1st Qu.: 17.00 1st Qu.: 96.0 1st Qu.: 0.2 1st Qu.: 10.9
## Median : 37.00 Median :159.0 Median : 0.7 Median : 14.7
## Mean : 39.32 Mean :161.8 Mean : 1.0 Mean : 14.6
## 3rd Qu.: 57.00 3rd Qu.:228.0 3rd Qu.: 1.5 3rd Qu.: 18.4
## Max. :998.00 Max. :360.0 Max. :29.8 Max. : 47.4
## NA's :31441 NA's :494134 NA's :493893 NA's :494151
## HR PRB RS LL
## Min. : 0.0 Min. : 800 Min. : -1.0 Min. : 0.0
## 1st Qu.: 69.0 1st Qu.:1007 1st Qu.: 17.0 1st Qu.: 0.0
## Median : 80.0 Median :1013 Median : 46.0 Median : 0.0
## Mean : 78.3 Mean :1012 Mean : 125.2 Mean : 0.1
## 3rd Qu.: 89.0 3rd Qu.:1018 3rd Qu.: 149.0 3rd Qu.: 0.0
## Max. :123.0 Max. :1282 Max. :1470.0 Max. :24.6
## NA's :494176 NA's :494019 NA's :494273 NA's :494124
## BEN TOL MXIL PM25
## Min. : 0.0 Min. : -0.2 Min. : -0.3 Min. : 0.0
## 1st Qu.: 0.1 1st Qu.: 0.4 1st Qu.: 0.2 1st Qu.: 5.0
## Median : 0.3 Median : 1.0 Median : 0.3 Median : 9.0
## Mean : 0.5 Mean : 2.5 Mean : 1.3 Mean : 11.3
## 3rd Qu.: 0.5 3rd Qu.: 2.5 3rd Qu.: 0.9 3rd Qu.: 15.0
## Max. :22.5 Max. :196.0 Max. :220.0 Max. :947.0
## NA's :629358 NA's :629380 NA's :635123 NA's :554185
## station_alias
## Argentina :157798
## Castilla :157409
## Constitución :157727
## H. Felgueroso:157666
## Montevil : 74630
## Santa Bárbara: 17544
##
Some variables (SO2, RS, TOL and MXIL) still have negative minimum values, which does not make much sense for concentrations or solar radiation. We take a look at the data in order to quantify the problem.
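A compact way to quantify the problem is to count the negative values per variable and station in a single table. The sketch below does this with `pivot_longer()` and `count()` on a made-up data frame; the values are not taken from the real data.

```r
library(dplyr)
library(tidyr)

# Toy data with made-up values.
toy <- data.frame(
  station = c("10", "10", "1", "1"),
  SO2 = c(-1, 4, 2, NA),
  RS = c(3, 3, -1, 17),
  TOL = c(NA, NA, -0.1, -0.2),
  stringsAsFactors = FALSE
)

# Long format: one row per station, variable and value; then count the negatives.
neg_counts <- toy %>%
  pivot_longer(c(SO2, RS, TOL), names_to = "variable", values_to = "value") %>%
  filter(value < 0) %>%
  count(station, variable, name = "n_negative")
neg_counts
```

Missing values are dropped by the filter, so only true negative readings are counted.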
30 SO2 observations between 2015-12-25 and 2015-12-28 from the Montevil station:
kable(neg_SO2 <- air_data_2 %>% filter(SO2 < 0))
| station | station_name | latitude | longitude | date_time_utc | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 | station_alias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 04:00:00 | -1 | 7 | 10 | NA | NA | 1 | 5 | 0.70 | 11.6 | 73 | 1019 | 3 | 0 | NA | NA | NA | 8 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 05:00:00 | -1 | 6 | 9 | NA | NA | 1 | 184 | 0.70 | 11.4 | 73 | 1019 | 3 | 0 | NA | NA | NA | 8 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 06:00:00 | -1 | 6 | 7 | NA | NA | 1 | 180 | 0.63 | 11.4 | 73 | 1019 | 3 | 0 | NA | NA | NA | 6 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 07:00:00 | -1 | 6 | 7 | NA | NA | 1 | 177 | 0.30 | 11.0 | 73 | 1019 | 3 | 0 | NA | NA | NA | 7 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 08:00:00 | -1 | 8 | 3 | NA | NA | 1 | 194 | 0.75 | 11.1 | 73 | 1020 | 5 | 0 | NA | NA | NA | 10 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 09:00:00 | -1 | 8 | 9 | NA | NA | 1 | 237 | 0.43 | 12.8 | 68 | 1020 | 81 | 0 | NA | NA | NA | 8 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 17:00:00 | -1 | 8 | 20 | NA | NA | 1 | 240 | 0.37 | 19.4 | 58 | 1020 | 24 | 0 | NA | NA | NA | 4 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-25 22:00:00 | -1 | 8 | 22 | NA | NA | 1 | 352 | 1.00 | 13.6 | 61 | 1021 | 3 | 0 | NA | NA | NA | 17 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-26 01:00:00 | -1 | 6 | 6 | NA | NA | 1 | 355 | 1.02 | 12.1 | 64 | 1021 | 3 | 0 | NA | NA | NA | 11 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-26 18:00:00 | -1 | 7 | 22 | NA | NA | 1 | 166 | 1.00 | 18.5 | 35 | 1018 | 4 | 0 | NA | NA | NA | 9 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-26 19:00:00 | -1 | 7 | 16 | NA | NA | 1 | 174 | 0.90 | 17.8 | 36 | 1018 | 3 | 0 | NA | NA | NA | 5 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-26 23:00:00 | -1 | 6 | 7 | NA | NA | 1 | 359 | 1.65 | 14.7 | 42 | 1019 | 3 | 0 | NA | NA | NA | 21 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 00:00:00 | -1 | 7 | 11 | NA | NA | 1 | 358 | 1.40 | 14.2 | 42 | 1020 | 3 | 0 | NA | NA | NA | 16 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 02:00:00 | -1 | 6 | 4 | NA | NA | 1 | 353 | 1.43 | 13.2 | 46 | 1019 | 3 | 0 | NA | NA | NA | 10 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 06:00:00 | -1 | 7 | 11 | NA | NA | 1 | 359 | 0.50 | 10.4 | 51 | 1018 | 3 | 0 | NA | NA | NA | 12 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 07:00:00 | -1 | 7 | 15 | NA | NA | 1 | 186 | 0.35 | 9.9 | 54 | 1018 | 3 | 0 | NA | NA | NA | 11 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 08:00:00 | -1 | 10 | 15 | NA | NA | 1 | 182 | 0.55 | 10.1 | 53 | 1018 | 4 | 0 | NA | NA | NA | 11 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 09:00:00 | -1 | 17 | 25 | NA | NA | 1 | 178 | 0.35 | 9.4 | 56 | 1018 | 33 | 0 | NA | NA | NA | 8 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 10:00:00 | -1 | 23 | 28 | NA | NA | 1 | 185 | 0.30 | 12.8 | 49 | 1018 | 191 | 0 | NA | NA | NA | 5 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 12:00:00 | -1 | 24 | 25 | NA | NA | 1 | 356 | 0.40 | 16.5 | 49 | 1017 | 229 | 0 | NA | NA | NA | 10 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 20:00:00 | -1 | 8 | 21 | NA | NA | 1 | 181 | 1.75 | 18.4 | 26 | 1011 | 3 | 0 | NA | NA | NA | 6 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 22:00:00 | -1 | 6 | 13 | NA | NA | 1 | 175 | 2.35 | 18.9 | 25 | 1011 | 3 | 0 | NA | NA | NA | 28 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-27 23:00:00 | -2 | 6 | 2 | NA | NA | 1 | 177 | 3.85 | 20.0 | 25 | 1010 | 3 | 0 | NA | NA | NA | 13 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 00:00:00 | -2 | 6 | 2 | NA | NA | 1 | 178 | 4.15 | 19.7 | 26 | 1010 | 3 | 0 | NA | NA | NA | 8 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 01:00:00 | -2 | 6 | 2 | NA | NA | 1 | 177 | 2.80 | 19.0 | 29 | 1009 | 3 | 0 | NA | NA | NA | 16 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 02:00:00 | -2 | 6 | 2 | NA | NA | 1 | 175 | 2.27 | 18.5 | 32 | 1009 | 3 | 0 | NA | NA | NA | 7 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 03:00:00 | -1 | 6 | 2 | NA | NA | 1 | 158 | 1.80 | 18.2 | 33 | 1009 | 3 | 0 | NA | NA | NA | 9 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 04:00:00 | -2 | 6 | 2 | NA | NA | 1 | 156 | 2.07 | 17.4 | 36 | 1009 | 3 | 0 | NA | NA | NA | 11 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 05:00:00 | -1 | 6 | 2 | NA | NA | 1 | 194 | 2.20 | 19.0 | 35 | 1008 | 3 | 0 | NA | NA | NA | 8 | Montevil |
| 10 | Estación de Montevil | 43.51732 | -5.672499 | 2015-12-28 06:00:00 | -1 | 6 | 2 | NA | NA | 1 | 197 | 3.13 | 19.2 | 37 | 1008 | 3 | 0 | NA | NA | NA | 18 | Montevil |
2 RS observations from the Constitución station:
kable(neg_RS <- air_data_2 %>% filter(RS < 0))
| station | station_name | latitude | longitude | date_time_utc | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 | station_alias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2005-07-09 01:00:00 | 2 | 15 | 21 | 0.10 | 18 | 49 | 25 | 1.09 | 17.7 | 88 | 1012 | -1 | 0 | NA | NA | NA | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2005-07-12 03:00:00 | 8 | 6 | 35 | 0.29 | 27 | 49 | 204 | 0.10 | 17.2 | 94 | 1014 | -1 | 0 | NA | NA | NA | NA | Constitución |
27 TOL observations between 2008-12-11 and 2008-12-15 from the Constitución station:
kable(neg_TOL <- air_data_2 %>% filter(TOL < 0))
| station | station_name | latitude | longitude | date_time_utc | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 | station_alias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 03:00:00 | 3 | 2 | 3 | 0.10 | 9 | 39 | 203 | 0.96 | 5.5 | 80 | 1011.00 | 35 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 04:00:00 | 3 | 2 | 3 | 0.10 | 7 | 41 | 203 | 0.65 | 5.9 | 79 | 1010.15 | 35 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 05:00:00 | 3 | 2 | 8 | 0.10 | 5 | 38 | 225 | 1.79 | 6.0 | 77 | 1009.54 | 35 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 06:00:00 | 3 | 4 | 23 | 0.76 | 8 | 28 | 225 | 0.49 | 5.9 | 81 | 1009.52 | 35 | 0.4 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 20:00:00 | 2 | 13 | 45 | 0.82 | 9 | 24 | 158 | 0.44 | 4.8 | 77 | 987.08 | 34 | 0.4 | 0.0 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 23:00:00 | 3 | 2 | 11 | 0.91 | 10 | 44 | 247 | 1.59 | 3.5 | 82 | 987.34 | 34 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 00:00:00 | 2 | 2 | 8 | 0.79 | 5 | 45 | 180 | 0.87 | 3.8 | 77 | 986.57 | 34 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 01:00:00 | 3 | 2 | 10 | 0.84 | 5 | 44 | 226 | 1.58 | 3.4 | 78 | 987.19 | 34 | 2.6 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 02:00:00 | 3 | 2 | 6 | 0.81 | 2 | 50 | 180 | 1.37 | 1.7 | 87 | 986.68 | 34 | 0.2 | 0.1 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 03:00:00 | 3 | 2 | 5 | 0.81 | 2 | 51 | 203 | 1.36 | 2.2 | 85 | 986.61 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 04:00:00 | 3 | 2 | 5 | 0.85 | 3 | 52 | 203 | 1.47 | 2.3 | 86 | 987.05 | 34 | 1.4 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 05:00:00 | 3 | 2 | 5 | 0.86 | 3 | 51 | 180 | 1.17 | 2.2 | 86 | 986.36 | 34 | 0.8 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 06:00:00 | 3 | 2 | 8 | 0.88 | 2 | 49 | 203 | 1.14 | 2.8 | 82 | 986.05 | 34 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 07:00:00 | 3 | 2 | 9 | 0.86 | 4 | 47 | 203 | 0.84 | 3.3 | 78 | 985.92 | 34 | 0.2 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 08:00:00 | 3 | 2 | 10 | 0.89 | 8 | 46 | 225 | 1.78 | 3.6 | 76 | 986.65 | 34 | 0.2 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 09:00:00 | 2 | 2 | 6 | 0.92 | 8 | 51 | 159 | 1.48 | 1.9 | 85 | 987.42 | 35 | 2.4 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 12:00:00 | 2 | 7 | 31 | 0.36 | 4 | 35 | 225 | 1.44 | 2.7 | 87 | 987.90 | 60 | 0.6 | 0.0 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 16:00:00 | 48 | 11 | 26 | 0.94 | 2 | 29 | 5 | 1.20 | 4.1 | 81 | 991.82 | 44 | 0.6 | 0.0 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 21:00:00 | 4 | 7 | 41 | 0.45 | 15 | 32 | 226 | 0.50 | 4.1 | 78 | 996.81 | 34 | 0.6 | 0.1 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 22:00:00 | 4 | 7 | 36 | 0.59 | 17 | 31 | 180 | 0.38 | 3.2 | 85 | 997.08 | 34 | 0.6 | 0.1 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 02:00:00 | 3 | 2 | 3 | 0.29 | 8 | 55 | 6 | 1.41 | 7.6 | 66 | 998.75 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 03:00:00 | 3 | NA | NA | 0.26 | 15 | 59 | 5 | 2.22 | 8.1 | 62 | 999.56 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 04:00:00 | 4 | NA | NA | 0.26 | 7 | 55 | 5 | 1.05 | 8.1 | 64 | 1000.38 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 05:00:00 | 4 | 3 | 10 | 0.31 | 9 | 46 | 5 | 1.22 | 6.7 | 72 | 1000.93 | 34 | 0.6 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 06:00:00 | 3 | 4 | 21 | 0.30 | 10 | 33 | 6 | 0.88 | 7.1 | 75 | 1001.10 | 34 | 0.4 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 07:00:00 | 3 | 35 | 50 | 0.61 | 12 | 22 | 5 | 1.30 | 6.0 | 83 | 1001.81 | 34 | 1.8 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 10:00:00 | 2 | 12 | 20 | 0.32 | 17 | 30 | 5 | 2.67 | 8.6 | 80 | 1003.40 | 41 | 0.4 | 0.0 | -0.1 | -0.3 | NA | Constitución |
59 MXIL observations between 2008-12-10 and 2008-12-15 from the Constitución station:
kable(neg_MXIL <- air_data_2 %>% filter(MXIL < 0))
| station | station_name | latitude | longitude | date_time_utc | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 | station_alias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-10 23:00:00 | 3 | 4 | 25 | 0.11 | 15 | 29 | 180 | 0.01 | 4.9 | 90 | 1013.39 | 35 | 0.2 | 0.0 | 0.2 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 00:00:00 | 3 | 2 | 13 | 0.10 | 12 | 34 | 203 | 0.73 | 5.2 | 87 | 1012.97 | 35 | 0.0 | 0.0 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 01:00:00 | 3 | 2 | 6 | 0.10 | 7 | 38 | 158 | 0.41 | 5.2 | 84 | 1012.54 | 35 | 0.0 | 0.0 | 0.0 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 02:00:00 | 3 | 2 | 6 | 0.10 | 6 | 38 | 225 | 0.96 | 5.2 | 83 | 1011.63 | 35 | 0.4 | 0.0 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 03:00:00 | 3 | 2 | 3 | 0.10 | 9 | 39 | 203 | 0.96 | 5.5 | 80 | 1011.00 | 35 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 04:00:00 | 3 | 2 | 3 | 0.10 | 7 | 41 | 203 | 0.65 | 5.9 | 79 | 1010.15 | 35 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 05:00:00 | 3 | 2 | 8 | 0.10 | 5 | 38 | 225 | 1.79 | 6.0 | 77 | 1009.54 | 35 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 06:00:00 | 3 | 4 | 23 | 0.76 | 8 | 28 | 225 | 0.49 | 5.9 | 81 | 1009.52 | 35 | 0.4 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-11 09:00:00 | 3 | 21 | 60 | 0.18 | 11 | 9 | 158 | 0.17 | 6.8 | 78 | 1008.84 | 48 | 0.0 | 0.0 | 0.4 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-12 04:00:00 | 4 | 2 | 16 | 0.48 | 12 | 16 | 137 | 0.00 | 6.3 | 81 | 1007.54 | 35 | 0.0 | 0.1 | 0.4 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-12 05:00:00 | 5 | 24 | 35 | 0.71 | 19 | 5 | 5 | 0.00 | 6.4 | 82 | 1008.01 | 34 | 0.0 | 0.0 | 0.8 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 02:00:00 | 10 | 3 | 20 | 0.11 | 11 | 27 | 180 | 0.60 | 9.5 | 55 | 998.36 | 33 | 0.0 | 0.0 | 0.2 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 03:00:00 | 8 | 3 | 17 | 0.10 | 12 | 27 | 203 | 0.07 | 9.1 | 57 | 996.03 | 33 | 0.0 | 0.0 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 04:00:00 | 15 | 3 | 27 | 0.10 | 13 | 19 | 226 | 0.20 | 8.9 | 59 | 994.11 | 33 | 0.0 | 0.0 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 05:00:00 | 10 | 9 | 38 | 0.10 | 21 | 14 | 5 | 0.76 | 8.7 | 68 | 991.20 | 34 | 0.6 | 0.0 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 06:00:00 | 11 | 26 | 56 | 0.13 | 27 | 4 | 6 | 0.00 | 8.2 | 76 | 988.69 | 34 | 0.4 | 0.2 | 0.3 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 10:00:00 | 5 | 5 | 27 | 0.26 | 16 | 36 | 247 | 1.25 | 8.9 | 91 | 986.75 | 38 | 1.8 | 0.1 | 0.3 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 11:00:00 | 3 | 6 | 27 | 0.10 | 5 | 31 | 181 | 1.04 | 8.9 | 85 | 986.22 | 53 | 0.0 | 0.0 | 0.2 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 14:00:00 | 5 | 12 | 39 | 0.84 | 20 | 29 | 248 | 1.22 | 6.9 | 85 | 985.84 | 71 | 1.8 | 0.1 | 0.2 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 15:00:00 | 3 | 8 | 31 | 0.50 | 19 | 33 | 225 | 0.48 | 7.9 | 80 | 985.32 | 64 | 1.8 | 0.1 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 16:00:00 | 4 | 13 | 35 | 0.64 | 16 | 30 | 7 | 1.61 | 6.8 | 82 | 985.91 | 52 | 1.0 | 0.1 | 0.2 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 17:00:00 | 4 | 10 | 37 | 0.70 | 13 | 29 | 248 | 1.41 | 6.0 | 78 | 986.60 | 38 | 0.6 | 0.1 | 0.0 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 18:00:00 | 2 | 6 | 30 | 0.64 | 11 | 36 | 226 | 1.46 | 5.3 | 76 | 986.86 | 34 | 0.0 | 0.1 | 0.0 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 19:00:00 | 2 | 7 | 32 | 0.70 | 8 | 33 | 181 | 1.44 | 5.1 | 77 | 986.86 | 34 | 0.2 | 0.0 | 0.0 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 20:00:00 | 2 | 13 | 45 | 0.82 | 9 | 24 | 158 | 0.44 | 4.8 | 77 | 987.08 | 34 | 0.4 | 0.0 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 22:00:00 | 3 | 7 | 32 | 0.87 | 10 | 29 | 225 | 0.89 | 4.8 | 75 | 986.85 | 34 | 1.0 | 0.1 | 0.2 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-13 23:00:00 | 3 | 2 | 11 | 0.91 | 10 | 44 | 247 | 1.59 | 3.5 | 82 | 987.34 | 34 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 00:00:00 | 2 | 2 | 8 | 0.79 | 5 | 45 | 180 | 0.87 | 3.8 | 77 | 986.57 | 34 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 01:00:00 | 3 | 2 | 10 | 0.84 | 5 | 44 | 226 | 1.58 | 3.4 | 78 | 987.19 | 34 | 2.6 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 02:00:00 | 3 | 2 | 6 | 0.81 | 2 | 50 | 180 | 1.37 | 1.7 | 87 | 986.68 | 34 | 0.2 | 0.1 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 03:00:00 | 3 | 2 | 5 | 0.81 | 2 | 51 | 203 | 1.36 | 2.2 | 85 | 986.61 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 04:00:00 | 3 | 2 | 5 | 0.85 | 3 | 52 | 203 | 1.47 | 2.3 | 86 | 987.05 | 34 | 1.4 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 05:00:00 | 3 | 2 | 5 | 0.86 | 3 | 51 | 180 | 1.17 | 2.2 | 86 | 986.36 | 34 | 0.8 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 06:00:00 | 3 | 2 | 8 | 0.88 | 2 | 49 | 203 | 1.14 | 2.8 | 82 | 986.05 | 34 | 0.0 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 07:00:00 | 3 | 2 | 9 | 0.86 | 4 | 47 | 203 | 0.84 | 3.3 | 78 | 985.92 | 34 | 0.2 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 08:00:00 | 3 | 2 | 10 | 0.89 | 8 | 46 | 225 | 1.78 | 3.6 | 76 | 986.65 | 34 | 0.2 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 09:00:00 | 2 | 2 | 6 | 0.92 | 8 | 51 | 159 | 1.48 | 1.9 | 85 | 987.42 | 35 | 2.4 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 10:00:00 | 2 | 6 | 23 | 0.98 | 6 | 39 | 158 | 1.68 | 1.6 | 88 | 987.88 | 41 | 1.8 | 0.0 | 0.0 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 11:00:00 | 2 | 4 | 18 | 0.95 | 3 | 41 | 226 | 2.01 | 1.8 | 87 | 987.84 | 66 | 0.4 | 0.0 | 0.0 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 12:00:00 | 2 | 7 | 31 | 0.36 | 4 | 35 | 225 | 1.44 | 2.7 | 87 | 987.90 | 60 | 0.6 | 0.0 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 13:00:00 | 2 | 7 | 30 | 0.30 | 5 | 35 | 203 | 1.08 | 2.9 | 87 | 987.93 | 55 | 0.8 | 0.1 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 14:00:00 | 3 | 9 | 30 | 0.38 | 6 | 39 | 6 | 2.88 | 3.0 | 88 | 988.84 | 63 | 2.6 | 0.1 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 15:00:00 | 3 | 10 | 34 | 0.33 | 4 | 34 | 5 | 0.84 | 5.0 | 79 | 990.38 | 82 | 0.4 | 0.1 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 16:00:00 | 48 | 11 | 26 | 0.94 | 2 | 29 | 5 | 1.20 | 4.1 | 81 | 991.82 | 44 | 0.6 | 0.0 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 19:00:00 | 3 | 15 | 59 | 0.62 | 17 | 23 | 5 | 1.07 | 4.3 | 82 | 995.05 | 34 | 0.2 | 0.1 | 0.2 | -0.1 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 20:00:00 | 3 | 13 | 39 | 0.47 | 16 | 39 | 5 | 1.86 | 4.5 | 78 | 995.85 | 34 | 0.2 | 0.1 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 21:00:00 | 4 | 7 | 41 | 0.45 | 15 | 32 | 226 | 0.50 | 4.1 | 78 | 996.81 | 34 | 0.6 | 0.1 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 22:00:00 | 4 | 7 | 36 | 0.59 | 17 | 31 | 180 | 0.38 | 3.2 | 85 | 997.08 | 34 | 0.6 | 0.1 | -0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-14 23:00:00 | 3 | 4 | 28 | 0.55 | 12 | 33 | 159 | 1.07 | 3.2 | 84 | 997.54 | 34 | 0.2 | 0.1 | 0.1 | -0.2 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 00:00:00 | 3 | 3 | 24 | 0.51 | 14 | 32 | 5 | 0.87 | 3.1 | 86 | 998.28 | 34 | 0.4 | 0.1 | 0.0 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 01:00:00 | 3 | 2 | 9 | 0.36 | 18 | 46 | 6 | 0.77 | 6.0 | 73 | 998.51 | 34 | 0.4 | 0.1 | 0.0 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 02:00:00 | 3 | 2 | 3 | 0.29 | 8 | 55 | 6 | 1.41 | 7.6 | 66 | 998.75 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 03:00:00 | 3 | NA | NA | 0.26 | 15 | 59 | 5 | 2.22 | 8.1 | 62 | 999.56 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 04:00:00 | 4 | NA | NA | 0.26 | 7 | 55 | 5 | 1.05 | 8.1 | 64 | 1000.38 | 34 | 0.0 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 05:00:00 | 4 | 3 | 10 | 0.31 | 9 | 46 | 5 | 1.22 | 6.7 | 72 | 1000.93 | 34 | 0.6 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 06:00:00 | 3 | 4 | 21 | 0.30 | 10 | 33 | 6 | 0.88 | 7.1 | 75 | 1001.10 | 34 | 0.4 | 0.0 | -0.2 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 07:00:00 | 3 | 35 | 50 | 0.61 | 12 | 22 | 5 | 1.30 | 6.0 | 83 | 1001.81 | 34 | 1.8 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 10:00:00 | 2 | 12 | 20 | 0.32 | 17 | 30 | 5 | 2.67 | 8.6 | 80 | 1003.40 | 41 | 0.4 | 0.0 | -0.1 | -0.3 | NA | Constitución |
| 1 | Estación Avenida Constitución | 43.52981 | -5.673428 | 2008-12-15 11:00:00 | 2 | 14 | 25 | 0.36 | 20 | 25 | 5 | 2.26 | 8.3 | 84 | 1004.68 | 47 | 0.6 | 0.0 | 0.0 | -0.2 | NA | Constitución |
There are not many such cases. We replace them all with NA and call the summary function again (to do: ask the data owner about this detail).
air_data_2 <- air_data_2 %>% mutate(SO2 = replace(SO2, SO2 < 0, NA),
RS = replace(RS, RS < 0, NA),
TOL = replace(TOL, TOL < 0, NA),
MXIL = replace(MXIL, MXIL < 0, NA))
summary(air_data_2)
## station station_name latitude
## 1 :157727 Estación Avenida Argentina :157798 Min. :43.52
## 10: 74630 Estación Avenida Castilla :157409 1st Qu.:43.53
## 11: 17544 Estación Avenida Constitución :157727 Median :43.54
## 2 :157798 Estación Avenida Hermanos Felgueroso:157666 Mean :43.53
## 3 :157666 Estación de Montevil : 74630 3rd Qu.:43.54
## 4 :157409 Estación Santa Bárbara : 17544 Max. :43.54
##
## longitude date_time_utc SO2
## Min. :-5.699 Min. :2000-01-01 00:00:00 Min. : 0.00
## 1st Qu.:-5.673 1st Qu.:2005-02-25 05:00:00 1st Qu.: 4.00
## Median :-5.672 Median :2010-02-23 11:00:00 Median : 6.00
## Mean :-5.670 Mean :2009-09-06 07:33:13 Mean : 10.12
## 3rd Qu.:-5.658 3rd Qu.:2014-04-09 06:00:00 3rd Qu.: 11.00
## Max. :-5.646 Max. :2018-01-01 00:00:00 Max. :2662.00
## NA's :33796
## NO NO2 CO PM10
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 4.40 1st Qu.: 16.00 1st Qu.: 0.22 1st Qu.: 19.00
## Median : 10.00 Median : 28.00 Median : 0.36 Median : 30.00
## Mean : 21.71 Mean : 32.38 Mean : 0.49 Mean : 36.26
## 3rd Qu.: 23.00 3rd Qu.: 45.00 3rd Qu.: 0.59 3rd Qu.: 46.00
## Max. :1248.00 Max. :1003.20 Max. :58.20 Max. :1000.00
## NA's :17013 NA's :16470 NA's :90390 NA's :88622
## O3 dd vv TMP
## Min. : 0.00 Min. : 0.0 Min. : 0.0 Min. :-40.0
## 1st Qu.: 17.00 1st Qu.: 96.0 1st Qu.: 0.2 1st Qu.: 10.9
## Median : 37.00 Median :159.0 Median : 0.7 Median : 14.7
## Mean : 39.32 Mean :161.8 Mean : 1.0 Mean : 14.6
## 3rd Qu.: 57.00 3rd Qu.:228.0 3rd Qu.: 1.5 3rd Qu.: 18.4
## Max. :998.00 Max. :360.0 Max. :29.8 Max. : 47.4
## NA's :31441 NA's :494134 NA's :493893 NA's :494151
## HR PRB RS LL
## Min. : 0.0 Min. : 800 Min. : 0.0 Min. : 0.0
## 1st Qu.: 69.0 1st Qu.:1007 1st Qu.: 17.0 1st Qu.: 0.0
## Median : 80.0 Median :1013 Median : 46.0 Median : 0.0
## Mean : 78.3 Mean :1012 Mean : 125.2 Mean : 0.1
## 3rd Qu.: 89.0 3rd Qu.:1018 3rd Qu.: 149.0 3rd Qu.: 0.0
## Max. :123.0 Max. :1282 Max. :1470.0 Max. :24.6
## NA's :494176 NA's :494019 NA's :494275 NA's :494124
## BEN TOL MXIL PM25
## Min. : 0.0 Min. : 0.0 Min. : 0.0 Min. : 0.0
## 1st Qu.: 0.1 1st Qu.: 0.4 1st Qu.: 0.2 1st Qu.: 5.0
## Median : 0.3 Median : 1.0 Median : 0.3 Median : 9.0
## Mean : 0.5 Mean : 2.5 Mean : 1.3 Mean : 11.3
## 3rd Qu.: 0.5 3rd Qu.: 2.5 3rd Qu.: 0.9 3rd Qu.: 15.0
## Max. :22.5 Max. :196.0 Max. :220.0 Max. :947.0
## NA's :629358 NA's :629407 NA's :635182 NA's :554185
## station_alias
## Argentina :157798
## Castilla :157409
## Constitución :157727
## H. Felgueroso:157666
## Montevil : 74630
## Santa Bárbara: 17544
##
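The replacement step above can also be written once for all four columns. The following is a minimal sketch on made-up toy data (the values are not taken from the real files), assuming dplyr 1.0 or later for `across()`. It first counts the negative readings per column, so the number of affected cases is on record before they are overwritten.

```r
library(dplyr)

# Toy data standing in for air_data_2 (values are invented for illustration).
toy <- tibble(
  SO2  = c(3, -1, NA, 10),
  RS   = c(35, 34, -5, 0),
  TOL  = c(0.2, -0.1, 0.4, NA),
  MXIL = c(-0.3, 0.1, 0.0, 0.2)
)

# Count the negative readings per column before changing anything.
negatives <- toy %>%
  summarise(across(everything(), ~ sum(.x < 0, na.rm = TRUE)))

# Replace negatives with NA; which() drops the NA comparisons explicitly,
# so existing NAs are left as they are.
cleaned <- toy %>%
  mutate(across(c(SO2, RS, TOL, MXIL), ~ replace(.x, which(.x < 0), NA)))
```

Keeping the `negatives` table around makes it easier to report the exact counts to the data owner later.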
We take a look at the data completeness. What proportion of non-missing values do we have by variable, station and year?
data_completeness <- air_data_2 %>%
group_by(station_alias, year = year(date_time_utc)) %>%
summarise_all(~ round(sum(!is.na(.)) / n(), 2)) %>% # funs() is deprecated since dplyr 0.8; a formula does the same job
select(-c(3:7, 25:28)) # These columns do not have any NAs, so we exclude them.
kable(head(data_completeness, 10))
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Argentina | 2000 | 0.99 | 0.97 | 0.97 | 0.96 | 0.94 | 0.97 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2001 | 0.99 | 0.99 | 0.99 | 0.98 | 0.97 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2002 | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2003 | 0.99 | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2004 | 0.98 | 0.96 | 0.97 | 0.99 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2005 | 0.98 | 0.96 | 0.98 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2006 | 0.92 | 0.90 | 0.92 | 0.92 | 0.93 | 0.93 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2007 | 0.98 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2008 | 0.98 | 0.96 | 0.98 | 0.97 | 0.98 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2009 | 1.00 | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
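The completeness proportion can also be computed by naming the measured columns instead of dropping the others by position. This is a hedged sketch on toy data, assuming dplyr 1.0 or later. The vector `measured` and the toy values are illustrative and not part of the original analysis. `mean(!is.na(x))` gives the same proportion as `sum(!is.na(x)) / n()`.

```r
library(dplyr)
library(lubridate)

# Toy data: two stations, hourly rows, some missing readings.
toy <- tibble(
  station_alias = c("A", "A", "A", "A", "B", "B"),
  date_time_utc = ymd_hms(c("2000-01-01 00:00:00", "2000-01-01 01:00:00",
                            "2001-01-01 00:00:00", "2001-01-01 01:00:00",
                            "2000-01-01 00:00:00", "2000-01-01 01:00:00")),
  SO2  = c(1, NA, 3, 4, NA, NA),
  PM10 = c(10, 20, NA, NA, 5, 6)
)

# Hypothetical list of measured variables; extend it to the real columns.
measured <- c("SO2", "PM10")

# Share of non-missing values per station and year.
completeness <- toy %>%
  group_by(station_alias, year = year(date_time_utc)) %>%
  summarise(across(all_of(measured), ~ round(mean(!is.na(.x)), 2)),
            .groups = "drop")
```

Selecting by name keeps the result stable if columns are later added to or removed from the dataset.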
We are going to check the data completeness by station:
Constitución: Data has been recorded for the variables SO2, NO, NO2, CO, PM10, O3, dd, vv, TMP, HR, PRB, RS and LL since the year 2000. There are measurements of BEN, TOL and MXIL since 2006 (only 1% complete that year). PM25 particles have been monitored since 2008 (only 2% of that year is covered). During 2008 the completeness of several variables (dd, vv, TMP, HR, PRB, RS, LL, BEN, TOL and MXIL) drops to 88% (to do: check that this was not caused by a data import problem).
constitucion_data <- data_completeness %>% filter(station_alias == 'Constitución')
kable(constitucion_data)
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Constitución | 2000 | 0.97 | 0.95 | 0.95 | 0.97 | 0.92 | 0.93 | 0.96 | 0.98 | 0.96 | 0.95 | 0.97 | 0.95 | 0.96 | 0.00 | 0.00 | 0.00 | 0.00 |
| Constitución | 2001 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Constitución | 2002 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Constitución | 2003 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.00 | 0.00 | 0.00 | 0.00 |
| Constitución | 2004 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Constitución | 2005 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.00 | 0.00 | 0.00 | 0.00 |
| Constitución | 2006 | 0.91 | 0.91 | 0.91 | 0.90 | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 | 0.01 | 0.01 | 0.01 | 0.00 |
| Constitución | 2007 | 0.98 | 0.99 | 0.99 | 0.97 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.00 |
| Constitución | 2008 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.02 |
| Constitución | 2009 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Constitución | 2010 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 |
| Constitución | 2011 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 | 0.98 | 0.99 |
| Constitución | 2012 | 0.97 | 0.97 | 0.97 | 0.96 | 0.97 | 0.96 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.96 | 0.96 | 0.96 | 0.97 |
| Constitución | 2013 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 1.00 |
| Constitución | 2014 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 1.00 |
| Constitución | 2015 | 0.98 | 0.98 | 0.98 | 0.98 | 0.99 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 | 0.32 | 0.98 |
| Constitución | 2016 | 0.95 | 0.95 | 0.95 | 0.95 | 0.95 | 0.95 | 0.98 | 0.98 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.90 | 0.90 | 0.90 | 0.95 |
| Constitución | 2017 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 1.00 |
| Constitución | 2018 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Argentina: data since the year 2000. Variables: SO2, NO, NO2, CO, PM10 and O3.
argentina_data <- data_completeness %>% filter(station_alias == 'Argentina')
kable(argentina_data)
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Argentina | 2000 | 0.99 | 0.97 | 0.97 | 0.96 | 0.94 | 0.97 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2001 | 0.99 | 0.99 | 0.99 | 0.98 | 0.97 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2002 | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2003 | 0.99 | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2004 | 0.98 | 0.96 | 0.97 | 0.99 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2005 | 0.98 | 0.96 | 0.98 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2006 | 0.92 | 0.90 | 0.92 | 0.92 | 0.93 | 0.93 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2007 | 0.98 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2008 | 0.98 | 0.96 | 0.98 | 0.97 | 0.98 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2009 | 1.00 | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2010 | 0.99 | 0.99 | 1.00 | 0.99 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2011 | 0.98 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2012 | 0.99 | 0.96 | 0.96 | 0.96 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2013 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2014 | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2015 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2016 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2017 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Argentina | 2018 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
H. Felgueroso: data since the year 2000. Variables: SO2, NO, NO2, CO, PM10 and O3. During 2006 the completeness of the data drops to between 87% and 90% (to do: check that this was not caused by a data import problem).
felgueroso_data <- data_completeness %>% filter(station_alias == 'H. Felgueroso')
kable(felgueroso_data)
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| H. Felgueroso | 2000 | 0.97 | 0.96 | 0.96 | 0.97 | 0.96 | 0.96 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2001 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2002 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2003 | 0.98 | 0.98 | 0.98 | 0.97 | 0.98 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2004 | 0.98 | 0.97 | 0.97 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2005 | 0.97 | 0.96 | 0.96 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2006 | 0.88 | 0.87 | 0.87 | 0.90 | 0.90 | 0.90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2007 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2008 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2009 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2010 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2011 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2012 | 0.96 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2013 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2014 | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2015 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2016 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2017 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| H. Felgueroso | 2018 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Castilla: data since the year 2000. Variables: SO2, NO, NO2, CO, PM10 and O3. During 2015 the completeness of the data drops to about 77% (to do: check that this was not caused by a data import problem).
castilla_data <- data_completeness %>% filter(station_alias == 'Castilla')
kable(castilla_data)
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Castilla | 2000 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.95 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2001 | 0.98 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2002 | 0.99 | 0.99 | 0.99 | 0.97 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2003 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2004 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2005 | 0.99 | 0.95 | 0.95 | 0.98 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2006 | 0.91 | 0.91 | 0.91 | 0.91 | 0.92 | 0.93 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2007 | 0.99 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2008 | 0.95 | 0.96 | 0.96 | 0.95 | 0.96 | 0.96 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2009 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2010 | 0.92 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2011 | 0.97 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2012 | 0.97 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2013 | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2014 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2015 | 0.77 | 0.76 | 0.76 | 0.77 | 0.76 | 0.77 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2016 | 0.98 | 0.99 | 0.99 | 0.99 | 0.97 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2017 | 0.97 | 0.99 | 0.99 | 0.99 | 0.98 | 0.97 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Castilla | 2018 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Montevil: data since the year 2009. Variables: SO2, NO, NO2, O3, dd, vv, TMP, HR, PRB, RS, LL and PM25.
montevil_data <- data_completeness %>% filter(station_alias == 'Montevil')
kable(montevil_data)
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Montevil | 2009 | 0.91 | 0.93 | 0.93 | 0 | 0 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0 | 0 | 0 | 0.93 |
| Montevil | 2010 | 0.99 | 1.00 | 1.00 | 0 | 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0.92 |
| Montevil | 2011 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
| Montevil | 2012 | 1.00 | 1.00 | 1.00 | 0 | 0 | 1.00 | 0.98 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
| Montevil | 2013 | 1.00 | 1.00 | 1.00 | 0 | 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
| Montevil | 2014 | 1.00 | 1.00 | 1.00 | 0 | 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
| Montevil | 2015 | 0.99 | 1.00 | 1.00 | 0 | 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
| Montevil | 2016 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
| Montevil | 2017 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0 | 0 | 0 | 0.99 |
| Montevil | 2018 | 1.00 | 1.00 | 1.00 | 0 | 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 1.00 |
Santa Bárbara: data since the year 2016. Variables: NO, NO2, CO, PM10 and PM25 (the table below shows no O3 or SO2 measurements for this station).
barbara_data <- data_completeness %>% filter(station_alias == 'Santa Bárbara')
kable(barbara_data)
| station_alias | year | SO2 | NO | NO2 | CO | PM10 | O3 | dd | vv | TMP | HR | PRB | RS | LL | BEN | TOL | MXIL | PM25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Santa Bárbara | 2016 | 0 | 0.97 | 0.97 | 0.98 | 0.98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.98 |
| Santa Bárbara | 2017 | 0 | 0.98 | 0.98 | 0.99 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.00 |
| Santa Bárbara | 2018 | 0 | 1.00 | 1.00 | 1.00 | 1.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.00 |
All the stations have 2018 data, but only 6 observations in total. We drop them to avoid problems when visualising the data.
observations_per_year <- air_data_2 %>% group_by(year = year(date_time_utc)) %>%
summarise(n = n())
kable(observations_per_year)
| year | n |
|---|---|
| 2000 | 35136 |
| 2001 | 35040 |
| 2002 | 35040 |
| 2003 | 35040 |
| 2004 | 35136 |
| 2005 | 35040 |
| 2006 | 34939 |
| 2007 | 34921 |
| 2008 | 35136 |
| 2009 | 39541 |
| 2010 | 43800 |
| 2011 | 43800 |
| 2012 | 43920 |
| 2013 | 43800 |
| 2014 | 43800 |
| 2015 | 43416 |
| 2016 | 52703 |
| 2017 | 52560 |
| 2018 | 6 |
air_data_2$year <- year(air_data_2$date_time_utc)
air_data_2 <- air_data_2 %>% filter(year != 2018) # year is numeric, so we compare against a number
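The yearly row counts in the table above can be checked against the number of hours in each year. The sketch below is plain arithmetic. The helper names are hypothetical, and the number of active stations per year is an assumption read off the completeness tables (four stations from 2000, Montevil added in 2009, Santa Bárbara in 2016).

```r
# Hours in a calendar year: 8784 in leap years, 8760 otherwise.
hours_in_year <- function(year) {
  leap <- (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
  ifelse(leap, 8784, 8760)
}

# Expected number of hourly rows if `n_stations` report during the whole year.
expected_rows <- function(year, n_stations) {
  hours_in_year(year) * n_stations
}

expected_rows(2000, 4)  # 35136, as in the table
expected_rows(2001, 4)  # 35040, as in the table
expected_rows(2010, 5)  # 43800, as in the table
expected_rows(2017, 6)  # 52560, as in the table
```

Years such as 2006, 2007 and 2015 fall slightly short of the expected count, which would suggest that some hourly rows are absent from the files altogether rather than present with NA values.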
We add several more time variables to the dataset.
air_data_2$month <- month(air_data_2$date_time_utc)
air_data_2$year_month_day <- as_date(air_data_2$date_time_utc) # as_date() drops the time part of a date-time; ymd() is meant for parsing text
air_data_2$week_day <- wday(air_data_2$date_time_utc, week_start = getOption("lubridate.week.start", 1))
air_data_2$hour <- hour(air_data_2$date_time_utc)
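As a quick check of what these derived variables contain, here is a small sketch that applies the same lubridate functions to two timestamps taken from the Constitución table shown earlier. The data frame name `time_parts` is only illustrative.

```r
library(lubridate)

# Two hourly timestamps from the data, parsed as UTC.
ts <- ymd_hms(c("2008-12-10 23:00:00", "2008-12-14 06:00:00"), tz = "UTC")

time_parts <- data.frame(
  year           = year(ts),
  month          = month(ts),
  year_month_day = as_date(ts),               # date without the time part
  week_day       = wday(ts, week_start = 1),  # 1 = Monday, ..., 7 = Sunday
  hour           = hour(ts)
)
```

With `week_start = 1`, the first timestamp (a Wednesday) gets `week_day` 3 and the second (a Sunday) gets 7.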
We take a look at the general trend of several indicators over the last 18 years.
# We compute the yearly mean of the pollutant levels.
year_avgs <- air_data_2 %>% select(station_alias, date_time_utc, PM10, PM25, SO2, NO2, NO, O3, BEN, CO, MXIL, TOL) %>%
group_by(station_alias, year = year(date_time_utc)) %>%
summarise_all(~ mean(., na.rm = TRUE)) %>%
select(-date_time_utc) # We drop this variable, since its mean is meaningless.
# We convert the table to long format.
year_avgs_long <- gather(year_avgs, contaminante, value, 3:length(year_avgs)) %>%
filter(!(station_alias == 'Constitución' & year == 2006 & contaminante %in% c('BEN', 'MXIL', 'TOL'))) %>% # We filter out these values because they are only 1% complete.
filter(!(station_alias == 'Constitución' & year == 2008 & contaminante == 'PM25')) # We filter out this value because it is only 2% complete.
# We present the data in a grid of graphs
ggplot(year_avgs_long, aes(x = year, y = value)) +
geom_line() +
facet_grid(contaminante~station_alias,scales="free_y") +
theme(axis.text = element_text(size = 6))
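One detail of the wide-to-long step is worth illustrating: `mean(x, na.rm = TRUE)` returns `NaN` when a station has no measurements of a pollutant in a given year, so the long table contains `NaN` rows that the plot cannot draw. The sketch below uses toy values and `tidyr::pivot_longer()` (tidyr 1.0 or later) as an alternative to `gather()`. It is an illustration, not the code used for the figure above.

```r
library(dplyr)
library(tidyr)

# Toy yearly means; NaN marks a station-year without any measurement.
year_avgs_toy <- tibble(
  station_alias = c("A", "A", "B"),
  year = c(2000, 2001, 2000),
  PM10 = c(40, 38, 25),
  NO2  = c(30, NaN, 20)
)

# Long format: one row per station, year and pollutant, without the NaN rows.
year_avgs_long_toy <- year_avgs_toy %>%
  pivot_longer(cols = -c(station_alias, year),
               names_to = "contaminante", values_to = "value") %>%
  filter(!is.nan(value))
```

Filtering the `NaN` rows explicitly makes it clear which station-year combinations are missing from each panel.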
We drop the Santa Bárbara and Montevil stations. These stations have much less data, and the behavior of their variables is significantly different (they are suburban stations). So, we leave them out of the analysis for now.
air_data_3 <- air_data_2 %>% filter(station_alias != 'Montevil' ,
station_alias != 'Santa Bárbara' )
# We compute the yearly mean of the pollutant levels.
year_avgs <- air_data_3 %>% select(station_alias, date_time_utc, PM10, PM25, SO2, NO2, NO, O3, BEN, CO, MXIL, TOL) %>%
group_by(station_alias, year = year(date_time_utc)) %>%
summarise_all(~ mean(., na.rm = TRUE)) %>%
select(-date_time_utc) # We drop this variable now, since its mean is meaningless.
# We convert the table to long format.
year_avgs_long <- gather(year_avgs, contaminante, value, 3:length(year_avgs)) %>%
filter(!(station_alias == 'Constitución' & year == 2006 & contaminante %in% c('BEN', 'MXIL', 'TOL'))) %>% # We filter out these values because they are only 1% complete.
filter(!(station_alias == 'Constitución' & year == 2008 & contaminante == 'PM25')) # We filter out this value because it is only 2% complete.
# We present the data in a grid of graphs
ggplot(year_avgs_long, aes(x = year, y = value)) +
geom_line() +
facet_grid(contaminante~station_alias,scales="free_y") +
theme(axis.text = element_text(size = 6))
# We compute the mean of the pollutant levels by hour of the day.
hour_avgs <- air_data_3 %>% select(station_alias, hour, PM10, PM25, SO2, NO2, NO, O3, BEN, CO, MXIL, TOL) %>%
group_by(station_alias, hour) %>%
summarise_all(~ mean(., na.rm = TRUE))
# We convert the table to long format
hour_avgs_long <- gather(hour_avgs, contaminante, value, 3:length(hour_avgs))
# We present the data in a grid of graphs
ggplot(hour_avgs_long, aes(x = hour, y = value)) +
geom_line() +
facet_grid(contaminante~station_alias,scales="free_y") +
theme(axis.text = element_text(size = 6))
# We compute the mean of the pollutant levels by month of the year.
month_avgs <- air_data_3 %>% select(station_alias, month, PM10, PM25, SO2, NO2, NO, O3, BEN, CO, MXIL, TOL) %>%
group_by(station_alias, month) %>%
summarise_all(~ mean(., na.rm = TRUE))
# We convert the table to long format
month_avgs_long <- gather(month_avgs, contaminante, value, 3:length(month_avgs))
# We present the data in a grid of graphs
ggplot(month_avgs_long, aes(x = month, y = value)) +
geom_line() +
facet_grid(contaminante~station_alias,scales="free_y") +
theme(axis.text = element_text(size = 6))
# We compute the mean pollutant levels for each day of the week.
week_day_avgs <- air_data_3 %>% select(station_alias, week_day, PM10, PM25, SO2, NO2, NO, O3, BEN, CO, MXIL, TOL) %>%
group_by(station_alias, week_day) %>%
summarise_all(mean, na.rm = TRUE)
# We convert the table to long format
week_day_avgs_long <- gather(week_day_avgs, contaminante, value, 3:length(week_day_avgs))
# We present the data in a grid of graphs
ggplot(week_day_avgs_long, aes(x = week_day, y = value)) +
geom_line() +
facet_grid(contaminante~station_alias,scales="free_y") +
theme(axis.text = element_text(size = 6))
We are going to use the ARIMA method as the base model for our predictions. As a first step, we are going to try to predict the values of the PM10 pollutant for the Constitución station.
We create the dataset pm10 with the PM10 values from the Constitución station and print a summary.
pm10 <- air_data_3 %>% filter(station_alias == 'Constitución') %>%
select(date_time_utc, PM10)
summary(pm10)
## date_time_utc PM10
## Min. :2000-01-01 00:00:00 Min. : 0.00
## 1st Qu.:2004-06-30 23:15:00 1st Qu.: 19.00
## Median :2009-01-02 00:30:00 Median : 29.00
## Mean :2008-12-31 18:50:45 Mean : 34.39
## 3rd Qu.:2013-07-02 23:45:00 3rd Qu.: 44.00
## Max. :2017-12-31 23:00:00 Max. :888.00
## NA's :3106
25% of the values are between 44.00 and 888.00, and 888.00 is a really extreme value. How many extreme values (outliers) do we have in this series? We plot all the values to visualise this:
ggplot(pm10, aes(x = date_time_utc, y = PM10)) +
geom_point(alpha = 0.1)
We have very few values greater than 250, so outliers do not seem to be a problem (Pending: is a PM10 level of 888 plausible, or is it likely to be a monitoring error?).
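The question of how many extreme values we have can also be answered with a count instead of a plot. The sketch below uses a toy vector in place of pm10$PM10. The helper `count_above` is a hypothetical name introduced here, and 250 is simply the threshold mentioned in the text.

```r
# Hypothetical helper: number and percentage of non-missing readings above a threshold.
count_above <- function(x, threshold) {
  above <- x > threshold
  c(n   = sum(above, na.rm = TRUE),
    pct = 100 * mean(above, na.rm = TRUE))
}

# Toy readings standing in for pm10$PM10 (assumption).
toy_pm10 <- c(19, 29, 44, 260, NA, 888, 31, 12)
count_above(toy_pm10, 250)
```

On the real series the call would be `count_above(pm10$PM10, 250)`.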
Daily averages
We create a new dataset with the PM10 daily averages and plot them in a new graph, adding a trend line. There is a clear downward trend in the measurements, and there are far fewer extreme values during the last decade. The data seem to show two very clear “epochs”, before and after the year 2008.
pm10_day_avg <- pm10 %>% group_by(day = date(date_time_utc)) %>%
summarise(day_avg = mean(PM10, na.rm = TRUE))
ggplot(pm10_day_avg, aes(x = day, y = day_avg, colour = day_avg)) +
geom_point(alpha = 0.5) +
geom_smooth(color = "grey", alpha = 0.2) +
scale_colour_gradientn(colours = terrain.colors(10)) +
theme(legend.position = c(0.3, 0.9),
legend.background = element_rect(colour = "transparent", fill = NA), legend.direction = "horizontal") +
labs(colour = "PM10 daily average (colour scale)", x = "Year", y = "PM10 daily average", title = "PM10 daily average - 2000-2017 evolution (Constitución Station)")
The last graph shows a very clear trend through the years. But, as we already saw in the grid graphs, other things are happening at the same time: the levels also vary by month, by day of the week and by hour of the day.
year_const <- year_avgs_long %>% filter(station_alias == "Constitución", contaminante == 'PM10')
plot1 <- ggplot(year_const, aes(x = year, y = value)) +
geom_line()
month_const <- month_avgs_long %>% filter(station_alias == "Constitución", contaminante == 'PM10')
plot2 <- ggplot(month_const, aes(x = month, y = value)) +
geom_line()
week_day_const <- week_day_avgs_long %>% filter(station_alias == "Constitución", contaminante == 'PM10')
plot3 <- ggplot(week_day_const, aes(x = week_day, y = value)) +
geom_line()
hour_const <- hour_avgs_long %>% filter(station_alias == "Constitución", contaminante == 'PM10')
plot4 <- ggplot(hour_const, aes(x = hour, y = value)) +
geom_line()
grid.arrange(plot1, plot2, plot3, plot4, ncol = 2)
As a first step we are going to try to predict the monthly levels of PM10. We create a time series object with the monthly averages.
year_month_pm10 <- pm10 %>% group_by(year = year(date_time_utc), month = month(date_time_utc)) %>%
summarise(year_month_avg = mean(PM10, na.rm = TRUE))
year_month_pm10 <- year_month_pm10 %>% unite("year_month", c("year", "month"), sep = "-")
pm10_month_ts <- ts(year_month_pm10$year_month_avg, start = 2000, frequency = 12)
# We create another time series object with the period 2000-2008.
# The year is the first four characters of year_month; comparing it as a number avoids relying on string ordering.
year_month_pm10_1 <- year_month_pm10 %>% filter(as.integer(str_sub(year_month, 1, 4)) <= 2008)
pm10_1st_ts <- ts(year_month_pm10_1$year_month_avg, start = 2000, frequency = 12)
# We create another time series object with the period 2009-2017
year_month_pm10_2 <- year_month_pm10 %>% filter(as.integer(str_sub(year_month, 1, 4)) >= 2009)
pm10_2nd_ts <- ts(year_month_pm10_2$year_month_avg, start = 2009, frequency = 12)
The time series has 216 observations: 18 years x 12 months (2000-2017).
glimpse(pm10_month_ts)
## Time-Series [1:216] from 2000 to 2018: 61.3 62.2 62.3 41.1 50.4 ...
We plot the complete series. As we saw before with the daily averages, we identify two obvious things at first sight: The general downward trend during the whole series and at least two very different cycles in the data, the first between 2000 and 2007, and the second from 2008 to the end of the series. Beyond this, the first cycle seems to be much more irregular than the second one. This fact can play an important role in choosing the period to fit a prediction model.
autoplot(pm10_month_ts)
We plot a seasonal plot for all the data but it is not very easy to read.
ggseasonplot(pm10_month_ts, year.labels=TRUE, year.labels.left=TRUE) +
ylab("PM10") +
ggtitle("Seasonal plot: PM10 - Constitución")
If we reduce the period to 2009-2017, we can observe the seasonal pattern of the data more easily. The year starts with low PM10 levels (January and February). We have a peak in March and a decrease in April-May (with some exceptions). The levels usually increase during September and October and fall in November. And we finish the year with another increase in December.
ggseasonplot(pm10_2nd_ts, year.labels=TRUE, year.labels.left=TRUE) +
ylab("PM10") +
ggtitle("Seasonal plot: PM10 - Constitución (2009-2017)")
pm10_month_ts %>% ggtsdisplay(main="")
As the series is non-stationary, we take first differences with the diff function.
pm10_diff <- diff(pm10_month_ts)
We plot the differenced series and the ACF and PACF plots
pm10_diff %>% ggtsdisplay(main="")
The auto-correlation graphs show certain patterns which could be caused by seasonal effects.
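Whether differencing is needed, and how many times, can also be checked with tests rather than by eye. The forecast package (loaded by fpp2) provides ndiffs() and nsdiffs() for this. The sketch below runs them on a simulated monthly series. `toy_ts` is an assumption standing in for pm10_month_ts, so the result says nothing about the real data.

```r
library(forecast)

set.seed(1)
# Simulated monthly random walk (assumption), same length as pm10_month_ts.
toy_ts <- ts(50 + cumsum(rnorm(216)), start = 2000, frequency = 12)

d <- ndiffs(toy_ts)   # suggested number of first differences
D <- nsdiffs(toy_ts)  # suggested number of seasonal differences

c(d = d, D = D)
```

With the real series the calls would be `ndiffs(pm10_month_ts)` and `nsdiffs(pm10_month_ts)`. A non-zero D would support the idea that the patterns in the auto-correlation graphs are seasonal.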
We generate an ARIMA model with the auto.arima function.
fit <- auto.arima(pm10_month_ts)
fit
## Series: pm10_month_ts
## ARIMA(1,1,2)(2,0,0)[12]
##
## Coefficients:
## ar1 ma1 ma2 sar1 sar2
## -0.5343 0.0470 -0.4579 0.1121 0.1403
## s.e. 0.1737 0.1676 0.0855 0.0705 0.0717
##
## sigma^2 estimated as 38.21: log likelihood=-694.76
## AIC=1401.52 AICc=1401.93 BIC=1421.75
fit <- auto.arima(pm10_month_ts, approximation = FALSE, stepwise = FALSE)
fit
## Series: pm10_month_ts
## ARIMA(4,1,1)
##
## Coefficients:
## ar1 ar2 ar3 ar4 ma1
## 0.3995 0.0555 0.2414 -0.2766 -0.8703
## s.e. 0.0933 0.0784 0.0762 0.0711 0.0783
##
## sigma^2 estimated as 35.71: log likelihood=-687.41
## AIC=1386.83 AICc=1387.23 BIC=1407.05
fit <- auto.arima(pm10_2nd_ts, approximation = FALSE, stepwise = FALSE)
fit
## Series: pm10_2nd_ts
## ARIMA(0,1,2)
##
## Coefficients:
## ma1 ma2
## -0.5468 -0.3444
## s.e. 0.0934 0.0935
##
## sigma^2 estimated as 19.94: log likelihood=-311.64
## AIC=629.27 AICc=629.51 BIC=637.29
autoplot(forecast(fit))
fit <- auto.arima(pm10_2nd_ts)
fit
## Series: pm10_2nd_ts
## ARIMA(1,1,1)(1,0,0)[12]
##
## Coefficients:
## ar1 ma1 sar1
## 0.3278 -0.9384 0.1470
## s.e. 0.1046 0.0401 0.0991
##
## sigma^2 estimated as 19.82: log likelihood=-310.9
## AIC=629.8 AICc=630.19 BIC=640.49
autoplot(forecast(fit))
fit <- auto.arima(pm10_2nd_ts, approximation = FALSE)
fit
## Series: pm10_2nd_ts
## ARIMA(1,1,1)(1,0,0)[12]
##
## Coefficients:
## ar1 ma1 sar1
## 0.3278 -0.9384 0.1470
## s.e. 0.1046 0.0401 0.0991
##
## sigma^2 estimated as 19.82: log likelihood=-310.9
## AIC=629.8 AICc=630.19 BIC=640.49
autoplot(forecast(fit, h = 12))
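auto.arima ranks the candidate models by information criteria, but it does not tell us whether the residuals of the selected model behave like white noise. One common complementary check is a Ljung-Box test on the residuals. The sketch below is only an illustration on a simulated series: `toy_ts`, `toy_fit` and the choice of 24 lags are assumptions, not part of the analysis.

```r
library(forecast)

set.seed(42)
# Simulated ARIMA(1,1,1) series (assumption): 108 months, like 2009-2017.
toy_ts <- ts(arima.sim(list(order = c(1, 1, 1), ar = 0.3, ma = -0.6), n = 107),
             start = 2009, frequency = 12)

toy_fit <- Arima(toy_ts, order = c(1, 1, 1), method = "ML")

# Ljung-Box test on the residuals; fitdf = number of estimated ARMA coefficients.
lb <- Box.test(residuals(toy_fit), lag = 24, type = "Ljung-Box", fitdf = 2)
lb$p.value
```

A small p-value would suggest that some auto-correlation is left in the residuals, which is a sign that the model is missing part of the structure of the series.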
We think that the models selected by the auto.arima function generate very flat forecasts. The forecast line for 2018 does not seem to capture the variability of the past series. We set the seasonal MA order Q to 1 to try to get a better result.
fit2 <- Arima(pm10_2nd_ts, order=c(1,1,1), seasonal=c(1,0,1))
fit2
## Series: pm10_2nd_ts
## ARIMA(1,1,1)(1,0,1)[12]
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.3722 -0.9463 0.9988 -0.9747
## s.e. 0.1023 0.0405 0.0093 0.0981
##
## sigma^2 estimated as 16.5: log likelihood=-306.62
## AIC=623.25 AICc=623.84 BIC=636.61
The shape of this new forecast seems to be closer to the historical data (Pending: generate more models and compare their metrics).
autoplot(forecast(fit2, h = 12))
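The pending comparison could start from a small table of information criteria, one row per candidate model. The sketch below does this on a simulated series. `toy_ts`, `candidates` and `comparison` are illustrative names, and the candidate orders are assumptions chosen only to show the mechanics. AICc and BIC are only comparable between models fitted to the same series with the same orders of differencing, which is why every candidate here uses d = 1.

```r
library(forecast)

set.seed(123)
# Simulated ARIMA(1,1,1) series (assumption): 108 months, like 2009-2017.
toy_ts <- ts(arima.sim(list(order = c(1, 1, 1), ar = 0.3, ma = -0.6), n = 107),
             start = 2009, frequency = 12)

# Candidate non-seasonal orders, all with d = 1.
candidates <- list(
  "ARIMA(0,1,2)" = c(0, 1, 2),
  "ARIMA(1,1,1)" = c(1, 1, 1),
  "ARIMA(2,1,1)" = c(2, 1, 1)
)

# Fit each candidate and collect its criteria in one data frame.
comparison <- do.call(rbind, lapply(names(candidates), function(nm) {
  fit_i <- Arima(toy_ts, order = candidates[[nm]], method = "ML")
  data.frame(model = nm, AICc = fit_i$aicc, BIC = fit_i$bic, sigma2 = fit_i$sigma2)
}))

comparison[order(comparison$AICc), ]
```

Seasonal candidates such as the two models fitted above could be added by also passing a `seasonal` argument to Arima().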
But we do not have 2018 data yet to test these forecasts. So, in order to test our models, we will have to divide the data into two groups, train and test.
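A minimal sketch of such a split is shown below on a simulated series. Holding out the last year (2017) as the test set is an assumption made for the example, as are the names `toy_ts`, `train`, `test` and `fit_train`. With the real data, `toy_ts` would be replaced by pm10_2nd_ts.

```r
library(forecast)

set.seed(2018)
# Simulated ARIMA(1,1,1) series (assumption): 108 months, 2009-2017.
toy_ts <- ts(arima.sim(list(order = c(1, 1, 1), ar = 0.3, ma = -0.6), n = 107),
             start = 2009, frequency = 12)

train <- window(toy_ts, end = c(2016, 12))    # 2009-2016, used to fit the model
test  <- window(toy_ts, start = c(2017, 1))   # 2017, held out for evaluation

fit_train <- Arima(train, order = c(1, 1, 1), method = "ML")
fc <- forecast(fit_train, h = length(test))

# Error measures (RMSE, MAE, MAPE, ...) on the training set and on the test set.
accuracy(fc, test)
```

The test-set row of the accuracy() output is the one to compare between models, because the training-set errors tend to favour the more complex model.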