We can download and process the air quality dataset as shown below.
library(dplyr); library(lubridate)
download.file("https://aqs.epa.gov/aqsweb/airdata/annual_conc_by_monitor_2017.zip",
              destfile = "annual_conc_by_monitor_2017.zip")
unzip("annual_conc_by_monitor_2017.zip")
# Read the CSV, keep three pollutants, and parse the timestamp of each annual maximum
anc_sub <- as_tibble(read.csv("annual_conc_by_monitor_2017.csv")) %>%
  filter(Parameter.Name %in% c("PM2.5 - Local Conditions", "Ozone", "Sulfur dioxide")) %>%
  mutate(X1st.Max.DateTime = ymd_hm(X1st.Max.DateTime))
We can then apply calculations, such as fitting a linear model, to the resulting subset.
my_model <- with(anc_sub, lm(X1st.Max.Value ~ X1st.Max.DateTime))
summary(my_model)$coefficients
##                        Estimate   Std. Error   t value     Pr(>|t|)
## (Intercept)        4.412931e+03 6.801735e+02  6.487949 9.202738e-11
## X1st.Max.DateTime -2.952874e-06 4.573397e-07 -6.456630 1.131112e-10
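The model above pools all three pollutants into a single fit. As a minimal sketch of how the same calculation could be run separately for each subset, the base R example below fits one model per pollutant. The `toy` data frame and its values are hypothetical stand-ins for `anc_sub`, made up for illustration only; the names `toy`, `fits`, and `slopes` are not part of the original code.

```r
# Hypothetical stand-in for anc_sub: made-up values, for illustration only.
toy <- data.frame(
  Parameter.Name = rep(c("Ozone", "Sulfur dioxide"), each = 4),
  X1st.Max.DateTime = as.POSIXct(
    rep(c("2017-01-15 10:00", "2017-04-15 10:00",
          "2017-07-15 10:00", "2017-10-15 10:00"), times = 2),
    tz = "UTC"
  ),
  X1st.Max.Value = c(0.04, 0.05, 0.07, 0.05, 12, 9, 7, 8)
)

# Split the rows by pollutant and fit one linear model per group.
fits <- lapply(split(toy, toy$Parameter.Name), function(d) {
  lm(X1st.Max.Value ~ X1st.Max.DateTime, data = d)
})

# Collect the slope of each fit (change in value per second of date-time).
slopes <- sapply(fits, function(m) coef(m)[["X1st.Max.DateTime"]])
slopes
```

With the real data, replacing `toy` with `anc_sub` should give one fit per value of `Parameter.Name`, which avoids mixing pollutants that are reported in different units.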
This approach is resource-intensive, because the download and all of the computation run on our side, and it requires a fair amount of code.