Wrangle OpenAQ extraordinary data

M. Salmon

Intro: who am I

Who am I?

  • BSc Biology, MSc Ecology, MPH, PhD in statistics

  • Data manager and statistician for the CHAI project

  • An open-data and open-source enthusiast

What I am NOT

Christa said I was an “OpenAQ data wrangler extraordinaire”… LIES!

But I like to wrangle OpenAQ extraordinary data!

Why get OpenAQ data?

A treasure from New Zealand

No! R!

Language for statistical computing

R capabilities

  • Data wrangling, modeling, visualization

  • Reporting, interactive plots and apps

  • Collaborative effort like Wikipedia

Bringing OpenAQ to R!

What is a package?

  • An add-in to R

  • A collection of code and documentation

ropenaq

library("purrr")
library("dplyr")
library("ropenaq")

# First request: only used to read how many measurements exist
first_page <- aq_measurements(city = "Sarajevo",
                              country = "BA")
count <- attr(first_page, "meta")$found

# Fetch every page (10,000 rows each) and stack them into one table
sarajevo <- seq_len(ceiling(count / 10000)) %>%
  map(function(page) {
    aq_measurements(city = "Sarajevo",
                    country = "BA",
                    limit = 10000,
                    page = page)
  }) %>%
  bind_rows()
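
The page arithmetic in the loop above can be checked offline, without calling the API. This is a minimal sketch in base R. `fake_fetch` is a hypothetical stand-in for a paged request and is not part of ropenaq, and the record count and page size are made up.

```r
# Hypothetical stand-in for a paged API call: returns one page of row ids.
# Not part of ropenaq; it only mimics the limit/page behaviour.
fake_fetch <- function(found, limit, page) {
  start <- (page - 1) * limit + 1
  if (start > found) return(data.frame(id = integer(0)))
  data.frame(id = start:min(found, page * limit))
}

found <- 25                        # made-up total number of records
limit <- 10                        # made-up page size
n_pages <- ceiling(found / limit)  # pages needed to cover all records

# Request each page in turn, then stack the pages into one data frame
pages <- lapply(seq_len(n_pages), function(page) {
  fake_fetch(found, limit, page)
})
all_rows <- do.call(rbind, pages)
```

`seq_len()` gives an empty sequence when nothing is found, whereas `1:0` would count down from 1 to 0 and request two pages.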

Quality control

  • Review of the software

  • Improvement for robustness and ease of use

  • Part of an open science project!

landscape

Some wrangling cases

Map of all OpenAQ locations

Delhi PM2.5 daily time series

vs. Indian health categories

Delhi PM2.5 vs. WHO limit
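
A comparison like this one boils down to averaging measurements per day and checking each daily mean against a threshold. The sketch below uses base R and synthetic numbers, not real Delhi data. The column names `dateLocal` and `value` are assumptions about the shape of the measurement table. The threshold of 25 µg/m³ is the 24-hour PM2.5 value from the 2005 WHO guidelines and is also an assumption about which limit applies.

```r
# Synthetic hourly PM2.5 values in ug/m3 for two days (made-up numbers)
hourly <- data.frame(
  dateLocal = as.POSIXct("2016-01-01 00:00:00", tz = "UTC") + 3600 * (0:47),
  value = c(rep(20, 24), rep(40, 24))
)

# Assumed threshold: 24-hour PM2.5 value from the 2005 WHO guidelines
who_daily_limit <- 25

# Average the hourly values per calendar day
hourly$day <- as.Date(hourly$dateLocal, tz = "UTC")
daily <- aggregate(value ~ day, data = hourly, FUN = mean)

# Flag the days whose mean exceeds the threshold
daily$above_limit <- daily$value > who_daily_limit
```

The same pattern would apply to PM10 with a different threshold.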

Sarajevo PM10 stations

WHO daily limit, PM10 in Sarajevo

Coyhaique vs. Ulaanbaatar

4th of July fireworks, seen via PM2.5

What can you do?

Learn R!

How to learn R

  • DataCamp, Coursera

  • Hadley Wickham’s books

  • Online tutorials

How to contribute to ropenaq

  • Report bugs!

  • Help build the documentation.

  • Show me example figures, so I can add how-tos to the documentation.

Thank you!