1 Introduction

1.1 Welcome

This is a brief newsletter from the openair project. The current project web pages will soon be updated (currently hosted by King’s College London here http://www.openair-project.org/). The openair website provides downloads to previous newsletters, however from now on these will be published as html documents on RPubs.

This newsletter covers some of new developments and changes to the openair R package. In common with previous versions of openair, this version brings some new functionality, bug fixes and refinements.

openair development has moved to Github for version control and to allow users to contribute, ask questions and report problems. See below for more information. The Github web pages also provide a link to the comprehensive openair manual.

This newsletter was produced using R version 3.2.1 and openair version 1.6. To update to the most recent version of openair, type update.packages() and make sure you are using the most recent version of R. If you have difficulty, start a new R session (please use the most recent version of R — 3.2.1 and type install.packages("openair", dependencies = TRUE).

1.2 Development

The development of openair has now moved to Github. The main page for openair is https://github.com/davidcarslaw/openair. Github (and specifically git) are used for version control. Together they work very well together making it much easier to develop R packages and importantly make it possible for others to contribute.

One of the advantages of Github is that it has an Issues page where bugs, suggestions or questions can be asked (see top right of the web page).

The Issues page of Github will now be the main way in which to ask questions, raise bug reports and make suggestions in openair

The Issues page can be found here https://github.com/davidcarslaw/openair/issues.

One of the advantages of this approach is that all users can see the issues raised and get a clear idea of if and when they will be dealt with.

For those familiar with Github and R, please feel free to contribute!

1.3 Citation information

The main citation for the openair package is:

Carslaw, D.C. and K. Ropkins, (2012). openair — an R package for air quality data analysis. Environmental Modelling & Software. Volume 27–28, 52–61.

Details of how to cite the manual are given in the manual.

2 Windflow plots

Several users requested the ability to show indications of wind speed and direction on plots. This is now possible in scatterPlot and timePlot.

If wind speed (ws) and wind direction (wd) are available they can be used in plots and shown as ‘wind vectors’. Plotting data in this way conveys more information in an easy to understand way, which works best for relatively short time periods e.g. a pollution episode lasting a few days. As an example below shows the first 48 hours of NOx and NO2 data with wind arrows shown. The arrows are controlled by a list of option that control the length, shape and colour of the arrows. The maximum length of the arrow plotted is a fraction of the plot dimension with the longest arrow being scale of the plot x-y dimension. Note, if the plot size is adjusted manually by the user it should be re-plotted to ensure the correct wind angle. The list may contain other options to panel.arrows in the lattice package. Other useful options include length, which controls the length of the arrow head and angle, which controls the angle of the arrow head.

First we need to load the openair package:

library(openair)
## Loading required package: lazyeval
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Loading required package: maps
timePlot(head(mydata, 48), pollutant = c("nox", "no2"), 
         windflow = list(scale = 0.1, lwd = 2, col = "pink"), 
         lwd = 3, group = FALSE, 
         ylab = "concentration (ug/m3)")

The Figure below shows an example of using the windflow option in scatterPlot. The Figure also sets many other options including showing the concentration of O3 as a colour, setting the colour scale used and selecting a few days of interest using the selectByDate function. The plot shows that when the wind direction changes to northerly, the concentration of NO2 decreases and that of O increases.

scatterPlot(selectByDate(mydata, start = "1/6/2001", end = "5/6/2001"), 
            x = "date", y = "no2", z = "o3", 
            col = "increment", 
            windflow = list(scale = 0.15), 
            key.footer = "o3\n (ppb)", main = NULL, ylab = "no2 (ppb)")

3 Normalise in pollutionRose

It is sometimes useful to more clearly understand the contributions from wind directions that have low frequencies. For example, for a pollution rose of SO2 there are few occurrences of easterly winds making it difficult to see how the concentration intervals are made up. Try:

pollutionRose(mydata, pollutant = "so2", seg = 1)

However, each wind sector can be normalised to give a probability between 0 and 1 to help show the variation within each wind sector more clearly. An example is shown below where for easterly winds it is now clearer that a greater proportion of the time the concentration is made up of high SO2 concentrations. In this plot each wind sector is scaled between 0 and 1. Also shown with a black like is an indication of the wind direction frequency to remind us that winds from the east occur with a low frequency.

pollutionRose(mydata, pollutant = "so2", normalise = TRUE, seg = 1)

4 Grouping in TaylorDiagram

Model evaluation is often concerned with evaluating different models at different receptors. Using the example in the openair manual (data loaded below) it is straightforward to compare model performance at one receptor.

First, load some data:

load("~/openair/Data/modelData.RData")
head(modTest)
##         site                date o3   mod   group
## 1 Aston.Hill 2006-01-01 00:00:00 NA    NA model 1
## 2 Aston.Hill 2006-01-01 01:00:00 74 65.28 model 1
## 3 Aston.Hill 2006-01-01 02:00:00 72 64.64 model 1
## 4 Aston.Hill 2006-01-01 03:00:00 72 64.46 model 1
## 5 Aston.Hill 2006-01-01 04:00:00 70 64.88 model 1
## 6 Aston.Hill 2006-01-01 05:00:00 66 65.80 model 1

For example, to compare model performance at one receptor for multiple models:

TaylorDiagram(subset(modTest, site == "Harwell"), obs = "o3", mod = "mod",
              group = "group")

It is also possible to combine different groups of model results. For example, rather than plot how the models perform at a single site it can be useful to show how they compare at all sites. To do this it is necessary to normalise the data because there will be different values of the observed variable across different sites. In this case we can supply the option group = c("group", "site"). This will show the variation by model for all sites. The results are shown in below. These results show that in general models tend to predict in similarly good (or bad) ways across all sites as shown by the grouping of points on below.

TaylorDiagram(modTest, obs = "o3", mod = "mod", group = c("group", "site"), 
              normalise = TRUE, cex = 1, pch = c(15:19, 15:18))

5 Other openair developments

5.1 Changes in openair 1.6

  • Don’t force integer results for importAURN
  • fix bug with period = "months" in summaryPlot
  • allow linear fit with method = "hexbin" in scatterPlot
  • allow users to define own map limits in trajectory functions and better scaled map grids
  • fix renaming bug in airbaseStats
  • bug in windRose when all calm
  • date bug fixes in scatterPlot
  • add option windflow to scatterPlot and timePlot to allow wind flow plots
  • suppress harmless warnings in smoothTrend
  • fix bug in some functions when type more than one
  • add support in pollutionRose for option normalise to show probability by wind sector (0 to 1).
  • timeAverage now has an option type similar to other functions. A common use would be to apply timeAverage to a data frame with multiple sites where there is a column representing site name e.g. type = "site".
  • Add receptor location for trajectory plots trajLevel and trajPlot.
  • Add an option trend to TheilSen to control how the trend lines are drawn.
  • fix bug in calendarPlot when partial month available
  • fix bug in calendarPlot, don’t need to cut data first
  • add npoints option to trajPlot to control time spacing of dots shown on back trajectories
  • don’t include missings when statistic = "frequency in timeAverage
  • fix bug in timeProp due to point above
  • fix bug in timeVariation with type = "season" when space in pollutant name

5.2 Brief examples of other changes

An example of the change to TheilSen is shown below where the user can now provide a trend list describing the types of line to use.

TheilSen(mydata, pollutant = "o3", deseason = TRUE, 
         trend = list(lwd = c(4, 2), 
                      lty = c(1, 3), 
                      col = c("darkorange", "red")),
         ylab = "o3 (ug/m3)")
## [1] "Taking bootstrap samples. Please wait."

It is now possible to add a linear trend line to scatterPlot when method = "hexbin".

scatterPlot(mydata, x = "nox", y = "pm25", method = "hexbin", linear = TRUE,
            col = "increment", type = "weekend", ylim = c(0, 150))

Users can now set specific latitude/longitude limits in trajectory plotting functions using xlim and ylim.

## import some data
traj <- importTraj(site = "london", year = 2009)
trajPlot(selectByDate(traj, start = "20/1/2009", end = "31/1/2009"), 
         method = "density", col = "increment", xlim = c(-40, 20))
## (loaded the KernSmooth namespace)