by Mark R Payne
DTU-Aqua, Kgs. Lyngby, Denmark
http://www.staff.dtu.dk/mpay

Fri Feb 9 11:08:40 2018


This notebook describes how to extract pointwise data from a NetCDF file using the raster package. As the working example, we use the HadISST v1.1 dataset, which is available from the UK Met Office’s Hadley Centre here: https://www.metoffice.gov.uk/hadobs/hadisst/

To get started, download the NetCDF version of the file (click on the “Download Data” button, then scroll all the way to the bottom of the next page) and decompress the file (e.g. using WinZip or 7-Zip on windows). You will also need to make sure that you have the “raster”" package installed, and potentially also the “ncdf4” as well, which you can install as follows:

install.packages("raster")
install.packages("ncdf4")

Setting up the raster brick

The raster package has three basic types of data structures:

Raster can also support higher dimensional data, up to around four dimensions, and can interface to high dimensionality data by focussing on one aspect of it at a time.

In this case we want a brick, which we create as follows

library(raster)
## Loading required package: sp
fname <- "~/Documents/common_data/HadISST_sst.nc" #You will need to specify your location here
HadISST.b <- brick(fname)  
## Loading required namespace: ncdf4

Lets have a look and see what it contains:

HadISST.b
## class       : RasterBrick 
## dimensions  : 180, 360, 64800, 1776  (nrow, ncol, ncell, nlayers)
## resolution  : 1, 1  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## data source : /home/mpayne/Documents/common_data/HadISST_sst.nc 
## names       : X1870.01.16, X1870.02.14, X1870.03.16, X1870.04.15, X1870.05.16, X1870.06.16, X1870.07.16, X1870.08.16, X1870.09.16, X1870.10.16, X1870.11.16, X1870.12.16, X1871.01.16, X1871.02.15, X1871.03.16, ... 
## Date        : 1870-01-16, 2017-12-16 (min, max)
## varname     : sst

So from this we can see that we have a 1x1 degree resolution grid, that runs from -180 to +180 in the longtitudinal direction, from -90 to 90 in the latitudinal i.e. it’s global. The time steps are stored as “Date” and run from 1870-01-16 to 2017-12-16. We have have a look at the contents using plot

plot(HadISST.b)

which makes for lots of maps - one for each month. We can focus on a single month like so:

plot(HadISST.b[[1]])

But where is the data?

Filtering out values

There is clearly something strange with this - there is no texture in the plot. The reason is because this data set uses -100 to indicate missing values i.e. land or an absence of data. So the first logical thing to do is to conver these to NAs, so that they aren’t plotted. This works much like it would with a standard matrix:

#For simplicity, we'll just work with a single layer
r <- HadISST.b[[1776]]
r[r < -300] <- NA

#Now lets plot it.
plot(r)

That looks a lot better!! Now we can see, for example, that the tropics are hot and the poles are cold. Who would have though?!!

Subsetting

Next step is to focus things down on an area of interest. The “crop()” function is your friend here:

#An extent object defines a box .e.g a region of interest
ROI <- extent(-20,20,45,65)
ROI
## class       : Extent 
## xmin        : -20 
## xmax        : 20 
## ymin        : 45 
## ymax        : 65
#We can use this ROI to crop the raster down to the area that we are interested in - in this case Europe
r.crop <- crop(r,ROI)
plot(r.crop)

Extraction

Next, we are interested in extracting the data - “extract()”" is your friend here. As an example, lets create a series of points across the North Sea, where we want to take a transect:

lon.pts <- seq(-20,20,by=0.5)
lat.pts <- rep(55,length(lon.pts))
plot(r.crop)
points(lon.pts,lat.pts,pch=4,col="red")

Then we can simply use the extract function to take the values out by linear interpolation:

extract.pts <- cbind(lon.pts,lat.pts)
ext <- extract(r.crop,extract.pts,method="bilinear")
ext
##  [1] 10.667289 10.715486 10.763683 10.811880 10.862789 10.913699 10.977364
##  [8] 11.041029 11.129985 11.218941 11.253525 11.288109 11.267390 11.246671
## [15] 11.200770 11.154869 11.087377 11.019885 10.966825 10.913765 10.860334
## [22] 10.806902 10.724209 10.641516 10.525039 10.408563 10.332603 10.256643
## [29] 10.450501 10.644360 10.754722 10.865085 10.756175 10.647264        NA
## [36]        NA        NA  8.870197  8.888687  8.907177  8.894884  8.882591
## [43]  8.887417  8.892243  8.927680  8.963117  8.963781  8.964445  8.913791
## [50]  8.863136  8.777451  8.691766  8.520755  8.349744  8.031960  7.714177
## [57]  7.332975  6.951773  6.615466  6.279159  6.304304  6.329449  6.435112
## [64]  6.540776  6.638148  6.735520  6.564049  6.392578  6.270893  6.149207
## [71]  6.037621  5.926036  5.861358  5.796680  5.747024  5.697369  5.748403
## [78]  5.799438  5.885521  5.971605        NA

Lets see how this looks

plot(lon.pts,ext,type="b",pch=2,xlab="Longitude",ylab="SST")

The extract function also works with bricks as well, so that you can extract from multiple layers at the same time e.g.

b <- HadISST.b[[1765:1776]]  #Most recent year
ann.extract <- extract(b,extract.pts,method="bilinear")
head(ann.extract)
##      X2017.01.16 X2017.02.16 X2017.03.16 X2017.04.16 X2017.05.16
## [1,]    9.945716    9.862325    9.735572    10.03991    10.98126
## [2,]   10.055702    9.968779    9.820142    10.13135    11.08461
## [3,]   10.164053   10.060025    9.889266    10.20940    11.16395
## [4,]   10.272404   10.151272    9.958390    10.28745    11.24329
## [5,]   10.376555   10.229850   10.011624    10.34632    11.29270
## [6,]   10.480707   10.308427   10.064857    10.40519    11.34211
##      X2017.06.16 X2017.07.16 X2017.08.16 X2017.09.16 X2017.10.16
## [1,]    12.45001    13.28287    14.13964    13.55118    12.16586
## [2,]    12.50158    13.33351    14.22531    13.61122    12.25972
## [3,]    12.52620    13.39160    14.27519    13.65796    12.32673
## [4,]    12.55083    13.44968    14.32507    13.70470    12.39374
## [5,]    12.54530    13.51821    14.34472    13.74159    12.43483
## [6,]    12.53977    13.58673    14.36437    13.77847    12.47592
##      X2017.11.16 X2017.12.16
## [1,]    11.08448    10.67512
## [2,]    11.14846    10.71549
## [3,]    11.20402    10.76368
## [4,]    11.25958    10.81188
## [5,]    11.30733    10.86279
## [6,]    11.35509    10.91370
#Lets  make a nice plot - to have a legend though you would probably look at doing some fancy stuff with ggplot - but not here...
matplot(lon.pts,ann.extract,type="l",xlab="Longitude",ylab="SST")

That’s about all there is to it. The help functions in the “raster” package are highly recommended, particularly the overview page:

help("raster-package")

I also have some other tutorials that looked at more advanced, and relationed functions here: http://rpubs.com/markpayne


This work by Mark R Payne is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. For details, see http://creativecommons.org/licenses/by-nc-sa/3.0/deed.en_US Basically, this means that you are free to “share” and “remix” for non-commerical purposes as you see fit, so long as you “attribute” me for my contribution. Derivatives can be distributed under the same or similar license.

This work comes with ABSOLUTELY NO WARRANTY or support.