date()
## [1] "Thu Oct 04 14:16:37 2012"
#### Install packages and required R scripts
require(sp)
require(rgdal)
require(maps)
require(maptools)
require(colorRamps)
require(grid)
require(RColorBrewer)
require(pROC)
require(ggplot2)
require(plyr)
source("track_hexagons.R")
Following recent improvements in model resolution and physics, global climate models (GCMs) are now being employed to study how tropical cyclone (TC) frequency and intensity may change in the future (Sugi et al. 2002; Bengtsson et al. 2007; Gualdi et al. 2008; Zhao et al. 2009). TCs can be costly events both in terms of loss of life and property. Therefore it is important to understand how these storms may be affected by a changing climate. GCMs have the potential to be valuable tools in this area of research, but before they are used to predict future TC attributes, it is necessary to understand how well they represent reality. How well do model-generated TCs match observed TCs with respect to intensity, frequency, and spatial distribution?
After several earlier attempts at resolving TC-like vortices in coarse resolution GCMs (Manabe 1970; Bengtsson et al. 1982; Haarsma 1993; Bengtsson et al. 1995), enhanced computing capabilities and physical parameterizations have resulted in improved representations of tropical cyclones in models (Vitart et al. 1999; Sugi et al. 2002; Oouchi et al. 2006; Bengtsson et al. 2007; Walsh et al. 2007; LaRow et al. 2008; Zhao et al. 2009). Although these finer resolution GCMs have been able to generate warm core vortices that resemble TCs, the resolution and physics still fall short in some respects. Most notably, the modeled systems are still unable to attain intensities of observed TCs (Emanuel et al. 2008). In fact, Chen et al. (2007) use mesoscale models to demonstrate that a grid spacing of ~1 km may be necessary to resolve the hurricane eye and eyewall. Despite these shortcomings, recent studies reveal some promise in the ability of GCMs to reproduce global TC statistics such as storm counts and seasonal cycle (Camargo et al. 2005; Zhao et al. 2009). For example, Zhao et al. (2009) run a global climate model of 50 km horizontal resolution over the period 1981–2005 and find that overall storm counts match reasonably well with observations, although there is some overprediction in the south and west Pacific and underprediction in the east Pacific.
As resolution and physics continue to improve and new models are developed, intercomparison analyses are necessary to gain insight into model strengths and weaknesses with respect to TCs. For example, Camargo et al. (2005) examine genesis location, TC counts, intensity, and storm lifetime in a statistical analysis of TC-like vortices in three low-resolution GCMs. Their analysis reveals that even these lower-resolution models are capable of producing some realistic global TC statistics.
Using an equal-area hexagon framework (described below), we focus here on spatial distributions of TCs in a comparison of the observed global TC climatology with climatological model runs from two different atmospheric GCMs, the Florida State University-Center for Ocean-Atmospheric Prediction Studies (FSU-COAPS) spectral model and the Geophysical Fluid Dynamics Laboratory (GFDL) High-Resolution Atmospheric Model (HiRAM). The method applied here provides a uniform framework with which we can easily compare numerous simulated storm attributes. The paper is organized as follows: the observational and model track data are presented in section two, followed by an explanation of the spatial framework. Global results for the GFDL model are presented first, followed by a comparison of both models for the North Atlantic Basin only.
Observational data used in this research come from the International Best Track Archive for Climate Stewardship (IBTrACS, available online at http://www.ncdc.noaa.gov/oa/ibtracs/) (Knapp et al. 2010). For computational purposes, the 6-hourly data have been interpolated to hourly data using the method outlined in Elsner and Jagger (2012). Although IBTrACS includes more than a century's worth of track data, we subset the data to the period 1982–2008, which matches the period over which the models were run. Additionally, because these data were obtained during the satellite era, we have more confidence in their accuracy.
Model-derived track data are obtained from a private website courtesy of the U.S. Climate Variability and Predictability Research Program (CLIVAR) Hurricane Working Group. We use data from two different atmospheric (non-coupled) GCMs. As with the observational data, the modeled track data are provided at 6-hourly intervals and have been interpolated to hourly intervals using the same method. The models and the methods applied by the modeling groups to detect and track TCs are described below.
We use model-derived track data from the GFDL HiRAM, version 2.2 (Zhao et al. 2012). The model data come from a control simulation forced with prescribed SSTs from the Hadley Centre Global Sea Ice and Sea Surface Temperature dataset (HadISST; Rayner et al. 2003). The model features a 50 km horizontal resolution and 32 vertical levels. As described by Zhao et al. (2009), TC-like vortices are detected and tracked using an algorithm similar to that used by Vitart et al. (2003). The algorithm searches for a coinciding (within 2 degrees latitude and longitude) relative vorticity maximum at 850 hPa, a sea-level pressure minimum, and a maximum in the 300-500 hPa averaged temperature field. The vortex trajectories are considered TC tracks when the modeled maximum surface winds exceed 15.2 m/s during at least 2 (not necessarily consecutive) days (Zhao et al. 2009). We use track data from three separate runs of the HiRAM, referred to here as r1, r2, and r3.
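As an illustration only, the duration part of this criterion can be written as a short R function. This is a hypothetical sketch (is.tc.track, wind.thresh, and min.days are invented names), not the tracking code used by the modeling groups.
# Hypothetical sketch of the duration criterion: given a data frame of 6-hourly
# track points with columns Yr, Mo, Da, and Wmax (m/s), test whether the maximum
# surface wind exceeds the threshold on at least two (not necessarily
# consecutive) days.
is.tc.track = function(track, wind.thresh = 15.2, min.days = 2) {
    strong = subset(track, Wmax > wind.thresh)
    ndays = length(unique(paste(strong$Yr, strong$Mo, strong$Da)))
    ndays >= min.days
}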
In addition to the HiRAM track data, we also consider tracks derived from the FSU-COAPS global spectral model (Cocke and LaRow 2000; LaRow et al. 2008). As with the GFDL HiRAM, the FSU-COAPS model is not coupled and is forced with prescribed SSTs from the HadISST dataset. The spectral model has 27 vertical levels and a T126 horizontal resolution, which corresponds to roughly 0.94 degrees. The simulated TC tracks were also obtained using an algorithm based on Vitart et al. (2003). Because global track data are not yet available for this model, FSU-COAPS tracks are compared with observations and HiRAM tracks for the North Atlantic Basin only. Again, we use tracks from three model runs.
For the analysis, we convert both observational and model maximum wind speed measurements to meters per second. Additionally, we consider only those observations that exceed 17 m/s, which corresponds to the 43rd percentile for the global dataset. Because the distributions of maximum wind speed for the observed and modeled tracks appear to be similarly shaped, only those modeled track points with maximum wind speeds exceeding the 43rd percentile are used in the global analysis. This corresponds to 11.3 m/s, 11.1 m/s, and 11.4 m/s for the GFDL HiRAM runs r1, r2, and r3, respectively. For the spatial analysis, the data locations are projected onto a planar coordinate system using a Mollweide projection.
load("C:/Documents and Settings/sstrazzo/My Documents/Research/world.interp.v02r01.R")
# load('/Users/jelsner/Desktop/Misc Data/world.interp.v02r01.R')
ib.df = subset(world.interp, Yr > 1981 & Yr < 2009)
ib.df$Wmax = ib.df$Wmax * 0.5144
percentile = length(ib.df$Wmax[ib.df$Wmax <= 17])/length(ib.df$Wmax)
ib.df = subset(ib.df, Wmax >= 17)
ch.i = nrow(ib.df)
rm(world.interp)
ib.sdf = ib.df[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax",
"DWmaxDt")]
coordinates(ib.sdf) = c("lon", "lat")
ll = "+proj=longlat +ellps=WGS84"
proj4string(ib.sdf) = CRS(ll)
moll = "+proj=moll +lon_0=150 +units=km"
ib.sdf = spTransform(ib.sdf, CRS(moll))
The spatial framework used here comes from Elsner et al. (2012). We first define a common grid of equal area hexagons. Each hexagon has an area of approximately 727.9 thousand square kilometers, which is sufficiently small to capture regional variability. Once the hexagonal grid is defined, we overlay the observational and model track data (that is, we populate the grid with track attribute data).
hpt.i = spsample(ib.sdf, type = "hexagonal", n = 1000, bb = bbox(ib.sdf) * 1.08,
offset = c(0, 0))
hpg.i = HexPoints2SpatialPolygons(hpt.i)
Extract the number of hexagons and the area of each hexagon.
np.i = length(hpg.i@polygons)
area.i = hpg.i@polygons[[1]]@area
area.i
## [1] 727859
The area is in square kilometers.
As we are interested in comparing observed and modeled spatial distributions of storms, we calculate cyclone counts within the hexagonal grid framework. Cyclone counts represent the number of storms that “passed through” each hexagon in the spatial grid. For example, a hexagon centered in the western North Pacific may contain hundreds of storms for the period 1982–2008. Figure 1 illustrates the hexagonal framework for the year 2005 in the Gulf of Mexico. We sum the number of cyclones contained in each hexagon and store this information in a polygon data frame for further calculations.
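The gulf.counts helper used in the next code chunk comes from track_hexagons.R (sourced above) and is not listed here. A minimal sketch of the per-hexagon counting it is assumed to perform, following the same over/ddply approach applied to the global grid below, is:
# Assumed sketch of gulf.counts (the actual function is in track_hexagons.R):
# overlay the track points on the hexagons, count the unique storm identifiers
# falling in each hexagon, and return the counts as a SpatialPolygonsDataFrame.
gulf.counts = function(points.sdf, hexagons) {
    hexid = over(points.sdf, hexagons)
    counts = ddply(data.frame(Sid = points.sdf$Sid, hexid = hexid), .(hexid),
        summarize, tracks.count = length(unique(Sid)))
    counts = counts[!is.na(counts$hexid), ]
    rownames(counts) = paste("ID", counts$hexid, sep = "")
    SpatialPolygonsDataFrame(hexagons[rownames(counts)], counts, match.ID = TRUE)
}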
ib.gulf = subset(ib.df, Yr == 2005)
ib.gulf = subset(ib.gulf, lat > 20 & lat < 45)
ib.gulf = subset(ib.gulf, lon > -100 & lon < -80)
ib.Sgulf = ib.gulf[, c("lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax",
"DWmaxDt")]
coordinates(ib.Sgulf) = c("lon", "lat")
proj4string(ib.Sgulf) = CRS(ll)
moll2 = "+proj=moll +lon_0=340"
ib.Sgulf = spTransform(ib.Sgulf, CRS(moll2))
hpt.gulf = spsample(ib.Sgulf, type = "hexagonal", n = 250, offset = c(1, -1))
hpg.gulf = HexPoints2SpatialPolygons(hpt.gulf)
gulfHSPDF = gulf.counts(ib.Sgulf, hpg.gulf)
l1.gulf = list("sp.points", ib.Sgulf, pch = 20, col = "black", cex = 1)
cl.i = map("world", plot = FALSE, wrap = TRUE)
clp.i = map2SpatialLines(cl.i, proj4string = CRS(ll))
clp.gulf = spTransform(clp.i, CRS(moll2))
l2.gulf = list("sp.lines", clp.gulf, lwd = 2, col = "dark gray")
cr = brewer.pal(5, "Reds")
spplot(gulfHSPDF, "tracks.count", col = "white", col.regions = cr, sp.layout = list(l1.gulf,
l2.gulf), at = seq(0, 5, 1), xlim = bbox(gulfHSPDF)[1, ], ylim = bbox(gulfHSPDF)[2,
], colorkey = list(space = "bottom", labels = paste(seq(0, 5, 1))), sub = expression("Number of Cyclones"),
main = "Observed TCs in the Gulf of Mexico in 2005")
ib.hexid = over(ib.sdf, hpg.i)
ib.sdf$hexid = ib.hexid
ib.ch = ddply(ib.sdf@data, .(hexid), "nrow")
ib.cn = ddply(ib.sdf@data, .(hexid), summarize, cn = length(unique(Sid)))
id.obs = ib.cn$hexid # for difference mapping later
ID = paste("ID", ib.ch$hexid, sep = "")
ib.chn = data.frame(nhours = ib.ch$nrow, ncyclones = ib.cn$cn)
rownames(ib.chn) = ID
hspdf = SpatialPolygonsDataFrame(hpg.i[ID], ib.chn, match.ID = TRUE)
Using the GCM-generated track data, we follow the same steps from above to calculate per hexagon cyclone counts and cyclone hours.
setwd("~/Dropbox/GCMTrackData")
load("track.df1.RData")
gt.df1 = subset(track.df1, Yr > 1981 & Yr < 2009)
gt.df1$Wmax = gt.df1$Wmax * 0.5144
gt.df1 = subset(gt.df1, Wmax >= quantile(Wmax, percentile))
ch.g = nrow(gt.df1)
gt.sdf = gt.df1[, c("lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax", "DWmaxDt")]
coordinates(gt.sdf) = c("lon", "lat")
proj4string(gt.sdf) = CRS(ll)
gt.sdf = spTransform(gt.sdf, CRS(moll))
l1.g1 = list("sp.points", gt.sdf, pch = 20, col = "black", cex = 0.2)
gt.hexid = over(gt.sdf, hpg.i)
gt.sdf$hexid = gt.hexid
gt.ch = ddply(gt.sdf@data, .(hexid), "nrow")
gt.cn = ddply(gt.sdf@data, .(hexid), summarize, cn = length(unique(Sid)))
id.mod1 = gt.cn$hexid # for difference mapping later
ID = paste("ID", gt.ch$hexid, sep = "")
gt.chn = data.frame(nhours = gt.ch$nrow, ncyclones = gt.cn$cn)
rownames(gt.chn) = ID
hspdfHiRAMr1 = SpatialPolygonsDataFrame(hpg.i[ID], gt.chn, match.ID = TRUE)
rm(track.df1)
load("track.df2.RData")
gt.df2 = subset(track.df2, Yr > 1981 & Yr < 2009)
gt.df2$Wmax = gt.df2$Wmax * 0.5144
gt.df2 = subset(gt.df2, Wmax >= quantile(Wmax, percentile))
ch.g = nrow(gt.df2)
gt.sdf = gt.df2[, c("lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax", "DWmaxDt")]
coordinates(gt.sdf) = c("lon", "lat")
proj4string(gt.sdf) = CRS(ll)
gt.sdf = spTransform(gt.sdf, CRS(moll))
l1.g2 = list("sp.points", gt.sdf, pch = 20, col = "black", cex = 0.2)
gt.hexid = over(gt.sdf, hpg.i)
gt.sdf$hexid = gt.hexid
gt.ch = ddply(gt.sdf@data, .(hexid), "nrow")
gt.cn = ddply(gt.sdf@data, .(hexid), summarize, cn = length(unique(Sid)))
id.mod2 = gt.cn$hexid # for difference mapping later
ID = paste("ID", gt.ch$hexid, sep = "")
gt.chn = data.frame(nhours = gt.ch$nrow, ncyclones = gt.cn$cn)
rownames(gt.chn) = ID
hspdfHiRAMr2 = SpatialPolygonsDataFrame(hpg.i[ID], gt.chn, match.ID = TRUE)
rm(track.df2)
load("track.df3.RData")
gt.df3 = subset(track.df3, Yr > 1981 & Yr < 2009)
gt.df3$Wmax = gt.df3$Wmax * 0.5144
gt.df3 = subset(gt.df3, Wmax >= quantile(Wmax, percentile))
ch.g = nrow(gt.df3)
gt.sdf = gt.df3[, c("lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax", "DWmaxDt")]
coordinates(gt.sdf) = c("lon", "lat")
proj4string(gt.sdf) = CRS(ll)
gt.sdf = spTransform(gt.sdf, CRS(moll))
l1.g3 = list("sp.points", gt.sdf, pch = 20, col = "black", cex = 0.2)
gt.hexid = over(gt.sdf, hpg.i)
gt.sdf$hexid = gt.hexid
gt.ch = ddply(gt.sdf@data, .(hexid), "nrow")
gt.cn = ddply(gt.sdf@data, .(hexid), summarize, cn = length(unique(Sid)))
id.mod3 = gt.cn$hexid # for difference mapping later
ID = paste("ID", gt.ch$hexid, sep = "")
gt.chn = data.frame(nhours = gt.ch$nrow, ncyclones = gt.cn$cn)
rownames(gt.chn) = ID
hspdfHiRAMr3 = SpatialPolygonsDataFrame(hpg.i[ID], gt.chn, match.ID = TRUE)
rm(track.df3)
With the hexagons set up for both observed and modeled tracks, we can visually compare them. First we map cyclone counts:
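For example, the observed counts in hspdf can be mapped as follows (an analogous call with hspdfHiRAMr1, hspdfHiRAMr2, or hspdfHiRAMr3 maps the modeled counts). This is a sketch only; the coastline layer names and the color breaks are illustrative choices, not necessarily those used for the published figures.
# Sketch of a global cyclone count map for the observed tracks; the breaks
# (0 to 300 by 50) are illustrative only.
clp.moll = spTransform(clp.i, CRS(moll))
l2.moll = list("sp.lines", clp.moll, lwd = 0.5, col = "dark gray")
cr2 = brewer.pal(6, "Reds")
spplot(hspdf, "ncyclones", col = "white", col.regions = cr2, sp.layout = list(l2.moll),
    at = seq(0, 300, 50), colorkey = list(space = "bottom", labels = paste(seq(0, 300, 50))),
    sub = expression("Number of Cyclones"), main = "Observed TC Counts, 1982-2008")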
To facilitate comparison of the observed and modeled TCs, we create difference maps. For cyclone count hexagons, this is accomplished by simply subtracting the values contained in the modeled storm count hexagons from those in the observed storm count hexagons.
# obs.count = data.frame(ID = id.obs, ncyclone=hspdf@data$ncyclones)
obs.count = cbind(id.obs, hspdf@data$ncyclones)
obs.count = data.frame(obs.count)
colnames(obs.count) = c("ID", "ncyclones")
mod.count1 = cbind(id.mod1, hspdfHiRAMr1$ncyclones)
mod.count1 = data.frame(mod.count1)
colnames(mod.count1) = c("ID", "ncyclones")
mod.count2 = cbind(id.mod2, hspdfHiRAMr2$ncyclones)
mod.count2 = data.frame(mod.count2)
colnames(mod.count2) = c("ID", "ncyclones")
mod.count3 = cbind(id.mod3, hspdfHiRAMr3$ncyclones)
mod.count3 = data.frame(mod.count3)
colnames(mod.count3) = c("ID", "ncyclones")
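# Map the per hexagon cyclone counts onto the full hexagonal grid; hexagons with
# no storms retain a count of zero (observed grid first, then each HiRAM run).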
allhex.obs = rep(0, length(hpg.i))
for (i in 1:length(hpg.i)) {
for (j in 1:length(obs.count[, 1])) {
if (obs.count$ID[j] == i)
(allhex.obs[i] = obs.count$ncyclones[j])
}
}
allhex.obs = data.frame(allhex.obs)
allhex.mod1 = rep(0, length(hpg.i))
for (i in 1:length(hpg.i)) {
for (j in 1:length(mod.count1[, 1])) {
if (mod.count1$ID[j] == i)
(allhex.mod1[i] = mod.count1$ncyclones[j])
}
}
allhex.mod1 = data.frame(allhex.mod1)
allhex.mod2 = rep(0, length(hpg.i))
for (i in 1:length(hpg.i)) {
for (j in 1:length(mod.count2[, 1])) {
if (mod.count2$ID[j] == i)
(allhex.mod2[i] = mod.count2$ncyclones[j])
}
}
allhex.mod2 = data.frame(allhex.mod2)
allhex.mod3 = rep(0, length(hpg.i))
for (i in 1:length(hpg.i)) {
for (j in 1:length(mod.count3[, 1])) {
if (mod.count3$ID[j] == i)
(allhex.mod3[i] = mod.count3$ncyclones[j])
}
}
allhex.mod3 = data.frame(allhex.mod3)
allhex.diff1 = allhex.obs - allhex.mod1
allhex.diff1 = cbind(allhex.diff1, allhex.obs, allhex.mod1)
colnames(allhex.diff1) = c("Diff", "OBS", "MODEL")
hex.diff1 = subset(allhex.diff1, allhex.diff1$Diff != 0 & allhex.diff1$OBS !=
allhex.diff1$MODEL)
rownames(hex.diff1) = paste("ID", rownames(hex.diff1), sep = "")
hspdfDiff1 = SpatialPolygonsDataFrame(hpg.i[rownames(hex.diff1)], hex.diff1,
match.ID = TRUE)
allhex.diff2 = allhex.obs - allhex.mod2
allhex.diff2 = cbind(allhex.diff2, allhex.obs, allhex.mod2)
colnames(allhex.diff2) = c("Diff", "OBS", "MODEL")
hex.diff2 = subset(allhex.diff2, allhex.diff2$Diff != 0 & allhex.diff2$OBS !=
allhex.diff2$MODEL)
rownames(hex.diff2) = paste("ID", rownames(hex.diff2), sep = "")
hspdfDiff2 = SpatialPolygonsDataFrame(hpg.i[rownames(hex.diff2)], hex.diff2,
match.ID = TRUE)
allhex.diff3 = allhex.obs - allhex.mod3
allhex.diff3 = cbind(allhex.diff3, allhex.obs, allhex.mod3)
colnames(allhex.diff3) = c("Diff", "OBS", "MODEL")
hex.diff3 = subset(allhex.diff3, allhex.diff3$Diff != 0 & allhex.diff3$OBS !=
allhex.diff3$MODEL)
rownames(hex.diff3) = paste("ID", rownames(hex.diff3), sep = "")
hspdfDiff3 = SpatialPolygonsDataFrame(hpg.i[rownames(hex.diff3)], hex.diff3,
match.ID = TRUE)
Now we can map the differences:
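A sketch of one such map, for the observed minus HiRAM r1 counts (hspdfDiff1), is shown below with a diverging palette; the breaks are illustrative, and the coastline layer l2.moll is the one defined in the count-map sketch above.
# Sketch of a difference map for HiRAM r1: positive values (red) mark hexagons
# with more observed than modeled cyclones; negative values (blue) the reverse.
# The breaks (-100 to 100 by 25) are illustrative only.
cr.diff = rev(brewer.pal(8, "RdBu"))
spplot(hspdfDiff1, "Diff", col = "white", col.regions = cr.diff, sp.layout = list(l2.moll),
    at = seq(-100, 100, 25), colorkey = list(space = "bottom", labels = paste(seq(-100, 100, 25))),
    sub = expression("Observed minus Modeled Cyclone Counts"),
    main = "Observed minus HiRAM r1 TC Counts, 1982-2008")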
Visually, these results agree well with Zhao et al. (2009). There is a large area of overprediction in the south and west Pacific and an area of underprediction in the east Pacific.
From the difference maps, we can qualitatively assess model performance. We see that in some regions the model tends to generate too many storms, while in others it generates fewer storms than are observed. To summarize model performance in a more quantitative manner, we use receiver operating characteristic (ROC) analysis. ROC analysis offers a means of viewing the tradeoff between the false alarm rate and the hit rate of the model, where a false alarm is the case in which the model predicts a storm but none actually occurs, and a hit is the case in which the model correctly predicts a storm. ….more coming later….
(NOTE: The ROC analysis relies on additional functions from track_hexagons.R: get.count.totals, which calculates per hexagon storm counts by year and stores the yearly per hexagon storm counts in a matrix, and get.auc, which calculates the area under the ROC curve (with 95% confidence intervals) for the yearly per hexagon storm counts.)
# source('track_hexagons.R')
# obs_counts = get.count.totals(ib.df, hpg.i)
# GFDL_counts1 = get.count.totals(gt.df1, hpg.i)
# GFDL_counts2 = get.count.totals(gt.df2, hpg.i)
# GFDL_counts3 = get.count.totals(gt.df3, hpg.i)
# all_auc_r1 = get.auc(obs_counts, GFDL_counts1)
# all_auc_r1 = cbind(1982:2008, all_auc_r1)
# colnames(all_auc_r1) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc_r1, file='Gr1AUC_1000.txt')
# all_auc_r2 = get.auc(obs_counts, GFDL_counts2)
# all_auc_r2 = cbind(1982:2008, all_auc_r2)
# colnames(all_auc_r2) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc_r2, file='Gr2AUC_1000.txt')
# all_auc_r3 = get.auc(obs_counts, GFDL_counts3)
# all_auc_r3 = cbind(1982:2008, all_auc_r3)
# colnames(all_auc_r3) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc_r3, file='Gr3AUC_1000.txt')
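The get.auc function itself lives in track_hexagons.R. A minimal sketch of the yearly calculation it is assumed to perform, using the pROC package loaded above, is given below; the name get.auc.sketch and the matrix layout (years in rows, hexagons in columns) are assumptions for illustration.
# Assumed sketch of the yearly AUC calculation: for each year, the observed
# per hexagon occurrence (count > 0) is the binary response and the modeled
# per hexagon count is the predictor; ci.auc() returns the lower 95% bound,
# the AUC, and the upper 95% bound.
get.auc.sketch = function(obs.counts, mod.counts) {
    t(sapply(1:nrow(obs.counts), function(i) {
        r = roc(response = as.numeric(obs.counts[i, ] > 0),
            predictor = as.numeric(mod.counts[i, ]))
        as.numeric(ci.auc(r))
    }))
}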
Because the results of the ROC analysis depend in part on the area of each cell within the hexagonal lattice, we test the sensitivity of these results by varying the resolution of the hexagonal grid, that is, the number and size of the hexagons. The initial grid is obtained from a spatial sampling algorithm with n = 1000 sampling points, which yields 882 equal-area hexagons. We now sample the data using n = 500, n = 750, n = 1500, and n = 2000 points and compare the results.
# hex500.pts = spsample(ib.sdf, type='hexagonal', n=500, bb=bbox(ib.sdf)*1.08, offset=c(0, 0))
# hex500 = HexPoints2SpatialPolygons(hex500.pts)
# obs_counts500 = get.count.totals(ib.df, hex500)
# GFDL_counts500 = get.count.totals(gt.df1, hex500)
# all_auc500 = get.auc(obs_counts500, GFDL_counts500)
# all_auc500 = cbind(1982:2008, all_auc500)
# colnames(all_auc500) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc500, file='Gr1AUC_500.txt')
# hex750.pts = spsample(ib.sdf, type='hexagonal', n=750, bb=bbox(ib.sdf)*1.08, offset=c(0, 0))
# hex750 = HexPoints2SpatialPolygons(hex750.pts)
# obs_counts750 = get.count.totals(ib.df, hex750)
# GFDL_counts750 = get.count.totals(gt.df1, hex750)
# all_auc750 = get.auc(obs_counts750, GFDL_counts750)
# all_auc750 = cbind(1982:2008, all_auc750)
# colnames(all_auc750) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc750, file='Gr1AUC_750.txt')
# hex1500.pts = spsample(ib.sdf, type='hexagonal', n=1500, bb=bbox(ib.sdf)*1.08, offset=c(0, 0))
# hex1500 = HexPoints2SpatialPolygons(hex1500.pts)
# obs_counts1500 = get.count.totals(ib.df, hex1500)
# GFDL_counts1500 = get.count.totals(gt.df1, hex1500)
# all_auc1500 = get.auc(obs_counts1500, GFDL_counts1500)
# all_auc1500 = cbind(1982:2008, all_auc1500)
# colnames(all_auc1500) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc1500, file='Gr1AUC_1500.txt')
# hex2000.pts = spsample(ib.sdf, type='hexagonal', n=2000, bb=bbox(ib.sdf)*1.08, offset=c(0, 0))
# hex2000 = HexPoints2SpatialPolygons(hex2000.pts)
# obs_counts2000 = get.count.totals(ib.df, hex2000)
# GFDL_counts2000 = get.count.totals(gt.df1, hex2000)
# all_auc2000 = get.auc(obs_counts2000, GFDL_counts2000)
# all_auc2000 = cbind(1982:2008, all_auc2000)
# colnames(all_auc2000) = c('year', 'ci_lo', 'auc', 'ci_hi')
# write.table(all_auc2000, file='Gr1AUC_2000.txt')
We next compare the GFDL HiRAM simulated tracks with those generated from the FSU-COAPS spectral model. Because global tracks are not yet available for the FSU-COAPS model, we focus our comparison on the North Atlantic Basin.
load("C:/Documents and Settings/sstrazzo/My Documents/Research/world.interp.v02r01.R")
# load('G:Research/world.interp.v02r01.R')
ib.atl = subset(world.interp, Yr > 1981 & Yr < 2009)
rm(world.interp)
ib.atl = subset(ib.atl, lat > 0 & lat < 65)
ib.atl = subset(ib.atl, lon > -102 & lon < -4)
ib.atl$Wmax = ib.atl$Wmax * 0.5144
percentile.atl = length(ib.atl$Wmax[ib.atl$Wmax <= 17])/length(ib.atl$Wmax)
ib.atl = subset(ib.atl, Wmax >= 17)
ib.sATL = ib.atl[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax",
"DWmaxDt")]
coordinates(ib.sATL) = c("lon", "lat")
ll = "+proj=longlat +ellps=WGS84"
proj4string(ib.sATL) = CRS(ll)
moll.atl = "+proj=moll +lon_0=360 +units=km"
ib.sATL = spTransform(ib.sATL, CRS(moll.atl))
hpt.atl = spsample(ib.sATL, type = "hexagonal", n = 500, bb = bbox(ib.sATL) *
1.08, offset = c(0, 0), iter = 100)
hpg.atl = HexPoints2SpatialPolygons(hpt.atl)
ib.idATL = over(ib.sATL, hpg.atl)
ib.sATL$hexid = ib.idATL
ib.ch = ddply(ib.sATL@data, .(hexid), "nrow")
ib.ch = ib.ch[!is.na(ib.ch$hexid), ]
ib.cn = ddply(ib.sATL@data, .(hexid), summarize, cn = length(unique(Sid)))
ib.cn = ib.cn[!is.na(ib.cn$hexid), ]
id.obsATL = ib.cn$hexid # for difference mapping later
ID = paste("ID", ib.ch$hexid, sep = "")
ib.chn = data.frame(nhours = ib.ch$nrow, ncyclones = ib.cn$cn)
rownames(ib.chn) = ID
hspdf.atl = SpatialPolygonsDataFrame(hpg.atl[ID], ib.chn, match.ID = TRUE)
## repeat above for FSU tracks (r1, r2, & r3)
load("fsu_r1i1p1_ATL.RData")
fsu.r1 = fsuTracks.df
rm(fsuTracks.df)
fsu.r1 = subset(fsu.r1, Yr > 1981 & Yr < 2009)
fsu.r1$Wmax = fsu.r1$Wmax * 0.5144
fsu.r1 = subset(fsu.r1, Wmax >= quantile(Wmax, percentile.atl))
fsu.sdf1 = fsu.r1[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax",
"DWmaxDt")]
coordinates(fsu.sdf1) = c("lon", "lat")
proj4string(fsu.sdf1) = CRS(ll)
fsu.sdf1 = spTransform(fsu.sdf1, CRS(moll.atl))
fsu.hexid1 = over(fsu.sdf1, hpg.atl)
fsu.sdf1$hexid = fsu.hexid1
fsu.ch = ddply(fsu.sdf1@data, .(hexid), "nrow")
fsu.ch = fsu.ch[!is.na(fsu.ch$hexid), ]
fsu.cn = ddply(fsu.sdf1@data, .(hexid), summarize, cn = length(unique(Sid)))
fsu.cn = fsu.cn[!is.na(fsu.cn$hexid), ]
id.fsu1 = fsu.cn$hexid # for difference mapping later
ID = paste("ID", fsu.ch$hexid, sep = "")
fsu.chn = data.frame(nhours = fsu.ch$nrow, ncyclones = fsu.cn$cn)
rownames(fsu.chn) = ID
hspdf.fsu1 = SpatialPolygonsDataFrame(hpg.atl[ID], fsu.chn, match.ID = TRUE)
## repeat above for FSU tracks (r1, r2, & r3)
load("fsu_r2i1p1_ATL.RData")
fsu.r2 = fsuTracks.df
rm(fsuTracks.df)
fsu.r2 = subset(fsu.r2, Yr > 1981 & Yr < 2009)
fsu.r2$Wmax = fsu.r2$Wmax * 0.5144
fsu.r2 = subset(fsu.r2, Wmax >= quantile(Wmax, percentile.atl))
fsu.sdf2 = fsu.r2[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax",
"DWmaxDt")]
coordinates(fsu.sdf2) = c("lon", "lat")
proj4string(fsu.sdf2) = CRS(ll)
fsu.sdf2 = spTransform(fsu.sdf2, CRS(moll.atl))
fsu.hexid2 = over(fsu.sdf2, hpg.atl)
fsu.sdf2$hexid = fsu.hexid2
fsu.ch = ddply(fsu.sdf2@data, .(hexid), "nrow")
fsu.ch = fsu.ch[!is.na(fsu.ch$hexid), ]
fsu.cn = ddply(fsu.sdf2@data, .(hexid), summarize, cn = length(unique(Sid)))
fsu.cn = fsu.cn[!is.na(fsu.cn$hexid), ]
id.fsu2 = fsu.cn$hexid # for difference mapping later
ID = paste("ID", fsu.ch$hexid, sep = "")
fsu.chn = data.frame(nhours = fsu.ch$nrow, ncyclones = fsu.cn$cn)
rownames(fsu.chn) = ID
hspdf.fsu2 = SpatialPolygonsDataFrame(hpg.atl[ID], fsu.chn, match.ID = TRUE)
## repeat above for FSU tracks (r1, r2, & r3)
load("fsu_r3i1p1_ATL.RData")
fsu.r3 = fsuTracks.df
rm(fsuTracks.df)
fsu.r3 = subset(fsu.r3, Yr > 1981 & Yr < 2009)
fsu.r3$Wmax = fsu.r3$Wmax * 0.5144
fsu.r3 = subset(fsu.r3, Wmax >= quantile(Wmax, percentile.atl))
fsu.sdf3 = fsu.r3[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da", "Wmax",
"DWmaxDt")]
coordinates(fsu.sdf3) = c("lon", "lat")
proj4string(fsu.sdf3) = CRS(ll)
fsu.sdf3 = spTransform(fsu.sdf3, CRS(moll.atl))
fsu.hexid3 = over(fsu.sdf3, hpg.atl)
fsu.sdf3$hexid = fsu.hexid3
fsu.ch = ddply(fsu.sdf3@data, .(hexid), "nrow")
fsu.ch = fsu.ch[!is.na(fsu.ch$hexid), ]
fsu.cn = ddply(fsu.sdf3@data, .(hexid), summarize, cn = length(unique(Sid)))
fsu.cn = fsu.cn[!is.na(fsu.cn$hexid), ]
id.fsu3 = fsu.cn$hexid # for difference mapping later
ID = paste("ID", fsu.ch$hexid, sep = "")
fsu.chn = data.frame(nhours = fsu.ch$nrow, ncyclones = fsu.cn$cn)
rownames(fsu.chn) = ID
hspdf.fsu3 = SpatialPolygonsDataFrame(hpg.atl[ID], fsu.chn, match.ID = TRUE)
## repeat above for GFDL tracks (r1, r2, & r3)
load("gfdlTracks_r1.df.RData")
gfdl.na1 = natltracks.df1
rm(natltracks.df1)
gfdl.na1 = subset(gfdl.na1, Yr > 1981 & Yr < 2009)
gfdl.na1$Wmax = gfdl.na1$Wmax * 0.5144
gfdl.na1 = subset(gfdl.na1, Wmax >= quantile(Wmax, percentile.atl))
gfdl.sna1 = gfdl.na1[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da",
"Wmax", "DWmaxDt")]
coordinates(gfdl.sna1) = c("lon", "lat")
proj4string(gfdl.sna1) = CRS(ll)
gfdl.sna1 = spTransform(gfdl.sna1, CRS(moll.atl))
gfdl.hexid1 = over(gfdl.sna1, hpg.atl)
gfdl.sna1$hexid = gfdl.hexid1
gfdl.ch = ddply(gfdl.sna1@data, .(hexid), "nrow")
gfdl.ch = gfdl.ch[!is.na(gfdl.ch$hexid), ]
gfdl.cn = ddply(gfdl.sna1@data, .(hexid), summarize, cn = length(unique(Sid)))
gfdl.cn = gfdl.cn[!is.na(gfdl.cn$hexid), ]
id.gfdl1 = gfdl.cn$hexid # for difference mapping later
ID = paste("ID", gfdl.ch$hexid, sep = "")
gfdl.chn = data.frame(nhours = gfdl.ch$nrow, ncyclones = gfdl.cn$cn)
rownames(gfdl.chn) = ID
hspdf.na.gfdl1 = SpatialPolygonsDataFrame(hpg.atl[ID], gfdl.chn, match.ID = TRUE)
## repeat above for GFDL tracks (r1, r2, & r3)
load("gfdlTracks_r2.df.RData")
gfdl.na2 = natltracks.df1
rm(natltracks.df1)
gfdl.na2 = subset(gfdl.na2, Yr > 1981 & Yr < 2009)
gfdl.na2$Wmax = gfdl.na2$Wmax * 0.5144
gfdl.na2 = subset(gfdl.na2, Wmax >= quantile(Wmax, percentile.atl))
gfdl.sna2 = gfdl.na2[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da",
"Wmax", "DWmaxDt")]
coordinates(gfdl.sna2) = c("lon", "lat")
proj4string(gfdl.sna2) = CRS(ll)
gfdl.sna2 = spTransform(gfdl.sna2, CRS(moll.atl))
gfdl.hexid2 = over(gfdl.sna2, hpg.atl)
gfdl.sna2$hexid = gfdl.hexid2
gfdl.ch = ddply(gfdl.sna2@data, .(hexid), "nrow")
gfdl.ch = gfdl.ch[!is.na(gfdl.ch$hexid), ]
gfdl.cn = ddply(gfdl.sna2@data, .(hexid), summarize, cn = length(unique(Sid)))
gfdl.cn = gfdl.cn[!is.na(gfdl.cn$hexid), ]
id.gfdl2 = gfdl.cn$hexid # for difference mapping later
ID = paste("ID", gfdl.ch$hexid, sep = "")
gfdl.chn = data.frame(nhours = gfdl.ch$nrow, ncyclones = gfdl.cn$cn)
rownames(gfdl.chn) = ID
hspdf.na.gfdl2 = SpatialPolygonsDataFrame(hpg.atl[ID], gfdl.chn, match.ID = TRUE)
## repeat above for GFDL tracks (r1, r2, & r3)
load("gfdlTracks_r3.df.RData")
gfdl.na3 = natltracks.df1
rm(natltracks.df1)
gfdl.na3 = subset(gfdl.na3, Yr > 1981 & Yr < 2009)
gfdl.na3$Wmax = gfdl.na3$Wmax * 0.5144
gfdl.na3 = subset(gfdl.na3, Wmax >= quantile(Wmax, percentile.atl))
gfdl.sna3 = gfdl.na3[, c("name", "lon", "lat", "Sid", "Sn", "Yr", "Mo", "Da",
"Wmax", "DWmaxDt")]
coordinates(gfdl.sna3) = c("lon", "lat")
proj4string(gfdl.sna3) = CRS(ll)
gfdl.sna3 = spTransform(gfdl.sna3, CRS(moll.atl))
gfdl.hexid3 = over(gfdl.sna3, hpg.atl)
gfdl.sna3$hexid = gfdl.hexid3
gfdl.ch = ddply(gfdl.sna3@data, .(hexid), "nrow")
gfdl.ch = gfdl.ch[!is.na(gfdl.ch$hexid), ]
gfdl.cn = ddply(gfdl.sna3@data, .(hexid), summarize, cn = length(unique(Sid)))
gfdl.cn = gfdl.cn[!is.na(gfdl.cn$hexid), ]
id.gfdl3 = gfdl.cn$hexid # for difference mapping later
ID = paste("ID", gfdl.ch$hexid, sep = "")
gfdl.chn = data.frame(nhours = gfdl.ch$nrow, ncyclones = gfdl.cn$cn)
rownames(gfdl.chn) = ID
hspdf.na.gfdl3 = SpatialPolygonsDataFrame(hpg.atl[ID], gfdl.chn, match.ID = TRUE)
From this figure it is apparent that there are some discrepancies between the FSU-COAPS model and observations. The FSU-COAPS model appears to generate more storms in the southwestern portion of the North Atlantic. Figure Xb also suggests that most of these model-generated storms recurve to the north before impacting the United States. This is consistent with LaRow et al. (2008), who attribute this behavior to the model's large-scale steering flow during the first half of the hurricane season. The framework also allows us to directly compare the FSU-COAPS model with the HiRAM. In contrast to the FSU-COAPS model, which tends to overgenerate storms in the North Atlantic, the GFDL HiRAM does not generate as many storms as are observed. The spatial pattern of the HiRAM storms is more consistent with observations, although there are fewer modeled storms in the Gulf of Mexico and Caribbean than are observed. This latter point is also true of the FSU-COAPS model.
Obs = hspdf.atl
Mod = hspdf.fsu1
Mod2 = hspdf.fsu2
Mod3 = hspdf.fsu3
idmatch = row.names(Obs@data) %in% row.names(Mod@data)
idmatch2 = row.names(Obs@data) %in% row.names(Mod2@data)
idmatch3 = row.names(Obs@data) %in% row.names(Mod3@data)
O.df = Obs@data[idmatch, ]
O.df2 = Obs@data[idmatch2, ]
O.df3 = Obs@data[idmatch3, ]
idmatchM = row.names(Mod@data) %in% row.names(Obs@data)
idmatchM2 = row.names(Mod2@data) %in% row.names(Obs@data)
idmatchM3 = row.names(Mod3@data) %in% row.names(Obs@data)
M.df = Mod@data[idmatchM, ]
M.df2 = Mod2@data[idmatchM2, ]
M.df3 = Mod3@data[idmatchM3, ]
IDs = row.names(Obs@data)[idmatch]
Merge = Obs[IDs, ]
Merge$Mnhours = M.df$nhours
Merge$Mncyclones = M.df$ncyclones
cor(Merge$ncyclones, Merge$Mncyclones)
## [1] 0.4713
TObs = sum(Merge$ncyclones)
TMod = sum(Merge$Mncyclones)
RateObs = Merge$ncyclones/TObs
RateMod = Merge$Mncyclones/TMod
If we consider only those hexagons that contain storm data for both the FSU-COAPS model and observations, we can compute some useful descriptive statistics. For example, we compute the correlation between the modeled and observed numbers of cyclones on the hexagonal grid. For the North Atlantic Basin, the correlation is about 0.47.
Using this common grid, we can also examine the factor by which the modeled storm frequency exceeds the observed frequency. These values are shown in Figure ?.
blue2white2red = colorRampPalette(c("blue", "white", "red"), space = "Lab")
Merge$fac = RateMod/RateObs
Merge$logFac = log2(Merge$fac)
spplot(Merge, "logFac", col = "white", col.regions = blue2white2red(8), sp.layout = list(l2.atl),
at = seq(-6, 6, 1.5), colorkey = list(space = "bottom", labels = paste(seq(-6,
6, 1.5))), sub = expression("Log of Relative Ratio (Mod/Obs)"), main = "Storm Frequency Exceedence Factor: FSU/Obs")
The color ramp diverges at a ratio of one (a log2 value of zero).