This study provides a fundamental review of the forecast errors of tropical storm tracks and suggests a Bayesian procedure for updating their uncertainty. The forecast track errors are assumed to follow an axisymmetric bivariate normal distribution on a two-dimensional surface. The parameters are the two means and the covariance matrix, which represent the accuracy and the (inverse) precision of the operational forecast. A Bayesian method is developed to update the posterior probability of the varying parameters of the bivariate normal distribution. The normal-inverse-Wishart distribution is employed to determine the posterior, i.e., the weights on the parameters. Based on the posterior of the parameters, the predictive probability density of the track forecast errors is obtained as the marginal distribution. Here, a 'storm attack' is defined as the inclusion of a specific location within the radius of a tropical storm. Consequently, the storm attack probability for a location is derived by integrating the marginal distribution over the area within the storm radius. The storm attack probability is considered a realistic and effective representation of a storm warning for local residents, since a location-specific interpretation is available at the same level as the official forecast.
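For orientation, a minimal textbook sketch of the conjugate updating used below (the standard normal-inverse-Wishart results; \(\boldsymbol{\mu}_0, \lambda_0, \mathbf{S}_0, \nu_0\) denote the prior hyperparameters and \(\bar{\mathbf{x}}\) the sample mean of the \(n\) error vectors):
\[
\boldsymbol{\mu}_n=\frac{\lambda_0\boldsymbol{\mu}_0+n\bar{\mathbf{x}}}{\lambda_0+n},\qquad
\lambda_n=\lambda_0+n,\qquad
\nu_n=\nu_0+n,\qquad
\mathbf{S}_n=\mathbf{S}_0+\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})^{\mathsf{T}}+\frac{\lambda_0 n}{\lambda_0+n}(\bar{\mathbf{x}}-\boldsymbol{\mu}_0)(\bar{\mathbf{x}}-\boldsymbol{\mu}_0)^{\mathsf{T}}.
\]
In the noninformative limit \(\lambda_0,\nu_0,\mathbf{S}_0\rightarrow 0\), these reduce to \(\boldsymbol{\mu}_n=\bar{\mathbf{x}}\), \(\lambda_n=\nu_n=n\), and \(\mathbf{S}_n=(n-1)\widehat{\boldsymbol{\Sigma}}\), which correspond to mu0_post, lambda_post, nu_post, and S_post assigned in the code below.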
rm(list=ls())
#Fig1a. Histogram
#------------------------------------------------
data=read.csv('Data/data.csv',header=TRUE)
par(mfrow=c(1,1),mar=c(4,5.5,2,2))
D=data[(data$t==72 & data$fYear>=2016 & data$fYear<=2018),] ; D=D[!is.na(D$bearing2),]
H=hist(D$dist,prob=T
,breaks=seq(0,800,50) #binwidth 50km
,xlim=c(0,800),ylim=c(0,0.004),xlab='',ylab='',main='',axes=FALSE)
d=density(D$dist,from=0,to=800,width=100) #bandwidth 100 km
lines(d,col='dark grey',lwd=2)
q0.7=quantile(D$dist,0.7) ; q0.7 #255km (70 % prob circle radius)
## 70%
## 254.903
abline(v=q0.7,lwd=2,lty=1,col=4)
axis(1,at=seq(0,800,200), labels=seq(0,800,200),cex.axis=1)
axis(2,at=seq(0,0.004,0.001), labels=seq(0,0.004,0.001),cex.axis=1,las=1)
mtext('Error distance (km)',side=1,line=2.7,cex=1.3)
mtext('Density',side=2,line=4,cex=1.3)
mtext('(a)',side=3,line=0.8,cex=1.5,adj=-0.1)
#(density from histogram)
#x=H$mids ;c=H$counts; d=H$density
#points(x,d,cex=1.5,pch=16,col=1); lines(x,d,lwd=2,col=1)
#(density / 2*pi*r)
#d2=d/(2*pi*x) ; d2=max(d)*d2/max(d2) #scaling to counts
#points(x,d2,cex=1.5,pch=16,col=2); lines(x,d2,lwd=2,col=2)
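#(Added note: under axisymmetry, the radial density p(e) of the error distance and the
# two-dimensional density f relate as p(e)=2*pi*e*f(e), which is why the commented lines
# above divide the histogram density by 2*pi*r to recover the 2-d density shape)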
#Fig1b. Diagram
#------------------------------------------------
require(png)
## Loading required package: png
require(grid)
## Loading required package: grid
img <- readPNG('Fig1b.png')
grid.raster(img)
Fig. 1. Distribution of track forecast errors. (a) Histogram of error distances over the 3-yr period (2016–2018), and (b) schematic diagram of the error distribution around the forecast track position. In (a), a Gaussian kernel smoother with a 100~km bandwidth delineates the density distribution (dark grey line), and the 0.7 quantile (255~km) is indicated by a vertical blue line. (b) depicts the relationship among the forecast track position (black dot), the mean error (red dot), and a specific location (blue). The inner black dashed circle represents the 70~% probability circle, whose radius corresponds to the vertical blue line in (a). The area shaded in red shows the two-dimensional error distribution, while the area shaded in blue indicates the location-specific storm attack area within the forecast storm radius (blue circle).
rm(list=ls())
library(geosphere) #for ellipsoidal distances, bearings, etc.
data=read.csv('Data/data.csv',header=TRUE)
par(mfrow=c(1,2),mar=c(4,5,2,3))
#Fig2a. Meridional track axis
#------------------------------------------------
D=data[(data$t==72 & data$fYear>=2016 & data$fYear<=2018),] ; D=D[!is.na(D$bearing2),]
x=D$distLon ; y=D$distLat ; e=D$dist
plot(x,y,type='p',pch=21,cex=0.8,xlim=c(-800,800),ylim=c(-800,800),xlab='',ylab='',axes=FALSE)
abline(h=0,v=0,lty=2,col='dark grey')
points(mean(x),mean(y),pch=16,cex=1.2,col='#df2020')
qntl=quantile(e,prob=0.7)
library(plotrix) ; draw.circle(mean(x),mean(y),qntl,nv=1000,border=4,col=NA,lty=1,lwd=2)
cor.test(x,y)
##
## Pearson's product-moment correlation
##
## data: x and y
## t = 7.7167, df = 456, p-value = 7.596e-14
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2562104 0.4184548
## sample estimates:
## cor
## 0.3398588
axis(1,at=seq(-900,900,300), labels=seq(-900,900,300),cex.axis=1)
axis(2,at=seq(-900,900,300), labels=seq(-900,900,300),cex.axis=1,las=1)
axis(3,at=seq(-900,900,300), labels=rep('',7))
axis(4,at=seq(-900,900,300), labels=rep('',7))
mtext('Right angle error (km)',side=1,line=2.7,cex=1.3)
mtext('Meridional error (km)',side=2,line=3,cex=1.3)
mtext('(a)',side=3,line=0.8,cex=1.5,adj=-0.1)
box()
#number of samples
length(x) #==length(y) #458 error samples
## [1] 458
#Lon-Lat lines in dist coord
#grid=seq(-20,20,5)
#for (i in grid) #lat lines
#{p1=cbind(rep(0,length(grid)),rep(0,length(grid))) ; p2=cbind(grid,rep(i,length(grid)))
# d=distVincentyEllipsoid(p1,p2, a=6378137, b=6356752.3142, f=1/298.257223563) #ellipsoid applied #meter
# d=d/1000 #meter to kilometer
# b=bearing(p1,p2, a=6378137, f=1/298.257223563) #bearing
# lines(cbind(d*sin(b*pi/180),d*cos(b*pi/180)),col=2)
# lines(cbind(d*cos(b*pi/180),d*sin(b*pi/180)),col=2)
# }
#Fig2b. Along track axis
#------------------------------------------------
D=data[(data$t==72 & data$fYear>=2016 & data$fYear<=2018),] ; D=D[!is.na(D$bearing2),]
x=D$distCross ; y=D$distAlong ; e=D$dist
plot(x,y,type='p',pch=21,cex=0.8,xlim=c(-800,800),ylim=c(-800,800),xlab='',ylab='',axes=FALSE)
abline(h=0,v=0,lty=2,col='dark grey')
points(mean(x),mean(y),pch=16,cex=1.2,col='#df2020')
qntl=quantile(e,prob=0.7)
draw.circle(mean(x),mean(y),qntl,nv=1000,border=4,col=NA,lty=1,lwd=2)
cor.test(x,y)
##
## Pearson's product-moment correlation
##
## data: x and y
## t = 0.67164, df = 456, p-value = 0.5022
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.06036385 0.12271019
## sample estimates:
## cor
## 0.03143684
axis(1,at=seq(-900,900,300), labels=seq(-900,900,300),cex.axis=1)
axis(2,at=seq(-900,900,300), labels=seq(-900,900,300),cex.axis=1,las=1)
axis(3,at=seq(-900,900,300), labels=rep('',7))
axis(4,at=seq(-900,900,300), labels=rep('',7))
mtext('Cross track error (km)',side=1,line=2.7,cex=1.3)
mtext('Along track error (km)',side=2,line=3,cex=1.3)
mtext('(b)',side=3,line=0.8,cex=1.5,adj=-0.1)
box()
#Lon-Lat lines in dist coord
#grid=seq(-20,20,5)
#for (i in grid) #lat lines
#{p1=cbind(rep(0,length(grid)),rep(0,length(grid))) ; p2=cbind(grid,rep(i,length(grid)))
# d=distVincentyEllipsoid(p1,p2, a=6378137, b=6356752.3142, f=1/298.257223563) #ellipsoid applied #meter
# d=d/1000 #meter to kilometer
# b=bearing(p1,p2, a=6378137, f=1/298.257223563) #bearing
# lines(cbind(d*sin(b*pi/180),d*cos(b*pi/180)),col=2)
# lines(cbind(d*cos(b*pi/180),d*sin(b*pi/180)),col=2)
# }
Fig. 2. Two-dimensional distribution of track forecast errors over the 3-yr period (2016–2018). Errors are displayed in (a) right angle vs. meridional coordinates, and (b) cross-track vs. along-track coordinates. Mean error positions and 70~% probability circles are indicated by red dots and blue circles, respectively. The negligible correlation between cross-track and along-track errors (r = 0.03, p = 0.50), in contrast to the significant correlation in (a) (r = 0.34), supports the axisymmetric treatment in track-relative coordinates.
rm(list=ls())
data=read.csv('Data/data.csv',header=TRUE)
D=data[(data$t==72 & data$fYear>=2016 & data$fYear<=2018),] ; D=D[!is.na(D$bearing2),]
xo1=D$distCross ; xo2=D$distAlong
#Fig3a. Density distribution of parameters
#------------------------------------------------
#(Bayesian updating of the params using the normal-inverse-Wishart distribution)
ref=xo1 #For example, xo1 is used as the scaling reference #The scaling reference can come from any source
x1=(xo1-mean(ref))/sd(ref) ; x2=(xo2-mean(ref))/sd(ref); K=2 ; n=length(x1) #scaled x1, x2
library(LaplacesDemon) ; library(mvtnorm)
##
## Attaching package: 'mvtnorm'
## The following objects are masked from 'package:LaplacesDemon':
##
## dmvt, rmvt
#Set the posterior probabilities of the parameters, f(mu, Sigma), using the dnorminvwishart function
#Update params #The least informative Wishart prior is obtained by setting nu = K (the number of dimensions) #set prior scale matrix S=diag(K)
lambda_post=n ; nu_post=n #Posterior parameters are updated via conjugacy
mu0_post = c(mean(x1),mean(x2)) ; S_post=(n-1)*cov(cbind(x1,x2))
#Grid configuration for the parameters #Assumption: geometric axisymmetry, [var(x1)=var(x2)] & [cov(x1,x2)=0]
m1Bin= seq(mean(x1)-0.15,mean(x1)+0.15,0.02); m2Bin=seq(mean(x2)-0.15,mean(x2)+0.15,0.02); CovBin= seq(0.5,1.5,0.04)
paramsBin=NULL; for(i in m1Bin){for(j in m2Bin){for(k in CovBin) paramsBin=rbind(paramsBin,c(i,j,k))}}
#Get joint pdf of (mu,Omega) using [dnormwishart] function
post=NULL ; for(i in 1: dim(paramsBin)[1])
{mu=c(paramsBin[i,1],paramsBin[i,2]) ; Sigma=matrix(c(paramsBin[i,3],0,0,paramsBin[i,3]),nrow=2,ncol=2)
post=c(post,dnorminvwishart(mu, mu0_post, lambda_post, Sigma, S_post, nu_post, log=FALSE))} ; post=post/sum(post)
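#(Added check, in the style of the commented checks below; 'post' and 'paramsBin' are
# taken from the chunk above)
#sum(post) #should equal 1 after normalization
#paramsBin[which.max(post),] #grid point carrying the maximum posterior weight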
par(mfrow=c(1,2))
# plot(paramsBin[,1],paramsBin[,2],cex=1.5*post/max(post),main='mu')
# plot(paramsBin[,3], post,main='Var') #check
par(mfrow=c(1,2),mar=c(4,5,2,3))
plot(paramsBin[,1],paramsBin[,3],cex=3.4*post/max(post),xlim=c(-0.15,0.16),ylim=c(0.85,1.15),xlab='',ylab='',axes=FALSE)
legend('topleft',c(expression(paste('Maximum (0.87 x 10'^-3,')')), 'Maximum/2'),cex=1,pch=21,pt.cex=c(3.4, 1.7),bg='white'
,x.intersp=1, y.intersp=1.5)
xlab=seq(-40,40,10) ;xlab2=(xlab-mean(ref))/sd(ref)
vlab=seq(15000,35000,2000) ; vlab2=((sqrt(vlab)-mean(ref))/sd(ref))^2
axis(1, at = xlab2, labels=xlab)
axis(2, at = vlab2, labels=vlab/1000, las=1)
axis(3,at=xlab2, labels=rep('',length(xlab2)))
axis(4,at=vlab2, labels=rep('',length(vlab2)))
mtext('Mean (km)',side=1,line=2.7,cex=1.3)
mtext(expression(paste('Variance (10'^3,'km'^2,')')),side=2,line=2.8,cex=1.3)
mtext('(a)',side=3,line=0.8,cex=1.5,adj=-0.1)
box()
#Fig3b. Density distribution of errors
#------------------------------------------------
#(Predictive density from the marginal prob distribution)
#Get the conditional densities of 2-d error distances, f(x1,x2|mu,Sigma) using dmvnorm func
# & compute marginal densities, f(x1,x2) through joint pdf, f(mu,Sigma|mu0, lambda, S, nu)*f(x1,x2|mu,Sigma)
#Grid configuration for one direction (values from the storm center) #A resolution is set for calculation efficiency
extentLimit=2000 ; resolution=20 ; grid=seq(0,extentLimit,resolution); r=grid/sd(ref) #extentLimit (km): No calculation over the limit
center=mu0_post ; x1Bin= center[1]+r ; x2Bin=0 #x1 from the error center (=mu0_post) to the domain boundary (= extentLimit)
margProfile=NULL ; for ( i in 1:length(x1Bin)) {x=c(x1Bin[i],x2Bin) ; lik=NULL ; for(j in 1:dim(paramsBin)[1])
{mean=c(paramsBin[j,1],paramsBin[j,2]); sigma=matrix(c(paramsBin[j,3],0,0,paramsBin[j,3]),nrow=2,ncol=2)
lik=c(lik,dmvnorm(x,mean,sigma,log=FALSE))} ; margProfile=c(margProfile,sum(post*lik))}
margSum=sum(margProfile*(2*pi*grid)) ; margProfile=margProfile/margSum #renormalize so that the ring-weighted sum of densities equals 1.0
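#(Added note: the loop above evaluates the predictive density as the posterior-weighted
# mixture f(x|data)=sum_i post_i*N(x;mu_i,Sigma_i), i.e., a grid approximation to the
# marginal integral of f(x|mu,Sigma) over the normal-inverse-Wishart posterior)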
# write.table(margProfile,'margProfile.txt')
plot(grid,margProfile,type='l', xlim=c(0,800) ,axes=F, xlab='', ylab='')
points(grid,margProfile,pch=16,cex=1.2,col=1) #NOW WE HAVE THE DENSITY PROFILE OF ERROR POSITION!
axis(1,at=seq(0,800,200), labels=seq(0,800,200),cex.axis=1)
axis(2,at=seq(0,0.00015,0.00003), labels=seq(0,1.5,0.3),cex.axis=1,las=1)
mtext('Distance from the mean error (km)',side=1,line=2.7,cex=1.3)
mtext(expression(paste('Probability (10'^-4,')')),side=2,line=3,cex=1.3)
mtext('(b)',side=3,line=0.8,cex=1.5,adj=-0.1)
box()
Fig. 3. Results of the Bayesian updating with the error samples. (a) Posterior of the parameters, i.e., the mean (along track) and the variance, and (b) predictive density distribution of the forecast track errors. In (a), only the along-track means are shown on the abscissa; the multiple overlapping circles at each along-track mean show the probabilities at the various cross-track means. The variance is the same in both directions owing to the assumption of an axisymmetric error distribution. (b) shows a directional profile of the densities at 20~km intervals (dots), i.e., the half cross-section of the density distribution starting from the mean error position.
#rm(list=ls())
data=read.csv('Data/data.csv',header=TRUE)
D=data[(data$t==72 & data$fYear>=2016 & data$fYear<=2018),] ; D=D[!is.na(D$bearing2),]
xo1=D$distCross ; xo2=D$distAlong
#Fig4a. D-R table
#------------------------------------------------
#Take the density profile result from the preceding section
#Set up a distance domain on a fine grid (Remember: distances here are measured from the mean error position)
#margProfile=read.table('margProfile.txt',header=TRUE)[,1] #center=mu0_post
#Build the fine-grid density profile by spline interpolation
grid=seq(0,2000,20) # for margProfile
gridFine=seq(0,2000,10) #fine grid at 10 km intervals #Remember: it starts from the mean error position
densFine=spline(grid,margProfile,length(gridFine),method='fmm')$y #refer to R doc for 'fmm' method
#Prepare a set of attack probabilities per storm radius and per distance (from the mean error) #The calculation takes several minutes
#Prepare for an extreme case: a location 1000 km away with a 1000 km storm radius
resol=10
grid=seq(0,1000,resol) #extentLimit/2=1000 km
dens=densFine/sum(densFine*2*pi*gridFine) #set for (D+wr) not to exceed extentLimit (2000 km)
table=NULL #Radius & Distance
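#(Added geometry note: for a ring of radius dr centred on the mean error position, the arc
# lying inside a storm circle of radius r whose centre sits at distance D subtends
# theta=2*acos((D^2+dr^2-r^2)/(2*D*dr)) by the law of cosines; rings entirely inside the
# circle, dr<=r-D, contribute their full circumference 2*pi*dr)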
for(D in grid) {for(r in grid[-1]){
if(r>=D){densSum=NULL; for(dr in seq(resol,D+r,resol)){
if (dr<=(r-D)) densSum=sum(densSum,2*pi*dr*dens[1+dr/resol])
else {A=(D^2+dr^2-r^2)/(2*D*dr); theta=2*acos(A); densSum=sum(densSum,theta*dr*dens[1+dr/resol])}}}
if(D>r){densSum=NULL; for(dr in seq(D-r,D+r,resol)){
A=(D^2+dr^2-r^2)/(2*D*dr); theta=2*acos(A); densSum=sum(densSum,theta*dr*dens[1+dr/resol])}}
table=rbind(table,c(D,r,densSum))}}
distance=table[,1] ; radius=table[,2] ; density=table[,3]
DF=cbind(distance,radius,density)
#write.table(DF,'RDtable.txt')
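#(Added usage sketch: read one attack probability off the D-R table, e.g., for a location
# 200 km from the mean error position and a 300 km storm radius; both values are assumed
# to fall on the 10 km grid above)
#DF[DF[,'distance']==200 & DF[,'radius']==300,'density']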
#plot matrix
par(mfrow=c(1,2),mar=c(4,5,2,3))
library(RColorBrewer) ; colist=colorRampPalette(brewer.pal(9,'YlOrRd'))(30) # extract 30 colors from Yellow-Orange-Red
library(akima) #akima for interpolation
A1=distance ; A2=radius ; A3=density
# image(interp(A1,A2,A3,xo=seq(min(A1),max(A1),by=10),yo=seq(min(A2),max(A2),by=10))
# ,axes=F, xlab='', ylab='', breaks=seq(0,max(A3),length=31)
# ,col=colist, xlim=c(0,800),ylim=c(0,800))
plot(-999,-999, axes=F, xlab='', ylab='', xlim=c(0,800),ylim=c(0,800))
contour(interp(A1,A2,A3,xo=seq(min(A1),max(A1),by=40),yo=seq(min(A2),max(A2),by=40)),levels=seq(0.1,0.9,0.2)
,lwd=1, add = TRUE, col=1,labcex=0.8)
axis(1,at=seq(0,1000,200),labels=seq(0,1000,200),cex.axis=1)
axis(2,at=seq(0,1000,200),labels=seq(0,1000,200),cex.axis=1,las=1)
axis(3,at=seq(0,1000,200), labels=rep('',6))
axis(4,at=seq(0,1000,200), labels=rep('',6))
mtext('Distance from the mean (km)',side=1,line=2.7,cex=1.3)
mtext('Storm radius (km)',side=2,line=3,cex=1.3)
mtext('(a)',side=3,line=0.8,cex=1.5,adj=-0.1)
box()
#Fig4b. storm attack probability (case with storm wind radius = 300 km)
#------------------------------------------------
#(Example map of pointwise storm attack probability)
wr=300 # Example storm radius (km)
fcst_position=c(126,30) #Example forecast position (longitude, latitude)
#Assign densities to lon & lat
#--------------------------------------
library(geosphere)
range=seq(-1000,1000,10)
#table=read.table('RDtable.txt',T)
distance=table[,1] ; radius=table[,2] ; density=table[,3]
xd1=rep(range,each=length(range)) ; xd2=rep(range,times=length(range))
A3=NULL; for(i in 1:length(xd1))
{d=sqrt(xd1[i]^2+xd2[i]^2); k=which(radius==wr) ; k2=which.min(abs( distance[k]-d)) ; A3=c(A3,density[k[k2]])}
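#(Added note: each (x,y) offset from the mean error position is assigned the tabulated
# probability for storm radius wr at the nearest gridded distance, i.e., a
# nearest-neighbour lookup in the D-R table)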
#Locate Lon & Lat of the mean error position
distance=sqrt(mean(xo1)^2+mean(xo2)^2) #distance from the fcst position
bearing=(180/pi)*atan2(mean(xo1),mean(xo2)) #theta=90-atan2(y,x)
error_position=destPoint(fcst_position,bearing,1000*distance, a=6378137, f=1/298.257223563)
#Assign densities to each Lon & Lat
distance=sqrt(xd1^2+xd2^2) #distance from the mean
bearing=(180/pi)*atan2(xd1,xd2) #theta=90-atan2(y,x)
DP=destPoint(error_position, bearing, 1000*distance, a=6378137, f=1/298.257223563)
A1=as.data.frame(DP)$lon ; A2=as.data.frame(DP)$lat
#plot storm attack prob
#--------------------------------------
#plot map
library(akima) ; library(RColorBrewer) # akima for interpolation #RColorBrewer for color legend
colist=c('white',colorRampPalette(brewer.pal(9,'YlOrRd'))(29)) # white plus 29 colors from Yellow-Orange-Red
image(interp(A1,A2,A3,xo=seq(min(A1),max(A1),by=0.2),yo=seq(min(A2),max(A2),by=0.2))
,axes=F, xlab='', ylab=''
,breaks=seq(0,max(A3),length=31) # break# should be color# + 1
,col=colist, xlim=c(115,135),ylim=c(22,38))
contour(interp(A1,A2,A3,xo=seq(min(A1),max(A1),by=0.2),yo=seq(min(A2),max(A2),by=0.2)),levels=seq(0.1,0.9,0.2)
,lwd=1, add = TRUE, col=1,labcex=0.8)
library(maps) # for map
m=map('world',interior=F,lwd=1, col='dark grey',add=T)
# m$x=m$x+360 ; map(m,interior=F, lwd=1, col='dark grey',add=T) # plot America on right side of the map.
abline(v=fcst_position[1], h=fcst_position[2],lwd=1,col=1,lty=2)
points(error_position,pch=16,col='blue',cex=2)
axis(1,at=seq(115,140,5), labels=expression (115~degree~E,120~degree~E,125~degree~E,130~degree~E,135~degree~E,140~degree~E),cex.axis=1)
axis(2,at=seq(20,40,5), labels=expression(20~degree~N,25~degree~N,30~degree~N,35~degree~N,40~degree~N),cex.axis=1,las=1)
axis(3,at=seq(115,140,5), labels=rep('',6))
axis(4,at=seq(20,40,5), labels=rep('',5))
mtext('Longitude',side=1,line=2.7,cex=1.3)
mtext('Latitude',side=2,line=4,cex=1.3)
mtext('(b)',side=3,line=0.8,cex=1.5,adj=-0.1)
box()
Fig. 4. Location-specific storm attack probability. (a) Probability matrix over location distance and storm radius, and (b) an example map of storm attack probability for a 300 km forecast storm radius. Contour labels represent the probabilities. In (b), a grid point (126\(^{\circ}\)E, 30\(^{\circ}\)N) is set as the forecast position of the storm, shown as the intersection of the two dashed lines. The blue dot indicates the mean error position.
require(png)
require(grid)
img <- readPNG('Fig5.png')
grid.raster(img)
Fig. 5. Schematic diagram of the Bayesian updating and the storm attack probability. The procedure takes two input sources: newly observed errors and the forecast storm radius. Stepwise interpretations are noted in blue.