Resources from last discussion

Objectives

For class today

Each group member will prepare a brief summary of their two science papers that states at least the following, the question being addressed, the main result of the study, and your critique. You will have about 2 paragraphs per paper.

Place an outline of the additional info that you will cover in the brief presentataion on dropbox. This will consist of several bullets with links to references.

The discussion that follows will touch on at least the following questions:

  1. Describe where are the threats to globally fisheries and why they are occurring

  2. How good are the data? Are there big holes in our understanding?

  3. How might the interaction between fishing and other variables, such as temperature change, affect the results?

  4. What are some of the ways to mitigate the problem of overfishing?

For next time

We will look for interactions in the data involving season using the code below. So you don’t need to jump between files, the first part of this shows the code from last time. Following that is new code for interactions.

The longline data

We have available to us the long-line data used in the Tuna and swordfish catch paper that was included in the discussion. Here is software we will need:

library(corrplot)
library(dendextend)
library(gjam)
library(colorspace)
library('RColorBrewer')
source('clarkFunctions2021.r')

Read in the file here:

longline <- read.csv("LOGBOOK_SWO_ENV_4.csv")
dim(longline)
## [1] 259350    152

This is a large data set and may take a minute to ingest, with 259350 rows. I begin by partitioning this data.frame into two parts, one for predictors and another for responses:

xdata <- longline[,c(1:13,134:140)]     # predictors
ydata <- longline[,104:133]             # responses

The data.frame xdata includes variables related to fishing effort (`HOOKS_SET,

The responses in data.frame ydata include the catch for species.

The spatial extent of the data is shown here.

lonLat    <- xdata[,c('Longitude','Latitude')]
maps::map(xlim=c(-100,-10), ylim=c(0,55), fill=T, col='wheat')
points(lonLat[,1],lonLat[,2], col = .getColor('brown',.03), cex=.2)

Fishing effort for longline data.

You may notice that I used a semi-transparent symbol color so that areas with intense activity show up as darker on the map, rather than simply masking one another.

The data uses the number -9999 to indicate no data. I replace these entries with NA. I then construct a correlation, which helps me to see the relationships between variables:

xdata[xdata == -9999] <- NA
round( cor(xdata[,c(7:20)], use='pairwise.complete.obs'), 2)
##                HOOKS_SET HOOKS_BETW LIGHT_STIC MAINLINE_S GANGION_LE FLOATLINE_
## HOOKS_SET           1.00       0.07      -0.24       0.28      -0.18      -0.15
## HOOKS_BETW          0.07       1.00      -0.08       0.00       0.04       0.06
## LIGHT_STIC         -0.24      -0.08       1.00       0.01       0.03      -0.03
## MAINLINE_S          0.28       0.00       0.01       1.00       0.06       0.00
## GANGION_LE         -0.18       0.04       0.03       0.06       1.00       0.45
## FLOATLINE_         -0.15       0.06      -0.03       0.00       0.45       1.00
## BAIT               -0.11      -0.04       0.27      -0.07      -0.14      -0.17
## SST                -0.18       0.00      -0.09      -0.01       0.11       0.08
## TKE                -0.18       0.00      -0.02      -0.15      -0.11      -0.07
## Bathy              -0.22       0.04       0.02      -0.21      -0.04      -0.03
## AMO                 0.27       0.00      -0.02       0.03      -0.25      -0.32
## NAO                -0.19      -0.02       0.04      -0.01       0.16       0.18
## WNAO               -0.16      -0.01       0.05      -0.01       0.10       0.11
## CHLA_AQUA_8DAY      0.03       0.19      -0.06      -0.15      -0.13      -0.05
##                 BAIT   SST   TKE Bathy   AMO   NAO  WNAO CHLA_AQUA_8DAY
## HOOKS_SET      -0.11 -0.18 -0.18 -0.22  0.27 -0.19 -0.16           0.03
## HOOKS_BETW     -0.04  0.00  0.00  0.04  0.00 -0.02 -0.01           0.19
## LIGHT_STIC      0.27 -0.09 -0.02  0.02 -0.02  0.04  0.05          -0.06
## MAINLINE_S     -0.07 -0.01 -0.15 -0.21  0.03 -0.01 -0.01          -0.15
## GANGION_LE     -0.14  0.11 -0.11 -0.04 -0.25  0.16  0.10          -0.13
## FLOATLINE_     -0.17  0.08 -0.07 -0.03 -0.32  0.18  0.11          -0.05
## BAIT            1.00 -0.10  0.05  0.04  0.07 -0.08 -0.04           0.00
## SST            -0.10  1.00  0.13  0.03  0.02 -0.03 -0.03          -0.26
## TKE             0.05  0.13  1.00  0.07 -0.02 -0.02 -0.01          -0.01
## Bathy           0.04  0.03  0.07  1.00  0.03  0.00  0.00           0.19
## AMO             0.07  0.02 -0.02  0.03  1.00 -0.42 -0.31           0.03
## NAO            -0.08 -0.03 -0.02  0.00 -0.42  1.00  0.48          -0.01
## WNAO           -0.04 -0.03 -0.01  0.00 -0.31  0.48  1.00           0.03
## CHLA_AQUA_8DAY  0.00 -0.26 -0.01  0.19  0.03 -0.01  0.03           1.00

Here is correlation structure in the responses:

cc <- cor(ydata)
diag(cc) <- 0
corrplot(cc, type='lower', diag=F, cl.lim = c(-.2, .75), is.corr = F)

Correlations between fish are mostly low.

Clearly, there are not large correlations between species. Here is a cluster analysis using the hierarchical clustering function hclust and related functions in package dendextend.

cory   <- cor(ydata)
dsigma <- as.dist( cov2Dist(cory) )
yclust <- hclust(dsigma, method='complete')

#save clusters
ncluster <- 6
clusters <- cutree(yclust,k=ncluster) #membership by no. of clusters

dend <- as.dendrogram(yclust)
dend <- color_branches(dend, k=3)
dend <- hang.dendrogram(dend,hang_height=0.1)
dend <- set(dend, "labels_cex", 0.5)
plot(dend, horiz =  TRUE,  nodePar = list(cex = .007))

Dendrogram shows similarity between species.

Here are trends through time for swordfish:

y <- sqrt( ydata[,'Swordfish'] )
plot(jitter( xdata$Year ), y, cex=.2, xlab='Year',ylab='Catch')
spl <- spline(y ~ xdata$Year)
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values
lines(spl,lwd=5,col='white')
lines(spl)

Lack of trend in catch data.

Here is catch effort, gauged as number of hooks set:

plot(jitter( xdata$Year ), xdata$HOOKS_SET, cex=.2, 
     xlab='Year',ylab='Hooks set', log='y')
spl <- spline(xdata$HOOKS_SET ~ xdata$Year)
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values
lines(spl,lwd=5,col='white')
lines(spl)

Catch effort shows an increase over time.

Here is catch per effort (hooks):

y <- ydata[,'Swordfish']/xdata$HOOKS_SET 
plot(jitter( xdata$Year ), y, cex=.2, log='y',
     xlab='Year',ylab='Catch/Hook')
## Warning in xy.coords(x, y, xlabel, ylabel, log): 61750 y values <= 0 omitted
## from logarithmic plot
spl <- spline(y ~ xdata$Year)
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values
lines(spl,lwd=5,col='white')
lines(spl)

Trend in catch per hook.

Here is a map of changing catch location

colors <- c('blue','green','orange','brown')   # color gradient

years <- sort(unique(xdata$Year))
nyr   <- length(years)

ycol <- round(quantile(years, c(0, .33, .67, 1)))

cf   <- colorRampPalette(colors)(nyr)
size <- y/max(y)
colr <- cf[match(xdata$Year, years)]

maps::map(xlim=c(-110,-30), ylim=c(7,55), fill=T, col='wheat')
points(lonLat[,1],lonLat[,2], cex = size^.5, 
       col = colr)
legend('topleft',legend=c(ycol), text.col=colors)

Change in distribution of catch over time.

Season effects

I am concerned about the foregoing analysis because I know that temperature is increasing over time, just as is fishing effort. However, I also know that temperature is changing throughout the year. Perhaps fish are being affected by simply shifting where they live at different times of year, seeking and finding a preferred temperature year round. I think I might identify such an effect by including an interaction between SST and season.

Here are a few additional functions I will use to create a data set that is aggregated by location and season:

lonLatSeasonGrid <- function(variable, xdata, FUN=mean){
  
  # variable - a continuous variable to be averaged in xdata
  # xdata    - must include columns for variable, lon, lat, season
  
  tmp <- suppressWarnings( by(variable, 
                              list(lon=xdata$lon, lat=xdata$lat,
                                   season=xdata$season),
                              FUN, na.rm=T) )
  gridframe <- numeric(0)
  dm <- dim(tmp)
  longrid <- as.numeric(dimnames(tmp)[[1]])
  latgrid <- as.numeric(dimnames(tmp)[[2]])
  ssngrid <- dimnames(tmp)[[3]]
  longrid <- matrix(longrid,dm[1],dm[2])
  latgrid <- matrix(latgrid,dm[1],dm[2], byrow=T)
  
  for(j in 1:dim(tmp)[3]){
    lonj <- as.vector(longrid)
    latj <- as.vector(latgrid)
    ssnj <- rep(ssngrid[j], length(lonj))
    jdat <- data.frame(lon = lonj, lat = latj, season = ssnj, 
                       x = as.vector(tmp[,,j]))
    gridframe <- rbind( gridframe, jdat )
  }
  gridframe
}

.getColor <- function(col,trans){
  
  # trans - transparency fraction [0, 1]
  
  tmp <- col2rgb(col)
  rgb(tmp[1,], tmp[2,], tmp[3,], maxColorValue = 255, 
      alpha = 255*trans, names = paste(col,trans,sep='_'))
}
  
.aggData <- function(cnames, data, gather, FUN){
  
  #cnames - column names in data to operate on
  #gather   - list of indices, all of same length
  
  cnames <- cnames[cnames %in% colnames(data)]
  if(length(cnames) == 0)return( list(data = numeric(0)) )
  
  FAC <- F
  df <- data[,cnames]
  
  if(FUN == 'factor'){
    FAC <- T
    tf <- fnames <- character(0)
    df <- numeric(0)
    
    for(j in 1:length(cnames)){
      kf <- as.character(data[,cnames[j]])
      tf <- cbind(tf, kf)
      fnames <- append(fnames,list(sort(unique(kf))))
      df <- cbind(df, match(kf,fnames[[j]]) )
    }
    colnames(df) <- cnames
    FUN <- 'max'
  }
  
  tmp <- aggregate(df, by=gather, FUN=FUN, na.rm=T)
  
  if(FAC){
    for(j in 1:length(cnames)){
      kf <- fnames[[j]][tmp[,cnames[j]]]
      tmp[,cnames[j]] <- as.factor(kf)
    }
  }
  tmp
}

Read the data again. Here are the columns I want to save:

specNames <- c("Swordfish","Bigeye","Bluefin","Yellowfin","Albacore",
               "Blackfin","Dolphin","Wahoo",  "King","Escolar","Skipjack",
               "Blue","Hammerhe_9","Thresher","Blacktip","Dusky",
               "Scalloped","Smooth","Night","Whitetip","Silky",
               "BEThresher","White","Tiger","Sandbar",
               "MakoLong","MakoShort","WhiteMarli","BlueMarlin","Sailfish")
xnames <- c("FID","Year","Month","Day","Latitude","Longitude",
            "HOOKS_SET","HOOKS_BETW","LIGHT_STIC","MAINLINE_S","GANGION_LE",
            "FLOATLINE_","BAIT",
            "SST","TKE","Bathy","AMO","NAO","WNAO","CHLA_AQUA_8DAY",
            "CHLA_SEAWIFS_8DAY","CHLA_TERRA_8DAY","CS","SM")

xdata <- longline[,xnames]
ydata <- longline[,specNames]

Here is some aggregation by season, longitude, and latitude:

# seasons
snames <- c('JFM','AMJ','JAS','OND')             
season <- findInterval(xdata$Month, seq(1, 12, by=3) )
xdata$season <- as.factor(snames[season])

# lon/lat
cnames <- c("SST","TKE","Bathy","Month")
lon <- round(xdata$Longitude)
lat <- round(xdata$Latitude)

gather <- with(xdata, list(lon = lon, lat = lat, yr = Year, season = season))
xtmp <- .aggData(cnames, xdata, gather, FUN='mean')

# sum effort
enames <- c('HOOKS', 'HOOKS_BETW', 'MAINLINE_S')
etmp <- .aggData(enames, xdata, gather, FUN='sum')

# sum catch
ytmp <- .aggData(colnames(ydata), ydata, gather, FUN='sum')
ytmp <- ytmp[,!colnames(ytmp) %in% names(gather)]

ytmp <- as.matrix(ytmp)
keep <- suppressWarnings( apply(ytmp, 1, max, na.rm=T) )
keep <- which(is.finite(keep))

ydata <- ytmp[keep,]
xdata <- xtmp[keep,]
edata <- etmp[keep,]
rm(longline)

Here’s a map of the aggregated lon/lat data:

years <- sort(unique(xdata$yr))
nyr   <- length(years)

lonLat    <- xdata[,c('lon','lat')]
mapLonLat <- apply(lonLat,2,range)
maps::map(xlim=mapLonLat[,1], ylim=mapLonLat[,2], fill=T, col='wheat')
points(lonLat[,1],lonLat[,2], col = .getColor("black",.1), pch=15, cex=.5)

Catch effort as shaded blocks.

Here is the cluster analysis again using the hierarchical clustering function hclust and related functions in package dendextend.

cory <- cor(ydata)
yclust <- hclust(dist(t(ydata)), method='complete')

dend <- as.dendrogram(yclust)
dend <- color_branches(dend, k=3)
dend <- hang.dendrogram(dend,hang_height=0.1)
dend <- set(dend, "labels_cex", 0.5)
par(mfrow=c(1,2),mar = c(3,2,2,3))
plot(dend, horiz =  TRUE,  nodePar = list(cex = .007))

A movie could go like this:

ncols <- 100

colorGrad <- colorRampPalette( RColorBrewer::brewer.pal(5, 'YlOrRd') )
colors    <- colorGrad(ncols)

cscale <- seq(0,1,length.out=ncols)

yrMon  <- cbind( as.character(xdata$yr), as.character(xdata$month) )
yrMon  <- apply( yrMon, 1, paste0, collapse='_')

times <- sort(unique(yrMon))

tmp <- matrix( as.numeric( unlist( strsplit( times, '_' ) ) ), ncol=2, byrow=T )

ord <- order(tmp[,1], tmp[ ,2])
times <- times[ord]

spec <- 'Swordfish'                  # I could select a different species
most <- max(ydata[,spec], na.rm=T)

par(bty='n', mar=c(.1, .1, 1, .1))

for(j in 1:length(times)){
  
  wj <- which( yrMon == times[j] & is.finite(ydata[,spec]) )
  yj <- (ydata[wj,spec]/most)^.2
  
  colj <- colors[ findInterval(yj,cscale) ]
  z   <- rep(1, length(wj))
  
  symbols(lonLat[wj,1],lonLat[wj,2], squares=z, inches=F, fg = colj, 
          bg = colj, add=F, xlim=mapLonLat[,1], ylim=mapLonLat[,2])
   maps::map(xlim=mapLonLat[,1], ylim=mapLonLat[,2], fill=T, col='wheat', add=T)
  title(times[j])
  readline(prompt="Press [enter] to continue")
}

Here is a model to estimate parameters

library(gjam)
sqrtEffort <- sqrt(edata$HOOKS_BETW)

ydata <- cbind(sqrtEffort,ydata)
S <- ncol(ydata)
typeNames <- rep('DA', S)
typeNames[1] <- 'CA'

effort <- list(columns = 2:S, values = sqrtEffort + 1)

#xdata$yr <- as.factor(xdata$yr)
ml  <- list(ng = 600, burnin = 200, typeNames = typeNames, effort = effort)
out <- gjam(~ SST*yr + season, xdata, ydata, modelList = ml)
           
plotPars <- list(GRIDPLOTS = T, SIGONLY=F)
gjamPlot(out, plotPars)

Here are the interactions:

load( 'gjamLongLine.rdata', verbose = T )
b <- summary(out)$Coefficients
B <- out$parameters$betaStandXTable
bmu <- B[,1]
bci <- B[,3:4]

snames <- matrix(unlist(strsplit(rownames(B),'_')),ncol=2, byrow=T)[,1]

# first time use this #####################
ss  <- c('_SST','_yr','_SST:yr')
wj  <- which(endsWith(rownames(B),'_SST'))

# second time use this ####################
ss  <- c('_yr','_seasonJAS','_seasonJFM','_seasonOND')
wj  <- which(endsWith(rownames(B),'_yr'))

# both times do this ####################
x   <- 1:length(wj)
ord <- order(bmu[wj])
col <- rep('black',length(x))
col[ B[wj[ord],3] > 0 ] <- 'brown'
col[ B[wj[ord],4] < 0 ] <- 'darkblue'

par(mfrow=c(1,4), mar=c(1,1,1,.1), omi=c(1,1,.4,1), bty='n')

for(j in 1:length(ss)){
  
  wj <- endsWith(rownames(B),ss[j])
  bmu <- B[wj,1][ord]
  bci <- B[wj,3:4][ord,]

  ylim <- range(bci)
  if(j == 1)ylim <- c(-10, 10)

  plot(bmu, x, xlim=ylim, yaxt = 'n', xlab='', col=col)
  abline(v=0, lty=2)
  segments(bci[,1],x, bci[,2],x, col=col) 
  title(ss[j])
  if(j == 1){
    pos <- rep(2, nrow(b))
    pos[1:(nrow(b)/2)] <- 4
    text(bci[,2], 1:nrow(b),snames[wj][ord], srt=0, pos=pos, col=col)
  }
}
mtext('Parameter value', side=1, outer=T, line=2)
*Coefficients ordered by SST  response, including the interaction with year `yr`.*

Coefficients ordered by SST response, including the interaction with year yr.

*Coefficients ordered by time trend (variable `yr`).*

Coefficients ordered by time trend (variable yr).

The interpretation of these graphs is as follows:

For example, Sandbar is increasing over time and it tends to be caught in cold waters. The positive interaction suggests that its rate of catch increase is greatest in warm waters. Equivalently, the negative effect of temperature is reduced over time.

Conversely, hammerhead are in decline and they tend to be caught in cold waters. The negative interaction suggests that the decline is fastest in cold waters. Alternatively, the effect of SST has diminished over time.

For next time

On the basis of the last discussion we will think about variables in xdata that might provide insight on changes in catch. Here are some variables:

Bathy - water depth

SM - distance to sea mount

CS - distance to coastline

For next time, think of an interaction to put in the model. Place in dropbox a few sentences with your hypothesis of what will be sign of the main effects, the interaction, and why.