Resources from last discussion
In the media
China’s Appetite Pushes Fisheries to the Brink (NY Times), the deplete of tuna.
Taking the Oceans’ Temperature, Scientists Find Unexpected Heat (NY Times), it’s not just fishing.
Science resources
Tuna and swordfish catch in the U.S. northwest Atlantic longline fishery in relation to mesoscale eddies, the long-line fishing data set
Data Descriptor: A database of global marine commercial, small-scale, illegal and unreported fisheries catch 1950–2014, trends in catch
Catch reconstructions reveal that global marine fisheries catches are higher than reported and declining, declining catch.
ICCAT Must Commit to Rebuilding Stocks and Enacting Fisheries Reforms (PEW trust), A call for action on fisheries reform.
Objectives
- Learn ways to explore/visualize data and relate it back to decisions on fisheries management
For class today
Each group member will prepare a brief summary of their two science papers that states at least the following, the question being addressed, the main result of the study, and your critique. You will have about 2 paragraphs per paper.
Place an outline of the additional info that you will cover in the brief presentataion on dropbox. This will consist of several bullets with links to references.
The discussion that follows will touch on at least the following questions:
Describe where are the threats to globally fisheries and why they are occurring
How good are the data? Are there big holes in our understanding?
How might the interaction between fishing and other variables, such as temperature change, affect the results?
What are some of the ways to mitigate the problem of overfishing?
For next time
We will look for interactions in the data involving season using the code below. So you don’t need to jump between files, the first part of this shows the code from last time. Following that is new code for interactions.
The longline data
We have available to us the long-line data used in the Tuna and swordfish catch paper that was included in the discussion. Here is software we will need:
library(corrplot)
library(dendextend)
library(gjam)
library(colorspace)
library('RColorBrewer')
source('clarkFunctions2021.r')Read in the file here:
## [1] 259350 152
This is a large data set and may take a minute to ingest, with 259350 rows. I begin by partitioning this data.frame into two parts, one for predictors and another for responses:
The data.frame xdata includes variables related to fishing effort (`HOOKS_SET,
The responses in data.frame ydata include the catch for species.
The spatial extent of the data is shown here.
lonLat <- xdata[,c('Longitude','Latitude')]
maps::map(xlim=c(-100,-10), ylim=c(0,55), fill=T, col='wheat')
points(lonLat[,1],lonLat[,2], col = .getColor('brown',.03), cex=.2)Fishing effort for longline data.
You may notice that I used a semi-transparent symbol color so that areas with intense activity show up as darker on the map, rather than simply masking one another.
The data uses the number -9999 to indicate no data. I replace these entries with NA. I then construct a correlation, which helps me to see the relationships between variables:
## HOOKS_SET HOOKS_BETW LIGHT_STIC MAINLINE_S GANGION_LE FLOATLINE_
## HOOKS_SET 1.00 0.07 -0.24 0.28 -0.18 -0.15
## HOOKS_BETW 0.07 1.00 -0.08 0.00 0.04 0.06
## LIGHT_STIC -0.24 -0.08 1.00 0.01 0.03 -0.03
## MAINLINE_S 0.28 0.00 0.01 1.00 0.06 0.00
## GANGION_LE -0.18 0.04 0.03 0.06 1.00 0.45
## FLOATLINE_ -0.15 0.06 -0.03 0.00 0.45 1.00
## BAIT -0.11 -0.04 0.27 -0.07 -0.14 -0.17
## SST -0.18 0.00 -0.09 -0.01 0.11 0.08
## TKE -0.18 0.00 -0.02 -0.15 -0.11 -0.07
## Bathy -0.22 0.04 0.02 -0.21 -0.04 -0.03
## AMO 0.27 0.00 -0.02 0.03 -0.25 -0.32
## NAO -0.19 -0.02 0.04 -0.01 0.16 0.18
## WNAO -0.16 -0.01 0.05 -0.01 0.10 0.11
## CHLA_AQUA_8DAY 0.03 0.19 -0.06 -0.15 -0.13 -0.05
## BAIT SST TKE Bathy AMO NAO WNAO CHLA_AQUA_8DAY
## HOOKS_SET -0.11 -0.18 -0.18 -0.22 0.27 -0.19 -0.16 0.03
## HOOKS_BETW -0.04 0.00 0.00 0.04 0.00 -0.02 -0.01 0.19
## LIGHT_STIC 0.27 -0.09 -0.02 0.02 -0.02 0.04 0.05 -0.06
## MAINLINE_S -0.07 -0.01 -0.15 -0.21 0.03 -0.01 -0.01 -0.15
## GANGION_LE -0.14 0.11 -0.11 -0.04 -0.25 0.16 0.10 -0.13
## FLOATLINE_ -0.17 0.08 -0.07 -0.03 -0.32 0.18 0.11 -0.05
## BAIT 1.00 -0.10 0.05 0.04 0.07 -0.08 -0.04 0.00
## SST -0.10 1.00 0.13 0.03 0.02 -0.03 -0.03 -0.26
## TKE 0.05 0.13 1.00 0.07 -0.02 -0.02 -0.01 -0.01
## Bathy 0.04 0.03 0.07 1.00 0.03 0.00 0.00 0.19
## AMO 0.07 0.02 -0.02 0.03 1.00 -0.42 -0.31 0.03
## NAO -0.08 -0.03 -0.02 0.00 -0.42 1.00 0.48 -0.01
## WNAO -0.04 -0.03 -0.01 0.00 -0.31 0.48 1.00 0.03
## CHLA_AQUA_8DAY 0.00 -0.26 -0.01 0.19 0.03 -0.01 0.03 1.00
Here is correlation structure in the responses:
cc <- cor(ydata)
diag(cc) <- 0
corrplot(cc, type='lower', diag=F, cl.lim = c(-.2, .75), is.corr = F)Correlations between fish are mostly low.
Clearly, there are not large correlations between species. Here is a cluster analysis using the hierarchical clustering function hclust and related functions in package dendextend.
cory <- cor(ydata)
dsigma <- as.dist( cov2Dist(cory) )
yclust <- hclust(dsigma, method='complete')
#save clusters
ncluster <- 6
clusters <- cutree(yclust,k=ncluster) #membership by no. of clusters
dend <- as.dendrogram(yclust)
dend <- color_branches(dend, k=3)
dend <- hang.dendrogram(dend,hang_height=0.1)
dend <- set(dend, "labels_cex", 0.5)
plot(dend, horiz = TRUE, nodePar = list(cex = .007))Dendrogram shows similarity between species.
Here are trends through time for swordfish:
y <- sqrt( ydata[,'Swordfish'] )
plot(jitter( xdata$Year ), y, cex=.2, xlab='Year',ylab='Catch')
spl <- spline(y ~ xdata$Year)## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values
Lack of trend in catch data.
Here is catch effort, gauged as number of hooks set:
plot(jitter( xdata$Year ), xdata$HOOKS_SET, cex=.2,
xlab='Year',ylab='Hooks set', log='y')
spl <- spline(xdata$HOOKS_SET ~ xdata$Year)## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values
Catch effort shows an increase over time.
Here is catch per effort (hooks):
y <- ydata[,'Swordfish']/xdata$HOOKS_SET
plot(jitter( xdata$Year ), y, cex=.2, log='y',
xlab='Year',ylab='Catch/Hook')## Warning in xy.coords(x, y, xlabel, ylabel, log): 61750 y values <= 0 omitted
## from logarithmic plot
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values
Trend in catch per hook.
Here is a map of changing catch location
colors <- c('blue','green','orange','brown') # color gradient
years <- sort(unique(xdata$Year))
nyr <- length(years)
ycol <- round(quantile(years, c(0, .33, .67, 1)))
cf <- colorRampPalette(colors)(nyr)
size <- y/max(y)
colr <- cf[match(xdata$Year, years)]
maps::map(xlim=c(-110,-30), ylim=c(7,55), fill=T, col='wheat')
points(lonLat[,1],lonLat[,2], cex = size^.5,
col = colr)
legend('topleft',legend=c(ycol), text.col=colors)Change in distribution of catch over time.
Season effects
I am concerned about the foregoing analysis because I know that temperature is increasing over time, just as is fishing effort. However, I also know that temperature is changing throughout the year. Perhaps fish are being affected by simply shifting where they live at different times of year, seeking and finding a preferred temperature year round. I think I might identify such an effect by including an interaction between SST and season.
Here are a few additional functions I will use to create a data set that is aggregated by location and season:
lonLatSeasonGrid <- function(variable, xdata, FUN=mean){
# variable - a continuous variable to be averaged in xdata
# xdata - must include columns for variable, lon, lat, season
tmp <- suppressWarnings( by(variable,
list(lon=xdata$lon, lat=xdata$lat,
season=xdata$season),
FUN, na.rm=T) )
gridframe <- numeric(0)
dm <- dim(tmp)
longrid <- as.numeric(dimnames(tmp)[[1]])
latgrid <- as.numeric(dimnames(tmp)[[2]])
ssngrid <- dimnames(tmp)[[3]]
longrid <- matrix(longrid,dm[1],dm[2])
latgrid <- matrix(latgrid,dm[1],dm[2], byrow=T)
for(j in 1:dim(tmp)[3]){
lonj <- as.vector(longrid)
latj <- as.vector(latgrid)
ssnj <- rep(ssngrid[j], length(lonj))
jdat <- data.frame(lon = lonj, lat = latj, season = ssnj,
x = as.vector(tmp[,,j]))
gridframe <- rbind( gridframe, jdat )
}
gridframe
}
.getColor <- function(col,trans){
# trans - transparency fraction [0, 1]
tmp <- col2rgb(col)
rgb(tmp[1,], tmp[2,], tmp[3,], maxColorValue = 255,
alpha = 255*trans, names = paste(col,trans,sep='_'))
}
.aggData <- function(cnames, data, gather, FUN){
#cnames - column names in data to operate on
#gather - list of indices, all of same length
cnames <- cnames[cnames %in% colnames(data)]
if(length(cnames) == 0)return( list(data = numeric(0)) )
FAC <- F
df <- data[,cnames]
if(FUN == 'factor'){
FAC <- T
tf <- fnames <- character(0)
df <- numeric(0)
for(j in 1:length(cnames)){
kf <- as.character(data[,cnames[j]])
tf <- cbind(tf, kf)
fnames <- append(fnames,list(sort(unique(kf))))
df <- cbind(df, match(kf,fnames[[j]]) )
}
colnames(df) <- cnames
FUN <- 'max'
}
tmp <- aggregate(df, by=gather, FUN=FUN, na.rm=T)
if(FAC){
for(j in 1:length(cnames)){
kf <- fnames[[j]][tmp[,cnames[j]]]
tmp[,cnames[j]] <- as.factor(kf)
}
}
tmp
}Read the data again. Here are the columns I want to save:
specNames <- c("Swordfish","Bigeye","Bluefin","Yellowfin","Albacore",
"Blackfin","Dolphin","Wahoo", "King","Escolar","Skipjack",
"Blue","Hammerhe_9","Thresher","Blacktip","Dusky",
"Scalloped","Smooth","Night","Whitetip","Silky",
"BEThresher","White","Tiger","Sandbar",
"MakoLong","MakoShort","WhiteMarli","BlueMarlin","Sailfish")
xnames <- c("FID","Year","Month","Day","Latitude","Longitude",
"HOOKS_SET","HOOKS_BETW","LIGHT_STIC","MAINLINE_S","GANGION_LE",
"FLOATLINE_","BAIT",
"SST","TKE","Bathy","AMO","NAO","WNAO","CHLA_AQUA_8DAY",
"CHLA_SEAWIFS_8DAY","CHLA_TERRA_8DAY","CS","SM")
xdata <- longline[,xnames]
ydata <- longline[,specNames]Here is some aggregation by season, longitude, and latitude:
# seasons
snames <- c('JFM','AMJ','JAS','OND')
season <- findInterval(xdata$Month, seq(1, 12, by=3) )
xdata$season <- as.factor(snames[season])
# lon/lat
cnames <- c("SST","TKE","Bathy","Month")
lon <- round(xdata$Longitude)
lat <- round(xdata$Latitude)
gather <- with(xdata, list(lon = lon, lat = lat, yr = Year, season = season))
xtmp <- .aggData(cnames, xdata, gather, FUN='mean')
# sum effort
enames <- c('HOOKS', 'HOOKS_BETW', 'MAINLINE_S')
etmp <- .aggData(enames, xdata, gather, FUN='sum')
# sum catch
ytmp <- .aggData(colnames(ydata), ydata, gather, FUN='sum')
ytmp <- ytmp[,!colnames(ytmp) %in% names(gather)]
ytmp <- as.matrix(ytmp)
keep <- suppressWarnings( apply(ytmp, 1, max, na.rm=T) )
keep <- which(is.finite(keep))
ydata <- ytmp[keep,]
xdata <- xtmp[keep,]
edata <- etmp[keep,]
rm(longline)Here’s a map of the aggregated lon/lat data:
years <- sort(unique(xdata$yr))
nyr <- length(years)
lonLat <- xdata[,c('lon','lat')]
mapLonLat <- apply(lonLat,2,range)
maps::map(xlim=mapLonLat[,1], ylim=mapLonLat[,2], fill=T, col='wheat')
points(lonLat[,1],lonLat[,2], col = .getColor("black",.1), pch=15, cex=.5)Catch effort as shaded blocks.
Here is the cluster analysis again using the hierarchical clustering function hclust and related functions in package dendextend.
cory <- cor(ydata)
yclust <- hclust(dist(t(ydata)), method='complete')
dend <- as.dendrogram(yclust)
dend <- color_branches(dend, k=3)
dend <- hang.dendrogram(dend,hang_height=0.1)
dend <- set(dend, "labels_cex", 0.5)
par(mfrow=c(1,2),mar = c(3,2,2,3))
plot(dend, horiz = TRUE, nodePar = list(cex = .007))A movie could go like this:
ncols <- 100
colorGrad <- colorRampPalette( RColorBrewer::brewer.pal(5, 'YlOrRd') )
colors <- colorGrad(ncols)
cscale <- seq(0,1,length.out=ncols)
yrMon <- cbind( as.character(xdata$yr), as.character(xdata$month) )
yrMon <- apply( yrMon, 1, paste0, collapse='_')
times <- sort(unique(yrMon))
tmp <- matrix( as.numeric( unlist( strsplit( times, '_' ) ) ), ncol=2, byrow=T )
ord <- order(tmp[,1], tmp[ ,2])
times <- times[ord]
spec <- 'Swordfish' # I could select a different species
most <- max(ydata[,spec], na.rm=T)
par(bty='n', mar=c(.1, .1, 1, .1))
for(j in 1:length(times)){
wj <- which( yrMon == times[j] & is.finite(ydata[,spec]) )
yj <- (ydata[wj,spec]/most)^.2
colj <- colors[ findInterval(yj,cscale) ]
z <- rep(1, length(wj))
symbols(lonLat[wj,1],lonLat[wj,2], squares=z, inches=F, fg = colj,
bg = colj, add=F, xlim=mapLonLat[,1], ylim=mapLonLat[,2])
maps::map(xlim=mapLonLat[,1], ylim=mapLonLat[,2], fill=T, col='wheat', add=T)
title(times[j])
readline(prompt="Press [enter] to continue")
}Here is a model to estimate parameters
library(gjam)
sqrtEffort <- sqrt(edata$HOOKS_BETW)
ydata <- cbind(sqrtEffort,ydata)
S <- ncol(ydata)
typeNames <- rep('DA', S)
typeNames[1] <- 'CA'
effort <- list(columns = 2:S, values = sqrtEffort + 1)
#xdata$yr <- as.factor(xdata$yr)
ml <- list(ng = 600, burnin = 200, typeNames = typeNames, effort = effort)
out <- gjam(~ SST*yr + season, xdata, ydata, modelList = ml)
plotPars <- list(GRIDPLOTS = T, SIGONLY=F)
gjamPlot(out, plotPars)Here are the interactions:
load( 'gjamLongLine.rdata', verbose = T )
b <- summary(out)$Coefficients
B <- out$parameters$betaStandXTable
bmu <- B[,1]
bci <- B[,3:4]
snames <- matrix(unlist(strsplit(rownames(B),'_')),ncol=2, byrow=T)[,1]
# first time use this #####################
ss <- c('_SST','_yr','_SST:yr')
wj <- which(endsWith(rownames(B),'_SST'))
# second time use this ####################
ss <- c('_yr','_seasonJAS','_seasonJFM','_seasonOND')
wj <- which(endsWith(rownames(B),'_yr'))
# both times do this ####################
x <- 1:length(wj)
ord <- order(bmu[wj])
col <- rep('black',length(x))
col[ B[wj[ord],3] > 0 ] <- 'brown'
col[ B[wj[ord],4] < 0 ] <- 'darkblue'
par(mfrow=c(1,4), mar=c(1,1,1,.1), omi=c(1,1,.4,1), bty='n')
for(j in 1:length(ss)){
wj <- endsWith(rownames(B),ss[j])
bmu <- B[wj,1][ord]
bci <- B[wj,3:4][ord,]
ylim <- range(bci)
if(j == 1)ylim <- c(-10, 10)
plot(bmu, x, xlim=ylim, yaxt = 'n', xlab='', col=col)
abline(v=0, lty=2)
segments(bci[,1],x, bci[,2],x, col=col)
title(ss[j])
if(j == 1){
pos <- rep(2, nrow(b))
pos[1:(nrow(b)/2)] <- 4
text(bci[,2], 1:nrow(b),snames[wj][ord], srt=0, pos=pos, col=col)
}
}
mtext('Parameter value', side=1, outer=T, line=2)
Coefficients ordered by SST response, including the interaction with year yr.
Coefficients ordered by time trend (variable yr).
The interpretation of these graphs is as follows:
Positive yr: Species that increase through time
Positive SST: Species caught in warm waters
Positive SST:yr: Two (equally valid) interpretations, i) Effect of SST increases through time (yr) and ii) Rate of increase in catch increases with SST.
For example, Sandbar is increasing over time and it tends to be caught in cold waters. The positive interaction suggests that its rate of catch increase is greatest in warm waters. Equivalently, the negative effect of temperature is reduced over time.
Conversely, hammerhead are in decline and they tend to be caught in cold waters. The negative interaction suggests that the decline is fastest in cold waters. Alternatively, the effect of SST has diminished over time.
For next time
On the basis of the last discussion we will think about variables in xdata that might provide insight on changes in catch. Here are some variables:
Bathy - water depth
SM - distance to sea mount
CS - distance to coastline
For next time, think of an interaction to put in the model. Place in dropbox a few sentences with your hypothesis of what will be sign of the main effects, the interaction, and why.