Preamble


The following text is a step-by-step calculation and visualization of the process error term in sam.

In the old fashioned cohort models the stock equation is expressed as:

\(N_{a+1,y+1} = N_{a,y} e^{-(F_{a,y} + M_{a,y})}\)

In the sam program there is a process error in the stock equation:

\(N_{a+1,y+1} = N_{a,y} e^{-(F_{a,y} + M_{a,y})} e^{(d_{a,y})}\)

where

\(log(d_{a,y}) ~ N(0,s^2)\)

Align the stock in numbers


Read data and align the stock in numbers in the end of the year (beginning of next year) with the stock in the beginning of the year:

require(knitr)
require(reshape2)
require(plyr)
require(ggplot2)
require(xtable)
#require(fishvice)
#mac <- sam_get_results(assessment = "NEA-Mac-WGWIDE-2014")
#save(mac,file="mac.RData")
attach("mac.RData")
rbya <- mac$rbya
x <- rbya[,c("year","age","n")]
x$year <- x$year - 1
x$age <- x$age - 1
names(x)[3] <- "n.end"
d <- join(rbya[,c("year","age","n","m","f","cW")],x, by=c("year","age"))

Lets for now just check for cases where the stock in numbers in the end of the year are greater than in the beginning of the year:

i <- d$n.end > d$n & d$age < 11 & d$year < 2014
kable(d[i,])
year age n m f cW n.end
22 2001 0 4990267 0.15 0.0066 0.069 5797863
26 2005 0 5980412 0.15 0.0027 0.048 6990062
30 2009 0 4704359 0.15 0.0042 0.104 4765915
65 2009 1 3988796 0.15 0.0111 0.153 4032915
86 1995 2 2020792 0.15 0.0554 0.234 2049282

OK not many cases, but one starts to scratch ones head a bit :-)

Calculate and visualize the process error


One way to visualize the magnitude of the process errror is to calculate the \(Z_{a,y}\) from the tabulated sam stock in numbers and then compare that with the \(Z_{a,y} = F_{a,y} + M_{a,y}\), where in the latter case \(F_{a,y}\) is the tabulated sam fishing mortality estimates.

The \(Z\) using the stock-in-numbers is calculated as:

\(_{N}Z_{a,y} = log(N_{a,y}/N_{a+1,y+1})\)

The \(Z\) derived from the fishing mortality is:

\(_{F}Z_{a,y} = F_{a,y} + M_{a,y}\)

For comparison then one could do:

\(_{z}D_{a,y} = _{N}Z_{a,y} - _{F}Z_{a,y}\)

One can think of the \(_{z}D_{a,y}\) term as an additional natural mortality term (which can be either positive or negative).

d$z.n <- log(d$n/d$n.end)
d$z.f <- d$f + d$m
d$z.d  <- d$z.n - d$z.f
d <- d[d$age < 11 & d$year < 2014,]
kable(head(d[,c("year","age","m","f","z.f","z.n","z.d")], n=10))
year age m f z.f z.n z.d
1980 0 0.15 0.0101 0.1601 0.120 -0.0401
1981 0 0.15 0.0102 0.1602 0.076 -0.0842
1982 0 0.15 0.0102 0.1602 0.261 0.1008
1983 0 0.15 0.0103 0.1603 0.289 0.1287
1984 0 0.15 0.0105 0.1605 0.028 -0.1325
1985 0 0.15 0.0105 0.1605 0.188 0.0275
1986 0 0.15 0.0105 0.1605 0.191 0.0305
1987 0 0.15 0.0104 0.1604 0.073 -0.0874
1988 0 0.15 0.0105 0.1605 0.207 0.0465
1989 0 0.15 0.0106 0.1606 0.135 -0.0256

When the \(_{z}D_{a,y}\) is negative it conceptually means that the total mortality “in” sam is less than what one gets from the conventional calculation: \(Z = F + M\). And vica versa.

In mathematical terms one can now express the stock equation as:

\(N_{a+1,y+1} = N_{a,y} e^{-(F_{a,y} + M_{a,y} + D_{a,y})}\)

Overall distribution of the “difference”:

ggplot(d,aes(z.d)) + 
  theme_bw() +
  geom_bar() +
  labs(x="Process error expressed as deviations in mortality")

plot of chunk unnamed-chunk-4

Pattern through time:

ggplot(d,aes(year,z.d)) + 
  theme_bw() +
  geom_text(aes(label=age)) +
  stat_smooth(span=0.1) +
  labs(x="",y="Process error expressed as deviations in mortality")

plot of chunk unnamed-chunk-5

OK, there is some structure in time - always an issue. One way to interpred the graph, focusing only on the period after 2000 is that sam estimates an additional mortality in the first part of the period, but after and including the year 2007 the “additional” mortality is less than what one gets from the conventional sum of \(F\) and \(M\).

Vizualizing the process error as number of fish


Another way to come to grisp with the magnitude of the process error term is to look at it in terms of numbers or biomass.

# Calculate the stock in numbers using the estimated F assuming no process error
d$n.end2 <- d$n * exp(-(d$f + d$m))
# Calculate the difference
d$n.d <- d$n.end - d$n.end2 

ggplot(d,aes(year,n.d/1e3)) +
  theme_bw() +
  geom_bar(stat="identity") + 
  facet_wrap(~ age, scale="free_y") +
  labs(x="", y="Number of fish [thousands]", title="Process error expressed as deviations in number of fish")

plot of chunk unnamed-chunk-6

Here, positive numbers mean that fish is added over the single year through the process error. Negative numbers equivalently mean that fish is lost through other processes than \(M_{a,y}\) and the sam estimated \(F_{ay}\),

Vizualizing the process error as annual biomass


Here one simply converts the process error expressed as deviations in the number of fish to biomass by multiplying with catch weights. And then sums the stuff over each year.

x <- ddply(d,c("year"),summarise,b=sum(n.d * cW,na.rm=TRUE)/1e3)
ggplot(x[x$year <= 2013,],aes(year,b)) + 
  theme_bw() +
  geom_bar(stat="identity") +
  labs(x="",y="Mass [thousand tonnes]",title="Process error expressed as deviation in mass")

plot of chunk unnamed-chunk-7