sam process errorThe following text is a step-by-step calculation and visualization of the process error term in sam.
In the old fashioned cohort models the stock equation is expressed as:
\(N_{a+1,y+1} = N_{a,y} e^{-(F_{a,y} + M_{a,y})}\)
In the sam program there is a process error in the stock equation:
\(N_{a+1,y+1} = N_{a,y} e^{-(F_{a,y} + M_{a,y})} e^{(d_{a,y})}\)
where
\(log(d_{a,y}) ~ N(0,s^2)\)
Read data and align the stock in numbers in the end of the year (beginning of next year) with the stock in the beginning of the year:
require(knitr)
require(reshape2)
require(plyr)
require(ggplot2)
require(xtable)
#require(fishvice)
#mac <- sam_get_results(assessment = "NEA-Mac-WGWIDE-2014")
#save(mac,file="mac.RData")
attach("mac.RData")
rbya <- mac$rbya
x <- rbya[,c("year","age","n")]
x$year <- x$year - 1
x$age <- x$age - 1
names(x)[3] <- "n.end"
d <- join(rbya[,c("year","age","n","m","f","cW")],x, by=c("year","age"))
Lets for now just check for cases where the stock in numbers in the end of the year are greater than in the beginning of the year:
i <- d$n.end > d$n & d$age < 11 & d$year < 2014
kable(d[i,])
| year | age | n | m | f | cW | n.end | |
|---|---|---|---|---|---|---|---|
| 22 | 2001 | 0 | 4990267 | 0.15 | 0.0066 | 0.069 | 5797863 |
| 26 | 2005 | 0 | 5980412 | 0.15 | 0.0027 | 0.048 | 6990062 |
| 30 | 2009 | 0 | 4704359 | 0.15 | 0.0042 | 0.104 | 4765915 |
| 65 | 2009 | 1 | 3988796 | 0.15 | 0.0111 | 0.153 | 4032915 |
| 86 | 1995 | 2 | 2020792 | 0.15 | 0.0554 | 0.234 | 2049282 |
OK not many cases, but one starts to scratch ones head a bit :-)
One way to visualize the magnitude of the process errror is to calculate the \(Z_{a,y}\) from the tabulated sam stock in numbers and then compare that with the \(Z_{a,y} = F_{a,y} + M_{a,y}\), where in the latter case \(F_{a,y}\) is the tabulated sam fishing mortality estimates.
The \(Z\) using the stock-in-numbers is calculated as:
\(_{N}Z_{a,y} = log(N_{a,y}/N_{a+1,y+1})\)
The \(Z\) derived from the fishing mortality is:
\(_{F}Z_{a,y} = F_{a,y} + M_{a,y}\)
For comparison then one could do:
\(_{z}D_{a,y} = _{N}Z_{a,y} - _{F}Z_{a,y}\)
One can think of the \(_{z}D_{a,y}\) term as an additional natural mortality term (which can be either positive or negative).
d$z.n <- log(d$n/d$n.end)
d$z.f <- d$f + d$m
d$z.d <- d$z.n - d$z.f
d <- d[d$age < 11 & d$year < 2014,]
kable(head(d[,c("year","age","m","f","z.f","z.n","z.d")], n=10))
| year | age | m | f | z.f | z.n | z.d |
|---|---|---|---|---|---|---|
| 1980 | 0 | 0.15 | 0.0101 | 0.1601 | 0.120 | -0.0401 |
| 1981 | 0 | 0.15 | 0.0102 | 0.1602 | 0.076 | -0.0842 |
| 1982 | 0 | 0.15 | 0.0102 | 0.1602 | 0.261 | 0.1008 |
| 1983 | 0 | 0.15 | 0.0103 | 0.1603 | 0.289 | 0.1287 |
| 1984 | 0 | 0.15 | 0.0105 | 0.1605 | 0.028 | -0.1325 |
| 1985 | 0 | 0.15 | 0.0105 | 0.1605 | 0.188 | 0.0275 |
| 1986 | 0 | 0.15 | 0.0105 | 0.1605 | 0.191 | 0.0305 |
| 1987 | 0 | 0.15 | 0.0104 | 0.1604 | 0.073 | -0.0874 |
| 1988 | 0 | 0.15 | 0.0105 | 0.1605 | 0.207 | 0.0465 |
| 1989 | 0 | 0.15 | 0.0106 | 0.1606 | 0.135 | -0.0256 |
When the \(_{z}D_{a,y}\) is negative it conceptually means that the total mortality “in” sam is less than what one gets from the conventional calculation: \(Z = F + M\). And vica versa.
In mathematical terms one can now express the stock equation as:
\(N_{a+1,y+1} = N_{a,y} e^{-(F_{a,y} + M_{a,y} + D_{a,y})}\)
Overall distribution of the “difference”:
ggplot(d,aes(z.d)) +
theme_bw() +
geom_bar() +
labs(x="Process error expressed as deviations in mortality")
Pattern through time:
ggplot(d,aes(year,z.d)) +
theme_bw() +
geom_text(aes(label=age)) +
stat_smooth(span=0.1) +
labs(x="",y="Process error expressed as deviations in mortality")
OK, there is some structure in time - always an issue. One way to interpred the graph, focusing only on the period after 2000 is that sam estimates an additional mortality in the first part of the period, but after and including the year 2007 the “additional” mortality is less than what one gets from the conventional sum of \(F\) and \(M\).
Another way to come to grisp with the magnitude of the process error term is to look at it in terms of numbers or biomass.
# Calculate the stock in numbers using the estimated F assuming no process error
d$n.end2 <- d$n * exp(-(d$f + d$m))
# Calculate the difference
d$n.d <- d$n.end - d$n.end2
ggplot(d,aes(year,n.d/1e3)) +
theme_bw() +
geom_bar(stat="identity") +
facet_wrap(~ age, scale="free_y") +
labs(x="", y="Number of fish [thousands]", title="Process error expressed as deviations in number of fish")
Here, positive numbers mean that fish is added over the single year through the process error. Negative numbers equivalently mean that fish is lost through other processes than \(M_{a,y}\) and the sam estimated \(F_{ay}\),
Here one simply converts the process error expressed as deviations in the number of fish to biomass by multiplying with catch weights. And then sums the stuff over each year.
x <- ddply(d,c("year"),summarise,b=sum(n.d * cW,na.rm=TRUE)/1e3)
ggplot(x[x$year <= 2013,],aes(year,b)) +
theme_bw() +
geom_bar(stat="identity") +
labs(x="",y="Mass [thousand tonnes]",title="Process error expressed as deviation in mass")