This is a working document describing how we might build on the age structured model that was proposed by Schwacke et al. (2017):
Schwacke et al. 2017 Quantifying injury to common bottlenose dolphins from the Deepwater Horizon oil spill using an age-, sex- and class-structured population model. Endangered Species Research, 33: 265-279.
for Bay Sound and Estuary bottlenose dolphins (BSE BND) stocks, and extend it for species (and stocks) that were originally modeled by a stage structured model. Those correspond to all the other species and stocks in the GOM besides BSE BND. The idea was to try to use the Siler model results for BSE BND and “expand” them to other species.
While the document was originally intended as a where-we-are-today document on the 30th March 2020, aiming to set the stage to decide where to go next, it since then evolved into its own working document. It now fully describes the thought process and different considerations about scaling the BSE BND survival information used in the age structured model of Schwacke et al. (2017) to the different stocks.
This subsection can be used for easier navigation across the document. Each section is hyperlinked, and in each section there are hyperlinks to return here.
The bulk of this document starts with prior references and thoughts about how one might scale survival across different species that might have different longevities.
This is followed by a quick exploration of the Siler model related material sent to TAM by LS via email on the 18th March 2020, including the model considered, the R functions used and the posterior distribution of the model paramenters considered in Schwacke et al. (2017).
This is followed by some comments about how to extend a function in general, building up to what we will have to do next.
We then build on it to investigate the Siler model and extending dolphins to other species and how it might be used to estimate survival probabilities per age by extending the BND model to the stocks originally addressed by the stage structured model. As a reality check we try to do this for the population of Elephants reported in Lahdenpera et al. (2018). We separate that in two subsections. First we look at doing it based on mean age at death (MAD), then based on age at first reproduction (AFR).
This is followed by a side step on the mean age of an animal sampled at random, a quantity that would be involved in the recovery of the survival and fecundity after exposure to oil. This is addressed in a separate document (How to land it).
We conclude with some questions that need solving before all this is put to rest. This is essentially a section with hyperlinks to points in the text that require further attention. These questions are represented as 4th order headings in the document, to facilitate navigation to them via automatic hyperlinks to headings. Hence, in terms of reading the document, these 4th order headings should be viewd as footnotes and can be ignored unless you are interested in the specific details. Note the number of the question does not refer to the order it appears in text, but to the chronological order by which it was raised. A missing number (e.g. question 3 is missing) therefore corresponds to a question that were raised and answered and the answer was not relevant to be kept here for posterity. Continuing the example, question 3 was motivated by an inconsistency that was induced by a bug in the code. The bug was found and fixed. The inconsistency was solved, so the question is not relevant anymore and hence it was deleted. Questions that might be interesting to keep a track of for posterity, because the corresponding answer is in itself useful, are kept in this document.
The idea of using information about survival from one species to simulate/estimate/model/draw insights about survival in other species is not new. In the following we list some references that have looked into the problem before.
The idea is that we have age indexed survivals from a Siler model, but we want to extend or shrink these to match a species with a diferent longevity. Thinking about a “simple” scaling, we could expand or contract a function such that the resulting survivals match a given target statistic. Different suggestions have been put forward for that target statistic, but each of these have advantages and disadvantages. These target statistics include (and this was an evolving discussion within the CARMMHA team):
Screenshot 1 of Krementz et al 1989.
Screenshot 1 of Eakin et al. 1994.
Screenshot 1 of Caswell et al. 1998.
A few other references on scaling survival functions and patterns in survival across different species, not so relevant and borderline useless, but here for future reference if needed:
Trends in scale and shape of survival curves https://www.nature.com/articles/srep00504
Survivorship Curves and Their Impact on the Estimation of Maximum Population Growth Rates https://www.jstor.org/stable/25592598?seq=1
Ageing and typical survivorship curves result from optimal resource allocation http://www.cyfronet.krakow.pl/~uxcichon/pub/eer.pdf
Unfortunately, at least for TAM, only late in the game he realized that Barlow & Boveng (1991)
Barlow & Boveng 1991 Modeling age-specific mortality for marine mammal populations. Marine Mammal Science 7: 50-65
was a very relevant source to this problem. Wording might be at times slightly inconsistent and implying we were not aware of this work originally. That paper is really good. Being aware of it would have helped solving many questions listed below, and certainly faster, inclusively as an example the reparametrization issue!
Here we provide some relevant screenshots from that paper, to keep in the back of our minds and inform the discussion.
Screenshot 1 of Barlow and Boveng 1991.
So essentially, this is what we did to get the Siler model to begin with (Ask Len - right? Were you just following Jay’s footsteps?)
Here’s the basic notation (see how this would have solved the reparametrization that I had to go at by trial and error?)
Screenshot between 1 and 2 of Barlow and Boveng 1991.
and the notion that functions can be re-scaled:
Screenshot 2 of Barlow and Boveng 1991.
A detail about how females are the focus here
Screenshot 3 of Barlow and Boveng 1991.
The species chosen as a model: the northern fur seal.
Screenshot 4 of Barlow and Boveng 1991.
The lack of information for cetaceans in general. At the date, but still overall true.
Screenshot 5 of Barlow and Boveng 1991.
The necessary caveats. Have I said I find Jay an amazing scientist before? Not screenshoted here, but the way he criticizes his own work should be an example to us all.
Screenshot 6 of Barlow and Boveng 1991.
Insights about issues of how hard it might be about finding a representative general model for all cetaceans and the idea of scaling such a model.
Screenshot 7 of Barlow and Boveng 1991.
A key comment about the why they think mean age is not a good idea. Something we need to get to grasps with
I actually have a feeling that scaling with respect to a mean value makes more sense than do it with respect to an extreme quantile: extremes are harder to estimate. We can estimate them, but they have necessarily much more variance associated with that say mean, or a non-extreme quartile, as they are much more dependent on any given sample not providing an outlier, say. I wonder what Jay would say about this today?
End of Question 9
Screenshot 8 of Barlow and Boveng 1991.
Making some suggestions for when the life table data is not available, i.e., for people like us…
Screenshot 9 of Barlow and Boveng 1991.
option 1 - combine info from a life table and a model
Screenshot 10 of Barlow and Boveng 1991.
option 2 - fix some of the parameters of the Siler model and estimate others
Screenshot 11 of Barlow and Boveng 1991.
option 3 - consider empirical Bayes with priors based on data
Screenshot 12 of Barlow and Boveng 1991.
Yes, here we are… still :)
Screenshot 13 of Barlow and Boveng 1991.
This material corresponds to two emails:
Hence, we have the values (in fact, we have samples from the posterior distribution for each of the 5 model parameters, and the growth rate - for the estimated parameters for the Siler model) and the code that allows to evaluate all the relevant functions given parameter values.
Here we explore the different datasets that Lori sent TAM
Note: these will not be really used here, so we are just checking the information is consistent with what we expected
Reading the female and male life table
FLT <- read.table("Female Life Table.txt", quote="\"", comment.char="")
MLT <- read.table("Male Life Table.txt", quote="\"", comment.char="")
This is weird. The data corresponds to a single column with 976061 lines, and there are 16001 (females) and 16001 (males) values of 0 in these. A detective look tells me the 0’s appear every 61 (female) and 61 (males) values in both files, and that can’t be a coincidence… these must correspond to the age classes, and the age 60 probability of survival of 0. After discussion with LS we can ignore everything after the first 61 values. But then again, these will not be used here.
Check this looks about right by plotting the first 61 values for each file
par(mfrow=c(1,1))
plot(0:60,MLT$V1[1:61],type="l",lty=1,ylab="Survival probability",xlab="Age",ylim=c(0,1))
lines(0:60,FLT$V1[1:61],lty=2)
legend("bottomleft",legend=c("Males","Females"),lty=1:2, inset=0.05)
Now look at the parameter values. Read the data
pf <- read.csv("var_female.csv")
pm <- read.csv("var_male.csv")
#remove useless ID first column
pf<-pf[,-1]
pm<-pm[,-1]
names(pf) <- names(pm) <- c("a1","a2","a3","b1","b3","rmean")
and check there are consistent.
First for females
Then for males
Just to get a feeling for the relevant functions, use the code provided by LS to plot these. We consider just the first value from the posterior distribution as an illustrative example.
The required functions used by LS were sent to TAM by LS on the 18th March 2020.
This is the proportion surviving to a given age
#funtions sent via email by LS to TAM on the 18th March 2020
#source("usefulfunctions.R")
# Proportion surviving to age x
lx <- function(x,a1,a2,a3,b1,b3) {
exp(-a2*x) * exp((-a1)*(1-exp(-b1*x))) * exp(a3*(1-exp(b3*x)))
}
ages <- 0:60
plot(ages,lx(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age")
lines(ages,lx(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.05)
Just for illustration, we also visualize here the variability around these point estimates. Nnote that above, and in most places below, the function shown is not a point estimate, just the first value from the posterior, taken as illustrative of a single example.
ages <- 0:70
ncols <- nrow(pf)
nrows <- length(ages)
#get a dataframe to use ggplot2
#objects to hold female and male realizations
lxFs <- matrix(NA,nrow=nrows,ncol=ncols)
lxMs <- matrix(NA,nrow=nrows,ncol=ncols)
#for each obseration of the posterior
for(i in 1:nrow(pf)){
#get the funtion
lxFs[,i] <- lx(ages,pf[i,1],pf[i,2],pf[i,3],pf[i,4],pf[i,5])
lxMs[,i] <- lx(ages,pm[i,1],pm[i,2],pm[i,3],pm[i,4],pm[i,5])
}
#arrange as data frame
lxFs2 <- data.frame(ages=ages,lxFs)
lxMs2 <- data.frame(ages=ages,lxMs)
# add means and relevant quantiles
lxFs2$mean=rowMeans(lxFs)
lxMs2$mean=rowMeans(lxMs)
lxFs2$q025=apply(X=lxFs,MARGIN = 1,FUN=quantile,probs=0.025)
lxMs2$q025=apply(X=lxMs,MARGIN = 1,FUN=quantile,probs=0.025)
lxFs2$q975=apply(X=lxFs,MARGIN = 1,FUN=quantile,probs=0.975)
lxMs2$q975=apply(X=lxMs,MARGIN = 1,FUN=quantile,probs=0.975)
#add the sex
lxFs2$sex="Female"
lxMs2$sex="Male"
#make single object
lxMF <- rbind(lxFs2,lxMs2)
#plot
#https://www.r-graph-gallery.com/104-plot-lines-with-error-envelopes-ggplot2.html
ggplot(data=lxMF, aes(x=ages, y=mean, ymin=q025, ymax=q975, fill=sex, linetype=sex)) + geom_line() +
geom_ribbon(alpha=0.5) +
xlab("Age") +
ylab("Proportion surviving")
A very useful function is the related age specific survival function (px
=lx
(x+1)/lx
(x))
# Age specific mortality rate
px <- function(x,a1,a2,a3,b1,b3) {
#note this function is updated to have a stretching value later!
lx(x+1,a1,a2,a3,b1,b3)/lx(x,a1,a2,a3,b1,b3)
}
ages <- 0:60
plot(ages,px(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Age specific survival rate",ylab="Survival",xlab="Age",ylim=c(0,1))
lines(ages,px(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.05)
(note this is just another realization of the first plot above!)
Some additional related functions but which we will not use as much are those describing the age specific mortality function (qx
=1-px
)
ages <- 0:60
plot(ages,qx(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Age specific mortality rate",ylab="Mortality",xlab="Age",ylim=c(0,1))
lines(ages,qx(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.05)
We can also evaluate he proportion of animals dying in age class x, dx
# Number dying in age class x
dx <- function(x,a1,a2,a3,b1,b3,r) {
exp(-r*x)*lx(x,a1,a2,a3,b1,b3)*qx(x,a1,a2,a3,b1,b3)
}
ages <- 0:60
plot(ages,dx(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],pf[1,6]),type="l",lty=1,main="Number dying in age class",ylab="Number dying",xlab="Age",ylim=c(0,1))
lines(ages,dx(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],pm[1,6]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.05)
We refer loosely to the notion of extending a function to the otherwise loose concept of “going from one function to the other” by changing some feature(s) on the original function to reach a target statistic (or set of statistics) via the “extended” function. This stretching might correspond to stretching, shrinking, increasing, decreasing, time warping, etc, and sometimes these different expressions might be used depending on context.
Specifically, we are here referring to “extending” the Siler model estimated on BND BSE to be a reasonable approximation to a description of survival probabilities for other species.
As LT correctly pointed out, extending the function lx
is straightforward, but the function px
is a bit harder, because that is a quantity derived from the lx
applied in two subsequent years.
In other words, px
is the survival in year \(x\), and that is obtained by the quotient of the proportion of animals that reach age \(x+1\) and those that reach age \(x\).
\[p(x)=\frac{l(x+1)}{l(x)}\]
So when one scales the function over time, one needs also to correspondingly stretch what the comparison between a year and the next is.
The best way to cope for this is perhaps to re-code the functions px
, such that it has explicitly a scaling, that under no stretching is 1 (and is 1 by default).
# Age specific mortality rate
px <- function(x,a1,a2,a3,b1,b3,scaling=1) {
lx(x+scaling,a1,a2,a3,b1,b3)/lx(x,a1,a2,a3,b1,b3)
}
We want to extend this function, that was estimated for a given species, to another species. Assume (these are quick and dirty approximations, but ballpark correct) that the mean age of a dolphin at death is 14 years. For a sperm whale that value increases to about 25 years (24.85 years: see separate R code document “FromStage2AgeModel.R” on how we obtained this number. This is hosted in the folder with the code for the sperm whale analysis included in the GOMOSES 2020 poster.). Then we could conceivable visualize the effect of extending the functions by a factor of 25/14. Lets see what that corresponds to for both the probability of being alive at a given age (lx
) and the survival at year x (px
).
First, we do it by changing the time explicitly:
par(mfrow=c(1,2))
#the scaling factor, as explained above (this is an aproximation based on a gross integers rounding, just for the purposes of this working document)
#original age value/target age value
sca1 <- 14/25
#proportion alive at age
plot(ages/sca1,lx(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",xlim=c(0,120))
lines(ages/sca1,lx(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=1:2, inset=0.02)
#survival px - note the use of the scaling argument in "px"
plot(ages/sca1,px(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=sca1),type="l",lty=1,main="Survival probability at age x",ylab="Survival",xlab="Age",xlim=c(0,120),ylim=c(0,1))
lines(ages/sca1,px(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=sca1),lty=2)
legend("bottomleft",legend=c("Males","Females"),lty=1:2, inset=0.02)
Or equivalently, not changing the data on the plot but the x argument in the functions themselves, which means that for expanding by a factor c
one needs to re-scale the function f(x)
to f(x*K)
. Note that If c
is greater than one the function will undergo horizontal shrinking, and if c
is less than one the function will undergo horizontal stretching. (see also https://courses.lumenlearning.com/boundless-algebra/chapter/transformations/ for extra details and thoughts on function stretching etc)
#define the new maximum age for the extended function
nma <- max(ages)/sca1
#the range of ages
ager <- 0:nma
par(mfrow=c(1,2))
# Proportion surviving to age x
plot(ager,lx(ager*sca1,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",xlim=c(0,120))
lines(ager,lx(ager*sca1,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=1:2, inset=0.02)
# Survival at age x
plot(ager,px(ager*sca1,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=sca1),type="l",lty=1,main="Survival at age x",ylab="Survival",xlab="Age",ylim=c(0,1),xlim=c(0,120))
lines(ager,px(ager*sca1,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=sca1),lty=2)
legend("bottomleft",legend=c("Males","Females"),lty=1:2, inset=0.02)
What we want is to find what is the correct value of c
that leads from a mean age at death of 14 years to 24 years.
Note that one can easily set up a large number of age classes and then use the same approach to evaluate what the corresponding functions would look like for different degrees of stretching. In this way one could easily “get ready” for dealing with different species with different mean ages at death. The grey lines in the plot below represent intermediate degrees of stretching, from no stretching (\(c=1\)) to a \(c\) of 0.5 (solid black lines)
stretching <- seq(0.51,0.99,by=0.01)
#survival
#females
par(mfrow=c(1,2))
plot(0:150,lx((0:150)*0.5,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Females surviving to age x",ylab="Proportion",xlab="Age")
lines(0:150,lx((0:150),pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]))
for(i in 1:length(stretching)){
lines(0:150,lx((0:150)*stretching[i],pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),col="grey")
}
#males
plot(0:150,lx((0:150)*0.5,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),type="l",lty=1,main="Males surviving to age x",ylab="Proportion",xlab="Age",ylim=c(0,1))
lines(0:150,lx((0:150),pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]))
for(i in 1:length(stretching)){
lines(0:150,lx((0:150)*stretching[i],pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),col="grey")
}
The corresponding survival probabilities are also shown below.
stretching <- seq(0.51,0.99,by=0.01)
#survival
#females
par(mfrow=c(1,2))
plot(0:150,px((0:150)*0.5,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=0.5),type="l",lty=1,main="Female survival at age x",ylab="Proportion",xlab="Age",ylim=c(0,1))
lines(0:150,px((0:150),pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]))
for(i in 1:length(stretching)){
lines(0:150,px((0:150)*stretching[i],pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=stretching[i]),col="grey")
}
#males
plot(0:150,px((0:150)*0.5,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=0.5),type="l",lty=1,main="Male survival at age x",ylab="Proportion",xlab="Age",ylim=c(0,1))
lines(0:150,px((0:150),pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]))
for(i in 1:length(stretching)){
lines(0:150,px((0:150)*stretching[i],pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=stretching[i]),col="grey")
}
This section is comprised of tho subsections. On the first we look at the scaling of the survival functions based on the mean age at death (MAD), while on the second we look at age of first reproduction (AFR).
The paper Lahdenpera et al. (2018) presents survival estimates from a sample of elephants, also considering a Siler model. A minor difference is that these authors included covariates on the Siler model parameters, and we ignore that aspect here.
A way to see if the above procedure is somewhat reasonable (hard to know upfront what this means, as we are comparing apples and oranges- well literally, dolphins and elephants) is to check if we manage to extend the dolphins survivals to match the elephants key age statistics. The resulting survivorship functions for the elephants are shown in Figure . We can see that these long lived animals will typically outlive dolphins.
Screenshot of figure 4 in Lahdenpera et al 2018. Survivorship functions resulting from the fitted Siler models.
However, at first sight, what seems obvious to me is that while the survivorship in the dolphins is concave, in the elephants it is convex. And this is something that our “expanding” method can’t cope with, because a concave function will not turn convex by “expansion”, or vice versa. (see also question 4).
Lahdenpera et al. (2018) also present mortality rates(corresponding to our function qx, reproduced below :
Screenshot of Figure 1 in Lahdenpera et al 2018.
To facilitate parameter matching, the parametrization used in the elephant model is shown here (note we are not considering the second line - that is only involved with additional covariates, not present for the dolphins - hence only the first line applies here)
Screenshot of Siler model parametrization in Lahdenpera et al 2018.
and that used in Lori’s paper is here:
Screenshot of Siler model parametrization in Schwacke et al 2017.
It is key to note that while \(p\) from the elephant paper (iEp) represents the hazard or mortality rates, i.e. the probability of dying at a given age that we represented by qx, in Lori’s paper (iLp) \(l_S(x)\) represents the probability of surviving up to age x, our lx function. These funtions are related as
\[ 1-\frac{l(x+1)}{l(x)}=1-p(x)=q(x)= in~Lori's~paper~is~equivalent~to~the~Elephant's ~paper~=p \]
A source of potential confusion is worth bringing upfront: there are two functions denoted as \(p\), and the Elephant’s paper \(p\) corresponds to Lori code \(q(x)\), not \(p(x)\) which actually corresponds to \(1-q(x)\).
Note therefore that, irrespective of there being reparametrizations, we have the following connections:
First, we need to read in the model parameter estimates from the Elephants. From the supplementary material in the elephant paper we have estimates for the Siler model parameters for both males and females (there are actually a large number of parameters, model averaging intricacies, etc, but I ignore these for now). This is shown in .
Screenshot of Supplementary Table 1 in Lahdenpera et al 2018. Siler model parameter estimates.
Now we use these parameters (note explicitly we are considering the \(w_3\) for the year 2000 cohort, just as an example) and check that we can get results that look like those presented in the Elephant paper.
#first row females
#second row males
eleSp<-data.frame(w1=c(0.0807,0.10465),w2=c(0.00101,0.0027),w3=c(0.00426,0.00426),b1=c(0.34585,0.30425),b2=c(0.07656,0.06513),row.names = c("females","males"))
kable(eleSp)
w1 | w2 | w3 | b1 | b2 | |
---|---|---|---|---|---|
females | 0.08070 | 0.00101 | 0.00426 | 0.34585 | 0.07656 |
males | 0.10465 | 0.00270 | 0.00426 | 0.30425 | 0.06513 |
We can now plot the survivorship function:
ages <- 0:60
plot(ages,lx(ages,eleSp[1,1],eleSp[1,3],eleSp[1,2],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",ylim=c(0,1))
lines(ages,lx(ages,eleSp[2,1],eleSp[2,3],eleSp[2,2],eleSp[2,4],eleSp[2,5]),lty=2)
legend("bottomleft",legend=c("Males","Females"),lty=2:1, inset=0.05)
Something went terribly wrong, as these are much larger survivorship values than what is presented in the image above. Might I have misunderstood what parameter is what across parametrizations?
We can plot equation 1 in the elephant paper (see screensot above) using their function p
p <- function(x,w1,w2,w3,b1,b2){
w1*exp(-b1*x)+w2*exp(b2*x)+w3}
ages <- 0:60
plot(ages,p(ages,eleSp[1,1],eleSp[1,2],eleSp[1,3],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Equation 1 Elephant",ylab="Proportion",xlab="Age",ylim=c(0,0.15))
lines(ages,p(ages,eleSp[2,1],eleSp[2,2],eleSp[2,3],eleSp[2,4],eleSp[2,5]),lty=2)
legend("bottomright",legend=c("Males","Females"),lty=2:1, inset=0.02)
and that seems to add up well to what they have in the paper (again, note different baseline so not really the same necessarily).
Now we plot the age specific mortality rate, using Lori’s plotting code qx
, with the Elephant parameter values.
ages <- 0:60
plot(ages,qx(ages,eleSp[1,1],eleSp[1,3],eleSp[1,2],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Mortality rate at age x",ylab="Proportion",xlab="Age",ylim=c(0,0.15))
lines(ages,qx(ages,eleSp[2,1],eleSp[2,3],eleSp[2,2],eleSp[2,4],eleSp[2,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.02)
This is much lower and does not add up, which confirms a likely an issue with a mix up in the parametrization across the two papers.
Back to the blackboard, considering the expressions in Siler’s paper,
Screenshot of Figure 1 in Lahdenpera et al 2018.
a slight reparametrization might be at stake, where \(a_1\) is divided by \(b_1\), and \(a_3\) is divided by \(b_3\), and so
\[a1=\frac{w1}{b1}\] \[a3=\frac{w2}{b2}\]
We can now check the Mortality rate, using the Elephant parameter values and Lori’s code with the new parametrization:
ages <- 0:60
plot(ages,qx(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Mortality rate at age x",ylab="Proportion",xlab="Age",ylim=c(0,0.15))
lines(ages,qx(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.02)
And now the lx
function, i.e. the proportion surviving to a given age, and that seems now to add up.
ages <- 0:60
plot(ages,lx(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",ylim=c(0,1))
lines(ages,lx(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.02)
It seems like all is good, and we probably are where we want to be, but it would be nice to demonstrate that if one starts at
\[ 1-\frac{l(x+1)}{l(x)}=1-p(x)=q(x)= in~Lori's~paper~is~equivalent~to~the~Elephant's ~paper~=p \]
and then check the parameters from p correspond to the reparametrizations found.
In any case, we can do it empirically, by ploting in the same plot the mortality function with the elephant parameter values using both p
and qx
. These should line up once we have the correct parametrization.
par(mfrow=c(1,1))
ages <- 0:60
#use Elephant formulation
plot(ages,p(ages,eleSp[1,1],eleSp[1,2],eleSp[1,3],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Equation 1 Elephant (p) vs. Lori's q(x) ",ylab="Proportion",xlab="Age",ylim=c(0,0.15),col=3)
lines(ages,p(ages,eleSp[2,1],eleSp[2,2],eleSp[2,3],eleSp[2,4],eleSp[2,5]),lty=2,col=3)
#Use Lori's formulation with Elephants parameters
lines(ages,qx(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Hazard at age x",ylab="Proportion",xlab="Age",ylim=c(0,0.15),col=4)
lines(ages,qx(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2,col=4)
legend("top",legend=c("Males p","Females p","Males q(x)","Females q(x)"),lty=c(2,1,2,1),col=c(3,3,4,4), inset=0.02)
This lines up pretty well, so the change in parametrization seems adequate. Nonetheless, see the pending question below.
The result in green (via p
) and blue (via qx
) is almost the same, but it should be exactly the same. Is this an issue with some numerical approximation gone wrong? For the time being we proceed assuming the above is correct but there is something that is not really adding up in this. The consequences are likely negligible, but it would be great to find out the reason causing this slight discrepancy.
This is in fact a consequence of the discretization. This equivalence
\[ 1-\frac{l(x+1)}{l(x)}\] implies a discretization. We therefore have an approximation, and then, the approximation is worse for the extremes of the function (as we see in practice in the plot above). This confirms that we are doing the right thing. The functions \(p\) and \(qx\) would be equivalent only in continous time, but they are discretizing a continous function and hence produce a mismatch from the approximation, by being essentially just joining the dots at observations over discretizations year by year.
End of Question 2
We can evaluate what would be the average life span of an Elephant. Given the probability of an elephant reaching a given age x is provided by \(l(x)\), as in the image below
ages <- 0:100
plot(ages,lx(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",ylim=c(0,1))
lines(ages,lx(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2)
legend("topright",legend=c("Males","Females"),lty=2:1, inset=0.02)
then the mean age of an elephant will be given by the integral under that function. Assuming that we have 1:1 sex ratio, the mean age is the mean age of males and females.
MeanAgeEmale <- integrate(lx,lower=0,upper=100,a1=eleSp[1,1]/eleSp[1,4],a2=eleSp[1,3],a3=eleSp[1,2]/eleSp[1,5],b1=eleSp[1,4],b3= eleSp[1,5])$value
MeanAgeEfemale <- integrate(lx,lower=0,upper=100,a1=eleSp[2,1]/eleSp[2,4],a2=eleSp[2,3],a3=eleSp[2,2]/eleSp[2,5],b1=eleSp[2,4],b3= eleSp[2,5])$value
MeanAgeE <- (MeanAgeEmale+MeanAgeEfemale)/2
This corresponds to a mean age of Male Elephants of 35.784 and of Female Elephants of 28.005, resulting in an overall mean age under a 1:1 sex ratio assumption of 31.895.
Just to check that we get a similar mean age using the simulation approach considered above
#maximum age
maxage <- 150
#range of ages
ages<-0:maxage
#calculate stretched survivals
survsF <- px(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5])
survsM <- px(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5])
#because the px function is a division, if both numerator and denominator are 0 we get a NaN
survsF[is.nan(survsF)]<-0
survsM[is.nan(survsM)]<-0
#build Survival matrix
Tr <- matrix(0,nrow=maxage*2,ncol=maxage*2)
for(i in 1:maxage){
for (i in 1:(maxage-1)){
Tr[i+1,i] <- survsF[i]
Tr[maxage+i+1,maxage+i] <- survsM[i]
}}
#define inital population
nstart <- 500000
Nstart=c(nstart/2,rep(0,maxage-1),nstart/2,rep(0,maxage-1))
#propagate it forward
years <- 200
Ns<-matrix(0,nrow=maxage*2,ncol=years)
Ns[,1] <- Nstart
for(i in 2:years){
Ns[,i] <- Tr %*% Ns[,i-1]
}
#observed ages at death
ages.at.death <- rep(1:(years-1),times=abs(diff(colSums(Ns))))
#mean age at death
mean.ages<- mean(ages.at.death)
mean.ages
## [1] 32.40203
This is a difference of 0.51. Half a year… is this a coincidence, or am I doing something wrong in the approximation by simulation? I actually think that it is the latter, because as currently implemented by simulation the minimum age one records an animal dying at is 1, i.e. you record the year the animal would have in it’s next birthday after dying, which will be on average about half a year later than its actual age, assuming uniform deaths during a year.
So now, let’s stretch dolphins to become elephants… (I always wanted to say that… it’s a remarkable Lamark remark :) - whether this is a direct consequence of social isolation for too long is unclear. But it’s not a big stretch to say so!
We can essentially redo what we attempted above, but now knowing we are aiming for a mean age at death of 31.89. That essentially means more “intense” stretching than before.
stretching <- seq(0.11,0.99,by=0.01)
maxage <- 300
#survival
#females
plot(0:maxage,lx((0:maxage)*0.5,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Females surviving to age x",ylab="Proportion",xlab="Age")
lines(0:maxage,lx((0:maxage),pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]))
for(i in 1:length(stretching)){
lines(0:maxage,lx((0:maxage)*stretching[i],pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),col="grey")
}
ls <- length(stretching)
#vector to hold mean age after stretching
mean.ages <- numeric(ls)
#range of ages
ages<-0:maxage
for(s in 1:ls){
#calculate stretched survivals
survsF <- px(ages[-maxage]*stretching[s],pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=stretching[s])
survsM <- px(ages[-maxage]*stretching[s],pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=stretching[s])
#because the px function is a division, if both numerator and denominator are 0 we get a NaN
survsF[is.nan(survsF)]<-0
survsM[is.nan(survsM)]<-0
#build Survival matrix
Tr <- matrix(0,nrow=maxage*2,ncol=maxage*2)
for(i in 1:maxage){
for (i in 1:(maxage-1)){
Tr[i+1,i] <- survsF[i]
Tr[maxage+i+1,maxage+i] <- survsM[i]
}
}
#define inital population
nstart <- 500000
Nstart=c(nstart/2,rep(0,maxage-1),nstart/2,rep(0,maxage-1))
#propagate it forward
years <- 400
Ns<-matrix(0,nrow=maxage*2,ncol=years)
Ns[,1] <- Nstart
for(i in 2:years){
Ns[,i] <- Tr %*% Ns[,i-1]
}
#see the pop size as a function of time
#plot(1:years,colSums(Ns)/nstart,ylim=c(0,1),xlab="year",ylab="P(survival up to age)")
#observed ages at death
ages.at.death <- rep(1:(years-1),times=abs(diff(colSums(Ns))))
#mean age at death
mean.ages[s] <- mean(ages.at.death)
}
#see the result
plot(stretching,mean.ages)
We proceed with the comparison using the the direct integration method. This is more computationally efficient any way and seems to avoid a couple of issues (the age being incorrectly computed by half a year - see above - and this current unexpected result most likely due to some bug in the code).
We define a new lx
function, called lxstretch
, that can be numerically integrated by just adjusting the x’s.
#the required function
# Proportion surviving to age x
lxstretch <- function(x,stretch,a1,a2,a3,b1,b3) {
exp(-a2*x*stretch) * exp((-a1)*(1-exp(-b1*x*stretch))) * exp(a3*(1-exp(b3*x*stretch)))
}
stretching <- seq(0.11,0.99,by=0.01)
ls <- length(stretching)
#vector to hold mean age after stretching
mean.agesI <- numeric(ls)
for(s in 1:ls){
MeanAgeDmale <- integrate(lxstretch,lower=0,upper=500,stretch=stretching[s],a1=pf[1,1],a2=pf[1,2],a3=pf[1,3],b1=pf[1,4],b3= pf[1,5])$value
MeanAgeDfemale <- integrate(lxstretch,lower=0,upper=500,stretch=stretching[s],a1=pm[1,1],a2=pm[1,2],a3=pm[1,3],b1=pm[1,4],b3= pm[1,5])$value
MeanAgeD <- (MeanAgeDmale+MeanAgeDfemale)/2
mean.agesI[s] <- MeanAgeD}
Below we show the mean age of an animal from a population after stretching, with dashed lines representing the stretching required for:
#see the result
plot(stretching,mean.agesI,ylim=c(10,80))
abline(h=MeanAgeE,lty=2)
abline(v=0.46,lty=2)
abline(h=24.85,lty=3)
abline(v=0.6,lty=3)
legend("topright",legend=c("Elephant","Sperm whale"),lty=2:3,inset=0.05)
So the corresponding look and feel for the proportion of animals surviving to a given age, considering both
is shown in the figure below
ages <- 0:150
# Proportion surviving to age x
plot(ages,lx(ages*0.46,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",col=4)
lines(ages,lx(ages*0.46,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=4)
lines(ages,lx(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),col=3)
lines(ages,lx(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2,col=3)
legend("top",legend=c("Males E","Females E","Males D2E","Females D2E"),lty=c(2,1,2,1),col=c(3,3,4,4), inset=0.02)
The conclusion of this exercise is somewhat expected. While we manage to extend the dolphins function, because the dolphins function had higher values for low age survival probabilities, and the convex shape, the only way we manage to extend the function to get the same mean age is by overestimating the ability to get really long lived animals, which is a consequence from a wider plateau in survival probabilities for adults. This is visible in the corresponding survival functions
ages <- 0:200
# Proportion surviving to age x
plot(ages,px(ages*0.46,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=0.46),type="l",lty=1,main="Survival probability at age x",ylab="Proportion",xlab="Age",col=4,ylim=c(0,1))
lines(ages,px(ages*0.46,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=0.46),lty=2,col=4)
lines(ages,px(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),col=3)
lines(ages,px(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2,col=3)
legend("bottomleft",legend=c("Males E","Females E","Males D2E","Females D2E"),lty=c(2,1,2,1),col=c(3,3,4,4), inset=0.02)
Just for an additional comparison, below is the comparison of the dolphin model and its extended version
ages <- 0:200
# Proportion surviving to age x
plot(ages,px(ages*0.46,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=0.46),type="l",lty=1,main="Survival probability at age x",ylab="Proportion",xlab="Age",col=4,ylim=c(0,1))
lines(ages,px(ages*0.46,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=0.46),lty=2,col=4)
lines(ages,px(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),col=3)
lines(ages,px(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=3)
legend("bottomleft",legend=c("Males E","Females E","Males D","Females D"),lty=c(2,1,2,1),col=c(4,4,3,3), inset=0.02)
Note that there is only so much that you can “tweak” a function by extending or contracting the survivorship function lx
.
Originally TAM thought that this might imply that simply extending the function in this way is not enough, on the misleading intuition that you could not go above the maximum survival probability from the original function. If the maximum survival probability, conditional on age, was say \(p_{max}\), then extending it this way, the new function would not have any age with a survival probability higher than said \(p_{max}\). LT showed TAM this is not correct! We can reach a higher \(p_max\). There are nonetheless constraints in expanding lx
“sideways”.
Note that there are correlation structures between all these related statistics, \(p_{max}\) as an example, but many others like the mean age at death (MAD), or age at first reproduction (AFR), or gestation duration (GD), or longevity, etc. As an example, higher longevity species might have higher \(p_{max}\) than short lived ones. Expanding the survivorship function sideways does not quite cut it, because such correlation structures are ignored. In non-technical terms, one might need to extend it upwards too, say. Similarly, if one were to contract the function, one might want to shrink it down.
We discussed internally about how a 2 degrees of freedom warping might be needed to do this scaling realistically, but also how we simply do not have enough information to do it. So for the time being, this is a question that will be moved to solved on the basis that we believe it is an important consideration - might me mentioned in some discussion paragraph in some paper - but for now we move past it and assume we are doing the best we can.
End of Question 4
The proportion surviving to age x corresponding for “dolphins extended to be sperm whales”, as well as survival probabilities at age x, using the mean age of 24.85 as described above, is shown below. For comparison, the values for the stage specific survival probabilities considered for the sperm whales are shown as dashed black lines
ages <- 0:100
# Proportion surviving to age x
plot(ages,lx(ages*0.6,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="Proportion surviving to age x",ylab="Proportion",xlab="Age",col=4)
lines(ages,lx(ages*0.6,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=4)
legend("top",legend=c("Males Pmac","Females Pmac"),lty=2,col=4, inset=0.02)
# Survival at age x
ages <- 0:200
plot(ages,px(ages*0.6,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=0.6),type="l",lty=1,main="Survival probability at age x",ylab="Proportion",xlab="Age",col=4,ylim=c(0,1))
lines(ages,px(ages*0.6,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=0.6),lty=2,col=4)
legend("bottomleft",legend=c("Males Pmac","Females Pmac","Stage structured values"),lty=c(1,2,2),col=c(4,4,1), inset=0.02,lwd=c(1,1,3))
#from Lance's files
PmacScalf <- 0.907
PmacSjuv <- 0.9424
PmacSFemale <- 0.9777
PmacSMale <- 0.918
#add lines
segments(x0=0,x1=2,y0=PmacScalf,y1=PmacScalf,lty=2,lwd=3)
segments(x0=2,x1=9,y0=PmacSjuv,y1=PmacSjuv,lty=2,lwd=3)
segments(x0=9,x1=100,y0=PmacSFemale,y1=PmacSFemale,lty=1,lwd=3)
segments(x0=9,x1=100,y0=PmacSMale,y1=PmacSMale,lty=2,lwd=3)
For an additional comparison, below is the comparison of the dolphin model and the sperm whale model
ages <- 0:200
# Survival probability at age x - Pmac
plot(ages,px(ages*0.6,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5],scaling=0.6),type="l",lty=1,main="Survival probability at age x",ylab="Proportion",xlab="Age",col=4,ylim=c(0,1))
lines(ages,px(ages*0.6,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5],scaling=0.6),lty=2,col=4)
# Survival probability at age x - dolphins
lines(ages,px(ages,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),col=3)
lines(ages,px(ages,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=3)
legend("topright",legend=c("Males Pmac","Females Pmac","Males D","Females D"),lty=c(2,1,2,1),col=c(4,4,3,3), inset=0.02)
Motivated by the email that LS sent Sun 4/12/2020 4:21 AM, we plot below the corresponding lx
function taking the stage specific survival probabilities and associating them to each age - note there is a slight mismatch in doing it this way, as this is not a direct 1:1 translation from a stage to an age strutcured model.
PmacPaliveF=c(PmacScalf,PmacScalf,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,rep(PmacSFemale,141))
PmacPaliveM=c(PmacScalf,PmacScalf,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,PmacSjuv,rep(PmacSMale,141))
PmacPaliveF=cumprod(PmacPaliveF)
PmacPaliveM=cumprod(PmacPaliveM)
#the Pmac lx from the age structured survivals
plot(1:150,PmacPaliveF,col="black",type="l",ylim=c(0,1),main="Proportion surviving to age x",ylab="Proportion",xlab="Age")
lines(1:150,PmacPaliveM,lty=2,col="black")
#now the Pmac after streching
lines(ages,lx(ages*0.6,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,col=4)
lines(ages,lx(ages*0.6,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=4)
legend("topright",legend=c("Males Pmac age extended","Females Pmac age extended","Males stage structured","Females stage structured"),lty=c(2,1,2,1),col=c(4,4,1,1), inset=0.02)
What the above figure represents relates to what LS reported. The median age for Pmac (using an age structured model) is actually lower than that of dolphins (using the age structured model). When we consider the mean age, that is larger in the Pmac than in the BSE BND model. This comes at the cost of there being a considerable probability of animals reaching unrealistically old ages under the sperm whale survival values used in the stage structured model.
(Note: some potentially useful additional insights and code for plots on this topic are present in the file “FromStage2AgeModel.R” under the Pmac analysis folder used for the GOMOSES 2020 poster)
Not sure this approximation is good at all. Need to think further about this! But the truth is that we are not exactly sure how to scale things because we don’t know exactly what the survivals par age would be, if we knew, no need for scaling. So this is an open ended question I am afraid.
End of Question 8
Given discussions within CARMMHA, we also look here at the possibility of scaling the functions based on age at first reproduction (AFR) and gestation duration (GD). We look into those options in this subsection.
As before, we can use the Elephant data to illustrate the procedure. Based on Lahdenpera et al. (2014) and Hayward et al. (2014), we obtain some information about age at first reproduction for Elephants. From the later reference: “The youngest female breeding in the sample was 5 years old and the oldest was 53 years old (mean age at first birth: 19.48; median: 19 years)”. We therefore consider 19.48 as an example. Further, Kajaysri & Nokkaew (2014) refers “Although the number of elephants (5 pregnant elephants) and the frequency of sampling collections (1 day per month) were limited, the present study could demonstrate a gestation length of 21.5 to 22 months, similar to that found by other studies [2, 16, 17].”. Below we consider 21.5 months as an example, with 30.5 days per month.
Given that, instead of the scaling of 0.46 given MAD considered above, assuming
The results at the level of lx
are shown below. These are all quite consistent actually, which I guess is good news, in some sense. AFR leads to the stronger level of stretching, GD to the weakest, and MAD would correspond to an intermediate level of stretching.
ages <- 0:150
# Proportion surviving to age x
# scaling by mean age
par(mar=c(4,4,0.1,0.1))
plot(ages,lx(ages*0.46,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,main="",ylab="Proportion surviving",xlab="Age",col=4)
lines(ages,lx(ages*0.46,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=4)
#real
lines(ages,lx(ages,eleSp[1,1]/eleSp[1,4],eleSp[1,3],eleSp[1,2]/eleSp[1,5],eleSp[1,4],eleSp[1,5]),col=3)
lines(ages,lx(ages,eleSp[2,1]/eleSp[2,4],eleSp[2,3],eleSp[2,2]/eleSp[2,5],eleSp[2,4],eleSp[2,5]),lty=2,col=3)
#scaling by ADR
lines(ages,lx(ages*scalingAFR,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,col=5)
lines(ages,lx(ages*scalingAFR,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=5)
#scaling by GD
lines(ages,lx(ages*scalingGD,pf[1,1],pf[1,2],pf[1,3],pf[1,4],pf[1,5]),type="l",lty=1,col=6)
lines(ages,lx(ages*scalingGD,pm[1,1],pm[1,2],pm[1,3],pm[1,4],pm[1,5]),lty=2,col=6)
legend("topright",legend=c("Males E","Females E","Males D2E MAD","Females D2E MAD","Males D2E AFR","Females D2E AFR","Males D2E GD","Females D2E GD"),lty=c(2,1,2,1,2,1,2,1),col=c(3,3,4,4,5,5,6,6), inset=0.02)
Irrespectively of the quantity or its values for either the dolphins or the sperm whale (or any other species for that matter!), I am setting up here a workflow imagining that we will get these soon (LS, LT and CB all working on it activelly). As an ilustration, I am using the values for AFR, but that is irrelevant re the worflow. Then it will be just a matter of replacing the scaling statistic and the corresponding mock up values that we set up here.
A key aspect of this workflow is to incorporate the variability in the Siler model parameters posterior distribution. That variability is depicted in the following figure, where the mean survivorship function (lx
) is shown together with the corresponding 95% credible intervals. We note that we have decided also for the time being not to incorporate variability in the scaling in the process. So while we we will propagate the variability coming from the Siler model, we will ignore the variability on the scaling parameter.
ages <- 0:70
ncols <- nrow(pf)
nrows <- length(ages)
library(ggplot2)
#get a dataframe to use ggplot2
#objects to hold female and male realizations
lxFs <- matrix(NA,nrow=nrows,ncol=ncols)
lxMs <- matrix(NA,nrow=nrows,ncol=ncols)
#for each obseration of the posterior
for(i in 1:nrow(pf)){
#get the funtion
lxFs[,i] <- lx(ages,pf[i,1],pf[i,2],pf[i,3],pf[i,4],pf[i,5])
lxMs[,i] <- lx(ages,pm[i,1],pm[i,2],pm[i,3],pm[i,4],pm[i,5])
}
#arrange as data frame
lxFs2 <- data.frame(ages=ages,lxFs)
lxMs2 <- data.frame(ages=ages,lxMs)
# add means and relevant quantiles
lxFs2$mean=rowMeans(lxFs)
lxMs2$mean=rowMeans(lxMs)
lxFs2$q025=apply(X=lxFs,MARGIN = 1,FUN=quantile,probs=0.025)
lxMs2$q025=apply(X=lxMs,MARGIN = 1,FUN=quantile,probs=0.025)
lxFs2$q975=apply(X=lxFs,MARGIN = 1,FUN=quantile,probs=0.975)
lxMs2$q975=apply(X=lxMs,MARGIN = 1,FUN=quantile,probs=0.975)
#add the sex
lxFs2$sex="Female"
lxMs2$sex="Male"
#make single object
lxMF <- rbind(lxFs2,lxMs2)
#plot
#https://www.r-graph-gallery.com/104-plot-lines-with-error-envelopes-ggplot2.html
ggplot(data=lxMF, aes(x=ages, y=mean, ymin=q025, ymax=q975, fill=sex, linetype=sex)) + geom_line() +
geom_ribbon(alpha=0.5) +
xlab("Age") +
ylab("Proportion surviving")
We assume for now that the mean and standard deviation of Gaussian random variables for Pmac and Ttru describing their respective AFR is 8 and 0.8 for Ttru and 12 and 1.2 for Pmac. In other words, Pmac have a 1/3 larger mean AFR, and with a proportinal increase in its standard deviation, as illustrated below (no claim this is reasonable is made for now!!!).
ages <- seq(0,25,by=0.01)
plot(ages,dnorm(ages,AFRTtru.mean,AFRTtru.sd),xlab="Age at first reproduction",ylab="f(AFR)",col=3,type="l",lwd=2)
lines(ages,dnorm(ages,AFRPmac.mean,AFRPmac.sd),col=4,lwd=2)
legend("topright",legend=c("Pmac","Ttru"),lty=1,lwd=2,col=c(4,3), inset=0.02)
Given the above, a distribution for the scaling distribution is represented below
#create as many values as we have realizations of the posterior
stretchings <- rnorm(ncols,AFRTtru.mean,AFRTtru.sd)/rnorm(ncols,AFRPmac.mean,AFRPmac.sd)
hist(stretchings, main="",xlab="scaling",breaks=seq(0,max(stretchings)+0.2,by=0.05),freq = FALSE)
Note that every now and then we get a really extreme value, corresponding to a large sampled value for Ttru divided by a very small value for Pmac, but in general the distribution looks sensible
Should we truncate unrealistic values in this distribution? Is that defensible? The issue is that values larger than 1 mean that by using it we would contract the function not expand it. And that migh not be sensible for some species, and it might say more about our inability to account for correlation between the statistics used for scaling in the diffrent species. As an example, for Ttru
and Pmac
, the AFR should be larger for the later than the former. If we move for gestation duration, for which variability is lower, this might be less of an issue in pratice.
End of Question 10
We can now sample from this distribution and propagate forward the survivorship of the sperm whales assuming the scaling is done via AFR
ages <- 0:150
ncols <- nrow(pf)
nrows <- length(ages)
library(ggplot2)
#get a dataframe to use ggplot2
#objects to hold female and male realizations
lxFsPmac <- matrix(NA,nrow=nrows,ncol=ncols)
lxMsPmac <- matrix(NA,nrow=nrows,ncol=ncols)
#for each obseration of the posterior
for(i in 1:nrow(pf)){
#get the funtion
lxFsPmac[,i] <- lx(ages*stretchings[i],pf[i,1],pf[i,2],pf[i,3],pf[i,4],pf[i,5])
lxMsPmac[,i] <- lx(ages*stretchings[i],pm[i,1],pm[i,2],pm[i,3],pm[i,4],pm[i,5])
}
#arrange as data frame
lxFsPmac2 <- data.frame(ages=ages,lxFsPmac)
lxMsPmac2 <- data.frame(ages=ages,lxMsPmac)
# add means and relevant quantiles
lxFsPmac2$mean=rowMeans(lxFsPmac)
lxMsPmac2$mean=rowMeans(lxMsPmac)
lxFsPmac2$q025=apply(X=lxFsPmac,MARGIN = 1,FUN=quantile,probs=0.025)
lxMsPmac2$q025=apply(X=lxMsPmac,MARGIN = 1,FUN=quantile,probs=0.025)
lxFsPmac2$q975=apply(X=lxFsPmac,MARGIN = 1,FUN=quantile,probs=0.975)
lxMsPmac2$q975=apply(X=lxMsPmac,MARGIN = 1,FUN=quantile,probs=0.975)
#add the sex
lxFsPmac2$sex="Female"
lxMsPmac2$sex="Male"
#make single object
lxMFPmac <- rbind(lxFsPmac2,lxMsPmac2)
#plot
#https://www.r-graph-gallery.com/104-plot-lines-with-error-envelopes-ggplot2.html
ggplot(data=lxMFPmac, aes(x=ages, y=mean, ymin=q025, ymax=q975, fill=sex, linetype=sex)) + geom_line() +
geom_ribbon(alpha=0.5) +
xlab("Age") +
ylab("Proportion surviving") + geom_line(data=lxMF,aes(x=ages, y=mean))
For comparison, the analogous figure but assuming AFR must be higher for Pmac than for Ttru, which as noted in question 10 above, might be more sensible.
stretchings <- ifelse(stretchings<1,stretchings,1)
ages <- 0:150
ncols <- nrow(pf)
nrows <- length(ages)
library(ggplot2)
#get a dataframe to use ggplot2
#objects to hold female and male realizations
lxFsPmac <- matrix(NA,nrow=nrows,ncol=ncols)
lxMsPmac <- matrix(NA,nrow=nrows,ncol=ncols)
#for each obseration of the posterior
for(i in 1:nrow(pf)){
#get the funtion
lxFsPmac[,i] <- lx(ages*stretchings[i],pf[i,1],pf[i,2],pf[i,3],pf[i,4],pf[i,5])
lxMsPmac[,i] <- lx(ages*stretchings[i],pm[i,1],pm[i,2],pm[i,3],pm[i,4],pm[i,5])
}
#arrange as data frame
lxFsPmac2 <- data.frame(ages=ages,lxFsPmac)
lxMsPmac2 <- data.frame(ages=ages,lxMsPmac)
# add means and relevant quantiles
lxFsPmac2$mean=rowMeans(lxFsPmac)
lxMsPmac2$mean=rowMeans(lxMsPmac)
lxFsPmac2$q025=apply(X=lxFsPmac,MARGIN = 1,FUN=quantile,probs=0.025)
lxMsPmac2$q025=apply(X=lxMsPmac,MARGIN = 1,FUN=quantile,probs=0.025)
lxFsPmac2$q975=apply(X=lxFsPmac,MARGIN = 1,FUN=quantile,probs=0.975)
lxMsPmac2$q975=apply(X=lxMsPmac,MARGIN = 1,FUN=quantile,probs=0.975)
#add the sex
lxFsPmac2$sex="Female"
lxMsPmac2$sex="Male"
#make single object
lxMFPmac <- rbind(lxFsPmac2,lxMsPmac2)
#plot
#https://www.r-graph-gallery.com/104-plot-lines-with-error-envelopes-ggplot2.html
ggplot(data=lxMFPmac, aes(x=ages, y=mean, ymin=q025, ymax=q975, fill=sex, linetype=sex)) + geom_line() +
geom_ribbon(alpha=0.5) +
xlab("Age") +
ylab("Proportion surviving") + geom_line(data=lxMF,aes(x=ages, y=mean))
While reading Barlow & Boveng 1991 Modeling age-specific mortality for marine mammal populations. Marine Mammal Science 7: 50-65 I realized that actually, Jay has already thought about all of this. How am I not surprised that this was nothing new…!
In the reference section above we actually reproduced some of Jay’s insights and they might be relevant for what we are doing here.
The following expression from Barlow & Boveng 1991
Screenshot of expression in Barlow and Boveng 1991.
relates to a relevant concept required later, the mean age of an individual observed at random from the population. To understand why we need this value the reader would have to read section 2.7 in CARMMHA’s internal document “How to land it - a road map to finishing CARMMHA”. In short, this provides a way to calculate the mean age of an animal sampled at random at the time of the oil spill, which is something that will be required to evaluate how long the recovery will take for animals that do recover from the oil after being exposed.
Given the above, that mean age of an individual would be given by
\[ A_m = E(x)= \int x~f_{pdf}(x) dx = \int x \frac{ l(x) }{\int l(x) dx}dx=\frac{ \int x l(x) dx}{\int l(x) dx}\]
To implement it we first define a function xlx
and then we use it
Amf <- integrate(xlx,lower=0,upper=500,a1=pf[1,1],a2=pf[1,2],a3=pf[1,3],b1=pf[1,4],b3= pf[1,5])$value/integrate(lx,lower=0,upper=500,a1=pf[1,1],a2=pf[1,2],a3=pf[1,3],b1=pf[1,4],b3= pf[1,5])$value
Amm<- integrate(xlx,lower=0,upper=500,a1=pm[1,1],a2=pm[1,2],a3=pm[1,3],b1=pm[1,4],b3= pm[1,5])$value/integrate(lx,lower=0,upper=500,a1=pm[1,1],a2=pm[1,2],a3=pm[1,3],b1=pm[1,4],b3= pm[1,5])$value
#50% sex ratio
Am <- (Amf+Amm)/2
The mean age of an animal taken from the population at any one time, hence at the time of the oil spill, would be 14.3 for females, 11.7 for males, and hence under a 1:1 sex ratio, 13 overall.
Note that, given the number obtained above for the mean age at death (15.3), this poses a question. It seems like on average, at any one time, you only have 2.3 years to live…
This is a bit counter-intuitive…? What is my intuition failing to grasp? And is this related to ignoring fecundity? See also question 6.
End of Question 5
Do we ignore fecundity to estimate mean age at death?
A key thing that seems odd is that while the mean age of an animal at death does not depend on fecundity, the mean age of an animal alive at any one time should depend on fecundity too. Or at the very least, the above expression from Jay depends on it, because the stable age distribution will depend on fecundity, but in the above, we’ve not really used fecundity anywhere.
While this might seems weird at first, it is a fact. The mean age at death for an animal does not depend of fecundity, because it represents the average life span conditional on an animal being born. So whether 1 or 100 animals are born per female is irrelevant, it is the mean age of those born that is of interest. On the contrary, the mean age of an animal sampled at random from the population - at any given time, e.g. when the oil spill occurred - does depend on fecundity too. If fecundity is large, most animals are young, and that mean age goes down. Mean age of an animal depends on the population structure, and that population structure depends on fecundity.
Nonetheless, this can be quite confusing, as an extreme example might illustrate. If each female had 10000 calfs per year, all on January 1st, but all but 1 were dead before being 1 day old, the mean age of an animal observed at random from the population would be probably quite high (as you would probably never see the ones less than 1 day old), depending mostly on survival, but the mean age at death of the animals of the population would be lower than 1 year (and possibly quite close to 1 day!), depending mostly on fecundity. Which is really really confusing as I just said that age at death would not depend on fecundity, but here I twisted the plot.
End of Question 6
The Barlow and Boveng 1991 paper is interesting for another reason, that links to a topic discussed above. As per the screenshot below, the curve for human females, with the longest longevity (81 years, 34 for monkeys and 18 for seals), looks convex, as for the elephants.
Is that a “feature” of longer lived animals, and if so, is our extending the function approach not capable to capture that?
Screenshot of figure 1 in Barlow and Boveng 1991.
End of Question 7
Questions are separated by solved (and kept here just for reference) or open. Both types of questions are subdivided by their relevance (see details below).
These have been solved in the mean time, but we keep them here for future reference:
How to incorporate the variability in the scaling when extending the function? Back to question 1
Why extending it the way we are doing it might not quite make it? Back to question 4
Should we truncate the scaling distribution to match our expectations in terms of the direction of the scaling (i.e. stretching or shrinking)? Back to question 10
Why are the blue and green functions not exactly the same? Back to question 2
Do we ignore fecundity to estimate mean age at death? Back to question 6
Are longed lived species different from the others? Back to question 7
Scale by a mean value or scale by an extreme quantile. I wonder what would Jay say? Back to question 9
Is the age structured model sensible given the stage structured stage-conditional survival values? Back to question 8
All solved for now.
Silvio Velosa (Universidade da Madeira) provided extremely useful feedback that allowed us to understand the slight discrepancies between the two formulations, as per question 2, involving a discrete approximation of a continuous function.
Barlow, J. & Boveng, P. (1991). Modeling age-specific mortality for marine mammal populations. Marine Mammal Science, 7, 50–65.
Caswell, H., Brault, S., Read, A.J. & Smith, T.D. (1998). Harbor porpoise and fisheries: An uncertainty analysis of incidental mortality. Ecological Applications, 8, 1226–1238.
Eakin, T. (1994). Intrinsic time scaling in survival analysis: Application to biological populations. Bulletin of Mathematical Biology, 56, 1121–1141.
Hayward, A.D., Mar, K.U., Lahdenpera, M. & Lummaa, V. (2014). Early reproductive investment, senescence and lifetime reproductive success in female asian elephants. Journal of Evolutionary Biology, 27, 772–783.
Kajaysri, J. & Nokkaew, W. (2014). Assessment of pregnancy status of asian elephants (elephas maximus) by measurement of progestagen and glucocorticoid and their metabolite concentrations in serum and feces, using enzyme immunoassay (EIA). Journal of Veterinary Medical Science, 76, 363–368.
Krementz, D.G., Sauer, J.R. & Nichols, J.D. (1989). Model-based estimates of annual survival rate are preferable to observed maximum lifespan statistics for use in comparative life-history studies. Oikos, 56, 203.
Lahdenpera, M., Mar, K.U., Courtiol, A. & Lummaa, V. (2018). Differences in age-specific mortality between wild-caught and captive-born asian elephants. Nature Communications, 9, 3023.
Lahdenpera, M., Mar, K.U. & Lummaa, V. (2014). Reproductive cessation and post-reproductive lifespan in asian elephants and pre-industrial humans. Frontiers in Zoology, 11.
Schwacke, L.H., Thomas, L., Wells, R.S., McFee, W.E., Hohn, A.A., Mullin, K.D., Zolman, E.S., Quigley, B.M., Rowles, T.K. & Schwacke, J.H. (2017). Quantifying injury to common bottlenose dolphins from the Deepwater Horizon oil spill using an age-, sex- and class-structured population model. Endangered Species Research, 33, 265–279.