Nonlinear Growth Models: Addressing Parameter Underestimation in Limited Data with Bayesian Hierarchical Approaches Using Stan, Applied to Aquaculture

Summary

Motivated by a wish to contribute to scientific dissemination, I wrote this post to help attentive readers understand how incomplete or limited data can affect nonlinear growth models, and to explore solutions to these problems using the brms package (Bayesian Regression Models using ‘Stan’). If you will allow me, let’s apply it to growth data on shrimp produced in aquaculture. We will cover the following topics:

  • Nonlinear growth models
  • Challenges posed by incomplete or limited data
  • Bias correction in estimating parameters of nonlinear models
  • Bayesian hierarchical modeling (using Stan)
  • Conclusion

The insights in this piece are drawn from two key scientific articles, Zarzar et al. (2022) and Zarzar et al. (2023), both listed in the references at the end.

Nonlinear Growth Models

Nonlinear models are a family of regression models whose parameters are often strongly correlated, which has implications for parameter inference and frequently calls for specific estimation methods. Statisticians usually define a model as nonlinear when the derivatives of the mean function with respect to the parameters depend on one or more of the model’s parameters (see the worked example after the table below). What sets nonlinear models apart is the interpretability of their parameters, which often carry biologically meaningful implications. Below, we present a selection of nonlinear growth models commonly found in the applied statistics literature.

Function name                  Mathematical expression
Morgan-Mercer-Flodin (MMF)     \[f(t)= \alpha- \dfrac{\alpha-w_0}{1+(\kappa \cdot 10^{-4} \cdot t)^{\delta}} \]
Michaelis-Menten Generalized   \[f(t) = \dfrac{w_0 \, \beta^\kappa+\alpha \, t^{\kappa}}{\beta^\kappa+t^{\kappa}} \]
Weibull growth                 \[f(t)= \alpha\,(1-\exp(-\beta \cdot 10^{-4} \cdot t^{\kappa} ))+w_0\]
von Bertalanffy                \[f(t)= \alpha\,(1-\exp(-\kappa \cdot 10^{-4} \cdot (t+\beta)))^3\]
Gompertz function              \[f(t)= \alpha\,\exp(-\exp(\kappa\,(\beta-t)))\]
Logistic growth function       \[f(t)= \dfrac{\alpha}{1+\exp(\kappa\,(\beta-t))}\]
Richards                       \[f(t) =\alpha \left[ 1+(\gamma-1)e^{-\kappa(t-\beta)} \right]^{\frac{1}{1-\gamma}} \]
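
To make the definition above concrete, take the logistic model from the table: the derivative of its mean function with respect to alpha still involves the other parameters,

\[\dfrac{\partial}{\partial \alpha}\left(\dfrac{\alpha}{1+\exp(\kappa\,(\beta-t))}\right)= \dfrac{1}{1+\exp(\kappa\,(\beta-t))},\]

whereas in a linear model such as f(t) = a + b·t the derivatives with respect to a and b (namely 1 and t) do not depend on any parameter.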

It is important to note that parameters in nonlinear models do not always share the same biological interpretation, so care must be taken when generalizing interpretations across different growth models: the same Greek letter used as the symbol for a parameter in two different models will not necessarily carry the same biological meaning for the phenomenon under study. In some applications it is also common practice to compare different parameterizations of the same model, leading to nuanced differences in the biological meaning of the parameters. To familiarize readers with nonlinear growth models, we have prepared a graphical GeoGebra analysis (all models together) for exploring the various models and their parameters (click the hyperlink on each model in the table and enjoy).
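
For readers who prefer R to GeoGebra, here is a minimal sketch (with purely illustrative parameter values) that plots a few of the curves from the table with base R:

# Illustrative growth functions from the table (parameter values are arbitrary)
logistic <- function(t, alpha, kappa, beta) alpha / (1 + exp(kappa * (beta - t)))
gompertz <- function(t, alpha, kappa, beta) alpha * exp(-exp(kappa * (beta - t)))
vonbert  <- function(t, alpha, kappa, beta) alpha * (1 - exp(-kappa * 1e-4 * (t + beta)))^3

curve(logistic(x, alpha = 30, kappa = 0.3, beta = 10), 0, 30,
      ylab = "Weight (g)", xlab = "Time (weeks)", lwd = 2)
curve(gompertz(x, alpha = 30, kappa = 0.3, beta = 8), add = TRUE, col = 2, lwd = 2)
curve(vonbert(x, alpha = 30, kappa = 900, beta = 2), add = TRUE, col = 4, lwd = 2)
legend("bottomright", c("Logistic", "Gompertz", "von Bertalanffy"),
       col = c(1, 2, 4), lwd = 2, bty = "n")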

Challenges Posed by Incomplete or Limited Data

A peculiar characteristic of aquaculture data (data from the cultivation of aquatic organisms) is that they tend to be incomplete or limited (Zarzar et al., 2022): the samples (weight over time) are predominantly observed below the inflection point of the growth curve. This is a common pattern in aquaculture production, especially in shrimp farming, where it is usually not worthwhile for the producer to keep the animals on the farm for long, since doing so increases expenses with feed, physical space, and fixed and variable costs. As a result, once the animals reach commercial weight the tank is harvested, the shrimp are sold to market, and a new production cycle begins.

Such incomplete or limited data introduce biases in the estimation of the parameters of nonlinear growth models, above all in one of the hardest parameters to estimate: alpha, the theoretical asymptotic weight. You may have noticed that it is the only parameter common to all of the nonlinear growth models above, and it is also important for estimating the other parameters of the model. It is theoretical because it represents the individual’s weight at the end of its life (as time goes to infinity), which is abstract and impossible to observe in practice. From published estimates we know that it is usually characteristic of the species being modeled (different for a fish and an elephant, for example). Naturally, under the Bayesian perspective this parameter is not fixed, and it varies from individual to individual.
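
To see why alpha is read as an asymptotic weight, take the limit of, for example, the logistic and the generalized Michaelis-Menten curves as time grows (for kappa > 0):

\[\lim_{t\to\infty} \dfrac{\alpha}{1+\exp(\kappa\,(\beta-t))} = \alpha, \qquad \lim_{t\to\infty} \dfrac{w_0\,\beta^\kappa+\alpha\,t^{\kappa}}{\beta^\kappa+t^{\kappa}} = \alpha.\]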

In the simulation below, we showcase data limitations in Pacific white shrimp (Litopenaeus vannamei) growth over various weeks of cultivation, alongside fits of several nonlinear growth curves. Note that when the data are truncated at 7 to 18 weeks the parameters, and consequently the fitted nonlinear curves, are consistently underestimated. As the number of observations increases, the fit of the models improves.

##### Figure for the simulation article #####
#----------------------#----------------------#----------------------#----------------------
#### Simulation - demonstrating underestimation with incomplete data ####
# Author: Carlos A. Zarzar
# Date: 15/07/2020
# Cap_1 (Thesis: Article_1, journal revision)
# Data: SECOM 
#----------------------#----------------------#----------------------#----------------------
# 2_Script: Simulation of shrimp weight in a low-density
# culture using limited data generated from a
# Michaelis-Menten growth curve (Article:
# A generalized Michaelis-Menten equation for the analysis of growth,
# López et al. 2000).
# The second objective of the script is
# to fit other curves to these simulated data.
# Michaelis-Menten equation
# w(t)= (w0*beta^kappa + w1*x^kappa)/(beta^kappa + x^kappa) ::: MM growth curve
#----------------------#----------------------#----------------------#----------------------
rm(list = ls())
#================
# Simulating data
#================
set.seed(135)
## Parameters
w0 = 0.2
alpha = 90.58
kappa = 1.2 # 1.277
beta = 47.7 # 41.24
# Michaelis-Menten function
mich <- function(x,w0=0.2,alpha=90,kappa,beta){
  y <-  (w0*beta^kappa + alpha*x^kappa)/(beta^kappa + x^kappa)   
  return(y)
}
x <- seq(0,100,1)
y <- mich(x,w0,alpha,kappa,beta)
# summary(y)
# Selecting limited or incomplete data
df <- data.frame(time = x, peso = mich(x,w0,alpha,kappa,beta)+rnorm(n = length(x),sd=0.08*y)) # try other sigma values, e.g. 0.25
#----------------------#----------------------#----------------------#----------------------

# Plotting the original points

# png("Simulação.png", units="px", width=815, height=700, res=100)

par(cex=0.7, mai=c(0.05,0.07,0.07,0.20))
par(fig=c(0.06,0.52,0.70,1))

library(RColorBrewer)
my_colors = brewer.pal(8, "Dark2") 
# Logistic equation
imc <- 7
df_lim <- df[which(df$time<=imc),]
n1 <- nls(formula= peso ~ A/(1+exp(K*(B-time))), data=df_lim,
          start=list(A=50,B=9,K=0.3), trace=FALSE)
plot(peso~time, data=df, xaxt='n',cex.axis=1.4,pch=1, xlim=c(0,105))
points(df_lim$time,df_lim$peso, col=2, pch=19)
A=coef(n1)[1]; B=coef(n1)[2]; K=coef(n1)[3]
curve( A/(1+exp(K*(B-x))), add=TRUE, col=my_colors[1], lty=26, lwd=3.5, type="l")
# Gompertz
n2 <- nls(formula= peso ~ A*exp(-exp(K*(time-B))), data=df_lim,
          start=list(A=10,B=4,K=0.17), trace=FALSE)
A=coef(n2)[1]; B=coef(n2)[2]; K=coef(n2)[3]
curve( A*exp(-exp(K*(x-B))), add=TRUE, col=my_colors[7], lty=2,lwd=3.5)
# Original
curve(mich(x,w0,alpha,kappa,beta),0,max(x), add=TRUE, lwd=2.5)
# von Bertalanffy
n3 <- nls(formula= peso ~ A*(1-exp(-K*(time-B)))^3, data=df_lim,
          start=list(A=43,B=1.8,K=0.21), trace=FALSE)
A=coef(n3)[1]; B=coef(n3)[2]; K=coef(n3)[3]
curve( A*(1-exp(-K*(x-B)))^3, add=TRUE, col=my_colors[3], lty=2,lwd=3.5)
# Richards
n6 <- nls(formula= peso ~ A*(1+(D-1)*exp(-K*(time-B)))^(1/(1-D)), data=df_lim,
          start=list(A=40,D=0.3,K=0.05,B=6.6), trace=FALSE)
A=coef(n6)[1]; D=coef(n6)[2]; K=coef(n6)[3]; B=coef(n6)[4]
curve( A*(1+(D-1)*exp(-K*(x-B)))^(1/(1-D)), add=TRUE, col=my_colors[4], lty=2,lwd=3.5)
# Weibull
n4 <- nls(formula= peso ~ A*(1-exp(-B*(time^D))), data=df_lim,
          start=list(A=50,B=.021,D=1.48), trace=FALSE)
A=coef(n4)[1]; B=coef(n4)[2]; D=coef(n4)[3]
curve( A*(1-exp(-B*(x^D))), add=TRUE, col=my_colors[5], lty=2,lwd=3.5)
# Morgan Mercer Flodin
n5 <- nls(formula= peso ~ A-((A-B)/(1+(K*time)^D)), data=df_lim,
          start=list(A=45,B=0.2,K=0.07,D=2.04), trace=FALSE)
A=coef(n5)[1]; B=coef(n5)[2]; K= coef(n5)[3] ; D=coef(n5)[4]
curve( A-((A-B)/(1+(K*x)^D)), add=TRUE, col=my_colors[6], lty=2,lwd=3.5)
# Michaelis-Menten equation growth curve
n0 <- nls(formula=peso ~ (0.2*B^K + w*time^K)/(B^K + time^K), data=df_lim,
          start=list(w=40,B=16.7,K=1.4), trace=FALSE)
w=coef(n0)[1]; B=coef(n0)[2]; K=coef(n0)[3]
curve( (0.2*B^K + w*x^K)/(B^K + x^K), add=TRUE, col=my_colors[2], lty=2, lwd=3.5)
points(df_lim$time,df_lim$peso, col=2, pch=19)
# Week label box
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=0)
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=rgb(0.1, 0, 0, 0.5))
text(x = 105,y = 40,labels = "7 weeks",cex=1.5,srt=90)
abline(v = 7,col="gray",lty=2)
#----------------------#----------------------#----------------------#----------------------
par(fig=c(0.06,0.52,0.39,0.70), new=TRUE)
# Logistic equation
imc <- 13
df_lim <- df[which(df$time<=imc),]
n1 <- nls(formula= peso ~ A/(1+exp(K*(B-time))), data=df_lim,
          start=list(A=50,B=9,K=0.3), trace=FALSE)
plot(peso~time, data=df, xaxt='n',cex.axis=1.4,pch=1, xlim=c(0,105))
points(df_lim$time,df_lim$peso, col=2, pch=19)
A=coef(n1)[1]; B=coef(n1)[2]; K=coef(n1)[3]
curve( A/(1+exp(K*(B-x))), add=TRUE, col=my_colors[1], lty=26, lwd=3.5, type="l")
# Gompertz
n2 <- nls(formula= peso ~ A*exp(-exp(K*(time-B))), data=df_lim,
          start=list(A=40,B=6,K=0.17), trace=FALSE)
A=coef(n2)[1]; B=coef(n2)[2]; K=coef(n2)[3]
curve( A*exp(-exp(K*(x-B))), add=TRUE, col=my_colors[7], lty=2,lwd=3.5)
# Original
curve(mich(x,w0,alpha,kappa,beta),0,max(x), add=TRUE, lwd=2.5)
# von Bertalanffy
n3 <- nls(formula= peso ~ A*(1-exp(-K*(time-B)))^3, data=df_lim,
          start=list(A=43,B=1.8,K=0.21), trace=FALSE)
A=coef(n3)[1]; B=coef(n3)[2]; K=coef(n3)[3]
curve( A*(1-exp(-K*(x-B)))^3, add=TRUE, col=my_colors[3], lty=2,lwd=3.5)
# Richards
n6 <- nls(formula= peso ~ A*(1+(D-1)*exp(-K*(time-B)))^(1/(1-D)), data=df_lim,
          start=list(A=40,D=0.3,K=0.05,B=6.6), trace=FALSE)
A=coef(n6)[1]; D=coef(n6)[2]; K=coef(n6)[3]; B=coef(n6)[4]
curve( A*(1+(D-1)*exp(-K*(x-B)))^(1/(1-D)), add=TRUE, col=my_colors[4], lty=2,lwd=3.5)
# Weibull
n4 <- nls(formula= peso ~ A*(1-exp(-B*(time^D))), data=df_lim,
          start=list(A=50,B=.021,D=1.48), trace=FALSE)
A=coef(n4)[1]; B=coef(n4)[2]; D=coef(n4)[3]
curve( A*(1-exp(-B*(x^D))), add=TRUE, col=my_colors[5], lty=2,lwd=3.5)
# Morgan Mercer Flodin
n5 <- nls(formula= peso ~ A-((A-B)/(1+(K*time)^D)), data=df_lim,
          start=list(A=45,B=0.2,K=0.07,D=2.04), trace=FALSE)
A=coef(n5)[1]; B=coef(n5)[2]; K= coef(n5)[3] ; D=coef(n5)[4]
curve( A-((A-B)/(1+(K*x)^D)), add=TRUE, col=my_colors[6], lty=2,lwd=3.5)
# Michaelis-Menten equation growth curve
n0 <- nls(formula=peso ~ (0.2*B^K + w*time^K)/(B^K + time^K), data=df_lim,
          start=list(w=40,B=16.7,K=1.4), trace=FALSE)
w=coef(n0)[1]; B=coef(n0)[2]; K=coef(n0)[3]
curve( (0.2*B^K + w*x^K)/(B^K + x^K), add=TRUE, col=my_colors[2], lty=2, lwd=3.5)

points(df_lim$time,df_lim$peso, col=2, pch=19)

# Week label box
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=0)
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=rgb(0.1, 0, 0, 0.5))
text(x = 105,y = 40,labels = "13 weeks",cex=1.5,srt=90)
mtext(text="Shrimp weight (g)",side=2,line=2.5,at=40,
      cex=1.3)
abline(v = 13,col="gray",lty=2)

#----------------------#----------------------#----------------------#----------------------
par(fig=c(0.06,0.52,0.08,0.39), new=TRUE)
# Logistic equation
imc <- 18
df_lim <- df[which(df$time<=imc),]
n1 <- nls(formula= peso ~ A/(1+exp(K*(B-time))), data=df_lim,
          start=list(A=60,B=9,K=0.3), trace=FALSE)
plot(peso~time, data=df,xaxt="n", cex.axis=1.4,pch=1, xlim=c(0,105))
axis(1,cex.axis=1.4)
points(df_lim$time,df_lim$peso, col=2, pch=19)
A=coef(n1)[1]; B=coef(n1)[2]; K=coef(n1)[3]
curve( A/(1+exp(K*(B-x))), add=TRUE, col=my_colors[1], lty=26, lwd=3.5, type="l")
mtext(text="Weeks after hatching",side=1,line=2.5,at=100,
      cex=1.3)
# Gompertz
n2 <- nls(formula= peso ~ A*exp(-exp(K*(time-B))), data=df_lim,
          start=list(A=70,B=18,K=0.08), trace=FALSE)
A=coef(n2)[1]; B=coef(n2)[2]; K=coef(n2)[3]
curve( A*exp(-exp(K*(x-B))), add=TRUE, col=my_colors[7], lty=2,lwd=3.5)
# Original
curve(mich(x,w0,alpha,kappa,beta),0,max(x), add=TRUE, lwd=2.5)
# von Bertalanffy
n3 <- nls(formula= peso ~ A*(1-exp(-K*(time-B)))^3, data=df_lim,
          start=list(A=73,B=5,K=0.05), trace=FALSE)
A=coef(n3)[1]; B=coef(n3)[2]; K=coef(n3)[3]
curve( A*(1-exp(-K*(x-B)))^3, add=TRUE, col=my_colors[3], lty=2,lwd=3.5)
# Richards
n6 <- nls(formula= peso ~ A*(1+(D-1)*exp(-K*(time-B)))^(1/(1-D)), data=df_lim,
          start=list(A=75,D=0.4,K=0.04,B=12.5), trace=FALSE)
A=coef(n6)[1]; D=coef(n6)[2]; K=coef(n6)[3]; B=coef(n6)[4]
curve( A*(1+(D-1)*exp(-K*(x-B)))^(1/(1-D)), add=TRUE, col=my_colors[4], lty=2,lwd=3.5)
# Weibull
n4 <- nls(formula= peso ~ A*(1-exp(-B*(time^D))), data=df_lim,
          start=list(A=66,B=.02,D=1.07), trace=FALSE)
A=coef(n4)[1]; B=coef(n4)[2]; D=coef(n4)[3]
curve( A*(1-exp(-B*(x^D))), add=TRUE, col=my_colors[5], lty=2,lwd=3.5)
# Morgan Mercer Flodin
n5 <- nls(formula= peso ~ A-((A-B)/(1+(K*time)^D)), data=df_lim,
          start=list(A=78,B=.21,K=0.03,D=1.45), trace=FALSE)
A=coef(n5)[1]; B=coef(n5)[2]; K= coef(n5)[3] ; D=coef(n5)[4]
curve( A-((A-B)/(1+(K*x)^D)), add=TRUE, col=my_colors[6], lty=2,lwd=3.5)
# Michaelis-Menten equation growth curve
n0 <- nls(formula=peso ~ (0.2*B^K + w*time^K)/(B^K + time^K), data=df_lim,
          start=list(w=80,B=40,K=1.2), trace=FALSE)
w=coef(n0)[1]; B=coef(n0)[2]; K=coef(n0)[3]
curve( (0.2*B^K + w*x^K)/(B^K + x^K), add=TRUE, col=my_colors[2], lty=2, lwd=3.5)

points(df_lim$time,df_lim$peso, col=2, pch=19)

# Week label box
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=0)
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=rgb(0.1, 0, 0, 0.5))
text(x = 105,y = 40,labels = "18 weeks",cex=1.5,srt=90)

abline(v = 18,col="gray",lty=2)

#----------------------#----------------------#----------------------#----------------------
par(fig=c(0.06+0.46,0.98,0.08,0.39), new=TRUE)

# Logistic equation
imc <- 82
df_lim <- df[which(df$time<=imc),]
n1 <- nls(formula= peso ~ A/(1+exp(K*(B-time))), data=df_lim,
          start=list(A=50,B=9,K=0.3), trace=FALSE)
plot(peso~time, data=df, cex.axis=1,cex.lab=1.5,
     xlab="",yaxt='n',xaxt='n',pch=1, xlim=c(0,105))
axis(1,cex.axis=1.4)
points(df_lim$time,df_lim$peso, col=2, pch=19)
A=coef(n1)[1]; B=coef(n1)[2]; K=coef(n1)[3]
curve( A/(1+exp(K*(B-x))), add=TRUE, col=my_colors[1], lty=26, lwd=3.5, type="l")
# Gompertz
n2 <- nls(formula= peso ~ A*exp(-exp(K*(time-B))), data=df_lim,
          start=list(A=48,B=11,K=0.15), trace=FALSE)
A=coef(n2)[1]; B=coef(n2)[2]; K=coef(n2)[3]
curve( A*exp(-exp(K*(x-B))), add=TRUE, col=my_colors[7], lty=2,lwd=3.5)
# Original
curve(mich(x,w0,alpha,kappa,beta),0,max(x), add=TRUE, lwd=2.5)
# von Bertalanffy
n3 <- nls(formula= peso ~ A*(1-exp(-K*(time-B)))^3, data=df_lim,
          start=list(A=43,B=1.8,K=0.21), trace=FALSE)
A=coef(n3)[1]; B=coef(n3)[2]; K=coef(n3)[3]
curve( A*(1-exp(-K*(x-B)))^3, add=TRUE, col=my_colors[3], lty=2,lwd=3.5)
# Weibull
n4 <- nls(formula= peso ~ A*(1-exp(-B*(time^D))), data=df_lim,
          start=list(A=50,B=.021,D=1.48), trace=FALSE)
A=coef(n4)[1]; B=coef(n4)[2]; D=coef(n4)[3]
curve( A*(1-exp(-B*(x^D))), add=TRUE, col=my_colors[5], lty=2,lwd=3.5)
# Morgan Mercer Flodin
n5 <- nls(formula= peso ~ A-((A-B)/(1+(K*time)^D)), data=df_lim,
          start=list(A=45,B=0.2,K=0.07,D=2.04), trace=FALSE)
A=coef(n5)[1]; B=coef(n5)[2]; K= coef(n5)[3] ; D=coef(n5)[4]
curve( A-((A-B)/(1+(K*x)^D)), add=TRUE, col=my_colors[6], lty=2,lwd=3.5)
# Michaelis-Menten equation growth curve
n0 <- nls(formula=peso ~ (0.2*B^K + w*time^K)/(B^K + time^K), data=df_lim,
          start=list(w=60,B=16.7,K=1.4), trace=FALSE)
w=coef(n0)[1]; B=coef(n0)[2]; K=coef(n0)[3]
curve( (0.2*B^K + w*x^K)/(B^K + x^K), add=TRUE, col=my_colors[2], lty=2, lwd=3.5)

points(df_lim$time,df_lim$peso, col=2, pch=19)

# Week label box
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=0)
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=rgb(0.1, 0, 0, 0.5))
text(x = 105,y = 40,labels = "82 weeks",cex=1.5,srt=90)

abline(v = 82,col="gray",lty=2)


#----------------------#----------------------#----------------------#----------------------
par(fig=c(0.06+0.46,0.98,0.39,0.70), new=TRUE)
# Logistic equation
imc <- 36
df_lim <- df[which(df$time<=imc),]
n1 <- nls(formula= peso ~ A/(1+exp(K*(B-time))), data=df_lim,
          start=list(A=50,B=9,K=0.3), trace=FALSE)
plot(peso~time, data=df,yaxt='n', xaxt='n',cex.axis=1,pch=1, xlim=c(0,105))
points(df_lim$time,df_lim$peso, col=2, pch=19)
# 1-deviance(n1)/deviance(lm(peso~1, df_lim))
A=coef(n1)[1]; B=coef(n1)[2]; K=coef(n1)[3]
curve( A/(1+exp(K*(B-x))), add=TRUE, col=my_colors[1], lty=26, lwd=3.5, type="l")
# Gompertz
n2 <- nls(formula= peso ~ A*exp(-exp(K*(time-B))), data=df_lim,
          start=list(A=48,B=11,K=0.15), trace=FALSE)
A=coef(n2)[1]; B=coef(n2)[2]; K=coef(n2)[3]
curve( A*exp(-exp(K*(x-B))), add=TRUE, col=my_colors[7], lty=2,lwd=3.5)
# Original
curve(mich(x,w0,alpha,kappa,beta),0,max(x), add=TRUE, lwd=2.5)
# von Bertalanffy
n3 <- nls(formula= peso ~ A*(1-exp(-K*(time-B)))^3, data=df_lim,
          start=list(A=43,B=1.8,K=0.21), trace=FALSE)
A=coef(n3)[1]; B=coef(n3)[2]; K=coef(n3)[3]
curve( A*(1-exp(-K*(x-B)))^3, add=TRUE, col=my_colors[3], lty=2,lwd=3.5)
# Weibull
n4 <- nls(formula= peso ~ A*(1-exp(-B*(time^D))), data=df_lim,
          start=list(A=50,B=.021,D=1.48), trace=FALSE)
A=coef(n4)[1]; B=coef(n4)[2]; D=coef(n4)[3]
curve( A*(1-exp(-B*(x^D))), add=TRUE, col=my_colors[5], lty=2,lwd=3.5)
# Morgan Mercer Flodin
n5 <- nls(formula= peso ~ A-((A-B)/(1+(K*time)^D)), data=df_lim,
          start=list(A=80,B=0.2,K=0.07,D=2.04), trace=FALSE)
A=coef(n5)[1]; B=coef(n5)[2]; K= coef(n5)[3] ; D=coef(n5)[4]
curve( A-((A-B)/(1+(K*x)^D)), add=TRUE, col=my_colors[6], lty=2,lwd=3.5)
# Michaelis-Menten equation growth curve
n0 <- nls(formula=peso ~ (0.2*B^K + w*time^K)/(B^K + time^K), data=df_lim,
          start=list(w=40,B=16.7,K=1.4), trace=FALSE)
w=coef(n0)[1]; B=coef(n0)[2]; K=coef(n0)[3]
curve( (0.2*B^K + w*x^K)/(B^K + x^K), add=TRUE, col=my_colors[2], lty=2, lwd=3.5)

points(df_lim$time,df_lim$peso, col=2, pch=19)

# Week label box
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=0)
rect(xleft = 102,ybottom =  -5, 
     xright =110,ytop = 80,
     col=rgb(0.1, 0, 0, 0.5))
text(x = 105,y = 40,labels = "36 weeks",cex=1.5,srt=90)

abline(v = 36,col="gray",lty=2)
# This must be here for the legends to work
mtext(text="",side=4,line=0.5,at=35,cex=1.3)
#----------------------#----------------------#----------------------#----------------------
par(fig=c(0.06+0.46,0.98,0.70,0.99), new=TRUE)

legend(x=-5 , y=70 , c("Original curve"),  lty=rep(1,1),
       inset=c(0,1.1), xpd=FALSE, horiz=TRUE, bty="n",col=c("black"),lwd=4, cex=1.5)
legend(x=55 , y=70 , c("Michaelis-Menten"),  lty=rep(1,1),
       inset=c(0,1.1), xpd=FALSE, horiz=TRUE, bty="n",col=c(my_colors[c(2)]),lwd=4, cex=1.5)

legend(x=-5 , y=60, c("Morgan.M.Flodin"),  lty=rep(1,1),
       inset=c(0,1.1), xpd=FALSE, horiz=TRUE, bty="n",col=c(my_colors[c(6)]),lwd=4, cex=1.5)
legend(x=55 , y=60, c("Richards"),  lty=rep(1,1),
       inset=c(0,1.1), xpd=FALSE, horiz=TRUE, bty="n",col=c(my_colors[c(4)]),lwd=4, cex=1.5)

legend(x=-5 , y=50, c("Weibull"),  lty=rep(1,1),
       inset=c(0,1), xpd=FALSE, horiz=TRUE, bty="n",col=my_colors[c(5)],lwd=4, cex=1.5)
legend(x=55 , y=50, c("von Bertalanffy"),  lty=rep(1,1),
       inset=c(0,1), xpd=FALSE, horiz=TRUE, bty="n",col=my_colors[c(3)],lwd=4, cex=1.5)

# Note: colors matched to the curves above (Logistic = my_colors[1], Gompertz = my_colors[7])
legend(x=-5 , y=40, c("Gompertz"),  lty=rep(1,1),
       inset=c(0,1), xpd=FALSE, horiz=TRUE, bty="n",col=my_colors[c(7)],lwd=4, cex=1.5)
legend(x=55 , y=40, c("Logistic"),  lty=rep(1,1),
       inset=c(0,1), xpd=FALSE, horiz=TRUE, bty="n",col=my_colors[c(1)],lwd=4, cex=1.5)

legend(x=25 , y=30, c("Complete data", "Limited data"), pch = c(1,19),
       inset=c(0,1), xpd=FALSE, horiz=FALSE, col=c(1,2),cex=1.5, bty="n")
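
A quick way to quantify this underestimation, rather than only inspecting the panels, is to refit one of the models at each truncation point and track its estimated asymptote. The sketch below reuses the df object and the logistic starting values from the simulation above; the true asymptotic weight configured in the simulator is alpha = 90.58 g, and the shorter the truncation, the more severe the underestimation tends to be.

# Estimated logistic asymptote A at each truncation point
# (reuses `df` from the simulation above; returns NA if a fit fails to converge)
trunc_weeks <- c(7, 13, 18, 36, 82)
A_hat <- sapply(trunc_weeks, function(imc) {
  d <- df[df$time <= imc, ]
  fit <- try(nls(peso ~ A/(1 + exp(K*(B - time))), data = d,
                 start = list(A = 50, B = 9, K = 0.3)), silent = TRUE)
  if (inherits(fit, "try-error")) NA else unname(coef(fit)["A"])
})
data.frame(weeks = trunc_weeks, A_hat = round(A_hat, 1), true_alpha = 90.58)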

Bias Correction in Parameter Estimation

To correct the underestimation of parameters in nonlinear growth models, Zarzar and collaborators (2023) proposed a hierarchical Bayesian growth model (written in the Stan language, via the brms R interface) that incorporates information from wild animals (obtained from fishing data) as prior information. Although the farming environment is completely different from the wild environment, information on the same species sourced from fishing data helps produce more accurate parameter estimates for aquaculture growth models. By incorporating these insights into the prior distributions of the Bayesian model, we correct the potential parameter underestimation caused by incomplete or limited data.
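
As a sketch of that idea (with made-up numbers standing in for the weights of wild-caught adults from fishing records), summary statistics of such data can be turned directly into an informative prior on the asymptotic weight:

suppressMessages(library(brms))
# Hypothetical weights (g) of wild-caught adult shrimp from fishing records
wild_weights <- c(78, 92, 85, 88, 95, 81)
# Encode them as an informative prior for the asymptotic weight alpha
prior_alpha <- set_prior(
  sprintf("normal(%.1f, %.1f)", mean(wild_weights), sd(wild_weights)),
  nlpar = "alpha", lb = 0
)
prior_alpha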

Below I provide an R script that fits a Bayesian nonlinear model (via the brms package) to growth data for Pacific white shrimp. The limited/incomplete data were simulated from the von Bertalanffy equation and represent the weight of shrimp over one production cycle in a pond (tank) on a farm in northeastern Brazil. The data were restricted to weights below 16 grams, and a nonlinear Bayesian model (von Bertalanffy) was fitted. Based on the known parameters configured in the simulator, we define the model priors used for bias correction. The posterior predictive density of the model was computed and is shown in the plot below.

# Simulated data from von Bertalanffy function
rm(list = ls())
#================
# Simulating data
#================
set.seed(135)
## Model parameters
alpha = 85
kappa = 0.013
beta = 1.6 
# Growth function
growth <- function(x,alpha,kappa,beta){
  y <- alpha * (1 - exp(-kappa*(x+beta)))^3 # von Bertalanffy
  return(y)
}
x <- seq(0,350,10) # time (e.g. weeks, days)
y <- growth(x,alpha,kappa,beta) # weight (grams)
# summary(y)
# Selecting limited or incomplete data
df <- data.frame(time = x, peso = y+rnorm(n = length(x),sd=0.08*y)) # try other sigma values, e.g. 0.25
df$limite <- ifelse(df$peso < 16, TRUE, FALSE)

## Limited observations
# plot(df$time,df$peso)
# points(df$time[df$peso < 16], df$peso[df$peso < 16], col = "red", pch = 19)
# points(df$time[df$peso >= 16], df$peso[df$peso >= 16], col = "black", pch = 19)

#----------------------#----------------------#----------------------#----------------------

#----------------------------------
suppressMessages(suppressWarnings(library("brms")))
# Number of production cycles in the data (a single cycle here)
n_cultivo = 1
# Initial values (same starting point for every chain)
init <- function(chain_id=1) {
  list ( alpha  =  rep(85,n_cultivo) ,
         kappa  =  rep(0.013,n_cultivo) ,
         beta  =  rep(1.6,n_cultivo))
}

#----------------------------------
# NUTS configuration
control = list(adapt_delta = 0.9)
#----------------------------------
#### Models ####
# Priors
prior = c(
  set_prior("normal(85,5)", nlpar = "alpha"),
  set_prior("normal(0.013,0.001)", nlpar = "kappa",lb = 0, ub = 0.5),
  set_prior("normal(1.6, 0.1)", nlpar = "beta")
)
# prior. <- c(
#   # Population mean
#   set_prior("normal(0.2,0.01)", nlpar = "w0", lb = 0 , ub=0.5), # normal(100,5) / lognormal(4.37, 0.31) / normal(83.27,13.39)
#   set_prior("normal(90,5)", nlpar = "alpha", lb = 0 , ub=120), # normal(100,5) / lognormal(4.37, 0.31) / normal(83.27,13.39)
#   set_prior("normal(47,2)", nlpar = "beta", lb = 0, ub = 100),
#   set_prior("normal(1.2,0.05)", nlpar = "kappa", lb = 1.01, ub = 5) # skew_normal(1.89,0.15,-2.0)
# )
#----------------------------------
# Formula  bf() = brmsformula()
formula <- bf(
  # peso ~ (w0*beta^kappa + alpha*time^kappa)/(beta^kappa + time^kappa), # Michaelis-Menten function
  peso ~ alpha * (1 - exp(-kappa*(time+beta)))^3, # von Bertalanffy
  alpha + kappa + beta + sigma ~ 1,
  nl = TRUE)
#----------------------------------
# Fitting a model (same model with different priors and fewer observations)
fit <- brm(
  formula,
  family=gaussian(), 
  data = df[which(df$limite == TRUE),], # Limited data
  prior = prior,
  control = control,
  init=init,
  # chains = 4,
  # thin = 2,
  # iter = (1000*4), # (2500*4),
  save_model = "Saves/Michaelis_model",
  file = "Saves/fit2"
  # save_all_pars = T, 
  # seed = 1622,
  # cores = parallel::detectCores() # install.packages(parallel)
)
# Plot
# plot(conditional_effects(fit), points = TRUE)
#---------------------------------------
### Model prediction ####
# Data for prediction
# Prediction data frame
newdf <- df
# Posterior predictive of the model for the unseen data (cycle or culture)
ppdf2 <- posterior_predict(fit, newdata = newdf, 
                           re_formula = NULL,
                           allow_new_levels=TRUE)
# head(ppdf2)
# dim(ppdf2)
# Helper functions for building the data frame
stderror <- function(x) sd(x)/sqrt(length(x))
Q2.5 <- function(x) quantile(x, prob=0.025)
Q97.5 <- function(x) quantile(x, prob=0.975)
# Data frame of posterior predictive estimates
ppdf2 <- cbind(newdf,
               data.frame(
                 Estimate = apply(ppdf2, 2, median),
                 Est.Error = apply(ppdf2, 2, stderror),
                 Q2.5 = apply(ppdf2, 2, Q2.5),
                 Q97.5 = apply(ppdf2, 2, Q97.5)
               )
)
newdf <- ppdf2
### Plot
library(ggplot2)
ggplot(newdf, aes(time, Estimate)) +
  geom_smooth(data=newdf,aes(ymin = Q2.5, ymax = Q97.5),
              stat = "identity") +
  # facet_wrap(vars(cultivo))+
  geom_point(data=df,aes(time, peso))+
  geom_point(data=df[which(df$limite == TRUE),],aes(time, peso),col="red")+
  ylab("Estimates weight (grams)")+xlab("Days")

Hierarchical Bayesian Modeling

Figure 1. Pacific white shrimp (Litopenaeus vannamei) grown in ponds (excavated tanks) on a marine shrimp farm.

Aquaculture data naturally exhibit a hierarchical structure, which our proposed model should take into account. Within a farm, the data are hierarchically structured across production cycles, ponds (tanks), and the farm level. By delineating the data hierarchically, we can analyze growth curves per production cycle, per pond (tank), and, if it is of interest for management decision-making, at the farm level.

Robust models that capture growth curves at multiple levels gain inferential benefits from the dependencies between these units, which is the statistical strength of hierarchical models. What I am trying to say is that the information from one pond (tank) can contribute to estimating the growth curve of another pond (after all, they are ponds from the same farm), just as information on past production cycles of a pond can contribute to predicting future cycles of that same pond. Therefore, instead of fitting separate models for each level, we can bring all the information together in a multilevel model that is more robust and potentially more accurate when predicting beyond what was observed in the data set.
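
In brms this pooling of information is expressed through group-level terms on the nonlinear parameters. A minimal sketch of the nested pond/cycle structure is shown below (the full fitted version appears in the Weibull example later in this post):

suppressMessages(library(brms))
# beta and delta vary by pond and by cycle within pond, so ponds and cycles
# borrow strength from one another (partial pooling); alpha is shared.
bf(weight ~ alpha*(1-exp(-(beta/10000)*(time^delta)))+0.2, # Weibull
   alpha ~ 1,
   beta + delta ~ (1 | pond/cycle),
   nl = TRUE)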

Below we have an R script that simulates data from a shrimp farm while capturing this hierarchical structure of aquaculture data. We simulate 2 ponds (tanks) with 3 production cycles each (per year), whose data correspond to weekly biometrics (weight in grams) throughout the cultivation period, generated from the Morgan-Mercer-Flodin growth function.

##### Generating the data for the reproducible code #####

# Date: 17/06/2022

#----------------------#----------------------#----------------------

# Simulating data:

rm(list = ls())
#================
# Simulating data
#================
set.seed(1345)
## Parameters
w0 = 0
alpha = 90 # 90.58
kappa = 0.043 # 0.04
delta = 1.8 # 1.67
# Heteroscedasticity
tau = 0.055

w0 = 0
n_cycle=3
n_pond=2
# List of parameters for each production cycle and tank
list_df <- list(pond_1=data.frame(w0 = rep(w0,n_cycle),
                                  alpha = rnorm(n_cycle,alpha,15),
                                  kappa = rnorm(n_cycle,kappa,0.01),
                                  delta = rnorm(n_cycle,delta,.2),
                                  tau = rnorm(n_cycle,tau,0.01)),
                pond_2=data.frame(w0 = rep(w0,n_cycle),
                                  alpha = rnorm(n_cycle,alpha-10,20),
                                  kappa = rnorm(n_cycle,kappa-0.015,0.01),
                                  delta = rnorm(n_cycle,delta-0.2,0.2),
                                  tau = rnorm(n_cycle,tau-0.015,0.01))
)
# list_df
# Mean and standard deviation of parameters for each tank
# lapply(list_df,function(x){apply(x,2,mean)})
# lapply(list_df,function(x){apply(x,2,sd)})

t <- seq(1,100,1) # times (weeks)
# MMF function
df <- lapply(list_df,
             function(x){apply(x, 1, function(x){
               y <- x[2] - ((x[2])/(1+(x[3]*t)^x[4]))
               y <- y+rnorm(length(t),
                            0,
                            x[5]*t*(x[2]*x[4]*x[3]*(x[3]*t)^(x[4]-1) )/( ((x[3]*t)^x[4])^2+2*(x[3]*t)^x[4]+1) ) # random error
             })}
)
# Data frame biometric
bio <- data.frame(time = rep(1:length(t),n_cycle*n_pond),
                  peso = unlist(df, use.names=FALSE),
                  tq = rep(1:n_pond,each=length(t)*n_cycle) ,
                  cultivo = rep(1:(n_cycle*n_pond),each=length(t)))
# str(bio)
bio$pond_factor <- as.factor(bio$tq)
bio$cycle_factor <- as.factor(bio$cultivo)
# Graphics
library(ggplot2)
library(gridExtra)
# Farm level
g1 <- ggplot(bio,aes(x=time,y=peso))+
  geom_point()+
  geom_smooth(method = "loess")
# Pond (tank) level
g2 <-  ggplot(bio, aes(x = time, y = peso, color = pond_factor)) +
  geom_point() +
  geom_smooth(method = "loess") +
  theme(legend.position = "bottom",       # Legend position
        legend.direction = "horizontal",  # Legend orientation
        legend.box = "horizontal",        # Legend box style
        legend.key.width = unit(0.5, "cm"))  # Legend key width
# cycle production level
g3 <-  ggplot(bio, aes(x = time, y = peso, color = cycle_factor)) +
  geom_point() +
  geom_smooth(method = "loess") +
  theme(legend.position = "bottom",       # Legend position
        legend.direction = "horizontal",  # Legend orientation
        legend.box = "horizontal",        # Legend box style
        legend.key.width = unit(0.5, "cm"))  # Legend key width

suppressMessages(suppressWarnings(grid.arrange(g1,g2,g3,ncol=3)))

#-----------------------#-----------------------#-----------------------
# bio <- bio[which(bio$peso <= 30),c(1:4)]
# save(bio,file="bio.RData")

Below we have an R script that simulates data from a Weibull growth model, again assuming the hierarchical structure of aquaculture data: 2 ponds (tanks) with 3 production cycles each (per year) on a shrimp farm, this time generated from the Weibull growth function so that the reader can try a different model. We then fit a hierarchical Bayesian nonlinear Weibull growth model to the simulated data; the objective is to recover the parameters configured in the simulation.

Data simulation

# Simulating data for the hierarchical Weibull model
# Weibull growth curve
# Simulated data for the study of priors

# Date: 18/05/2024
# Author: Carlos A. Zarzar
# 
#----------------------#----------------------#----------------------#------# Preamble
suppressMessages(library(sn))
library(truncnorm)
library(ggplot2)
# rm(list = ls())
set.seed(1054)
# Simulating weight values for the multilevel model
n_cycle=3 # 3
n_pond=2 # 2

n = n_cycle*n_pond # Number of simulated series
## Alpha
a_p <- rnorm(n,85,5) # 83.27/13.39

## w0 (omega)
w_p <- truncnorm::rtruncnorm(n,0,0.2) # Population
w_c <- truncnorm::rtruncnorm(n,0,0.002) # Group - cycle (cultivo)

## Beta
# b_p <- rnorm(n,0.060,.001) # Population
b_p <- sn::rsn(n=n,xi=1.3,omega=3,alpha=6) # Population
b_tq <- rnorm(n,0,.0005) # Group - pond (tq)
b_tq_c <- rnorm(n,0,.0005) # Group - pond:cycle

## Delta
d_p <- rnorm(n,1.5,0.05) # Population
# summary(d_p)
d_tq <- rnorm(n,0,0.008) # Group - pond (tq)
d_tq_c <- rnorm(n,0,.0005) # Group - pond:cycle

## Combining everything into a data frame
df_comb <- data.frame(
  alpha = a_p,
  delta = d_p + d_tq + d_tq_c,
  beta = b_p + b_tq + b_tq_c,
  omega = 0.2 # w_p # + w_c
)
# summary(df_comb)
df_comb <- list(df_comb)
## Weibull growth
t <- seq(0,200,10) # time in days
# x[1] - alpha
# x[2] - delta
# x[3] - beta
# x[4] - omega

## Weibull
df2 <- lapply(df_comb,
              function(x){apply(x, 1, function(x){
                y <- x[1]*(1-exp(-(x[3]/10000)*(t^x[2])))+x[4] # Weibull
              })}
)
# df2
# lengths(df2)
# dim(df2[[1]])
# Biometric data frame per cycle (cultivo)
bio <- data.frame(time = rep(t,dim(df2[[1]])[2]),
                  weight = unlist(df2, use.names=FALSE),
                  cycle = rep(1:dim(df2[[1]])[2],each=length(t)),
                  pond = rep(1:n_pond,each=length(t)))
bio$cycle <- as.factor(bio$cycle)
bio$pond <- as.factor(bio$pond)
# Table
library(knitr)
knitr::kable(head(bio), col.names = gsub("[.]", " ", names(bio)))
time   weight      cycle   pond
   0    0.200000       1      1
  10    1.105458       1      1
  20    2.785034       1      1
  30    4.944468       1      1
  40    7.459065       1      1
  50   10.245662       1      1

Fitting the model

# Hierarchical Weibull model
# Weibull

# Date: 18/05/2024
# Author: Carlos A. Zarzar
#-----------------------#-----------------------#-----------------------
# rm(list = ls())
# Preamble
suppressMessages(suppressWarnings( library("brms") ))
# library(rstan)
# options(mc.cores = parallel::detectCores())
# rstan_options(auto_write = TRUE)

# Initial values
init <- list (
  alpha  =  85,
  beta = 3.20,
  delta  =  1.4,
  sigma  =  0.055
)
inits_drift <- list(init,init,init,init)

# NUTS configuration
control. <- list(
  adapt_engaged = TRUE,
  adapt_delta = 0.90, 
  # stepsize = 0.25, 
  max_treedepth = 15
)

#### Models ####
# Priors
#fit_weib
prior. <- c(
  # Population mean
  set_prior("normal(85,5)", nlpar = "alpha", lb = 0 , ub=300), # normal(100,5) / lognormal(4.37, 0.31) / normal(83.27,13.39)
  set_prior("skew_normal(1.3,5,6)", nlpar = "beta", lb = 0, ub = 50),
  set_prior("normal(1.5,0.05)", nlpar = "delta", lb = 1.01, ub = 5), # skew_normal(1.89,0.15,-2.0)
  # Pond sd
  set_prior("normal(0,0.008)", nlpar = "delta",  class="sd", group = "pond" , lb = 0, ub = 1),
  set_prior("normal(0,.0005)", nlpar = "beta",  class="sd", group = "pond", lb = 0, ub = 1),
  # Group pond:cycle
  set_prior("normal(0,0.005)", nlpar = "delta",  class="sd", group = "pond:cycle", lb = 0, ub = 1),
  set_prior("normal(0,.0005)", nlpar = "beta",  class="sd", group = "pond:cycle", lb = 0, ub = 1) #,
)

# Formula  bf() = brmsformula()
formula. <- bf(
  # location parameter
  weight ~ alpha*(1-exp(-(beta/10000)*(time^delta)))+0.2, # Weibull
  # Nonlinear variables
  alpha ~ 1, 
  beta + delta ~ (1 | pond/cycle),
  # Nonlinear model
  nl = TRUE)

# Fitting a model (same model with different priors and fewer observations)
suppressMessages(suppressWarnings( 
        fit_weib <- brm(
        formula.,
        family=gaussian(), 
        data = bio, # df_test or bio
        prior = prior.,
        control = control.,
        init=inits_drift,
        chains = 4,
        # thin = 2,
        # iter = (2500*4),
        # save_model = "Script/3_weib/weibull_model",
        file = "Saves/fit_weib",
        # seed = 1622,
        # cores = 8 # getOption("mc.cores", 1)
      )
  ))
# Analysis of the results
summary(fit_weib)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: weight ~ alpha * (1 - exp(-(beta/10000) * (time^delta))) + 0.2 
##          alpha ~ 1
##          beta ~ (1 | pond/cycle)
##          delta ~ (1 | pond/cycle)
##    Data: bio (Number of observations: 126) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Group-Level Effects: 
## ~pond (Number of levels: 2) 
##                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(beta_Intercept)      0.00      0.00     0.00     0.00 1.00     2577     1679
## sd(delta_Intercept)     0.01      0.00     0.00     0.02 1.00     3012     1848
## 
## ~pond:cycle (Number of levels: 6) 
##                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(beta_Intercept)      0.00      0.00     0.00     0.00 1.00     2990     2032
## sd(delta_Intercept)     0.05      0.00     0.04     0.05 1.00     2657     2182
## 
## Population-Level Effects: 
##                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## alpha_Intercept    79.41      0.75    78.01    80.92 1.00     3262     2748
## beta_Intercept      2.99      0.18     2.67     3.36 1.00     3186     2568
## delta_Intercept     1.50      0.02     1.45     1.54 1.00     1792     2159
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.67      0.04     0.59     0.76 1.00     4908     2815
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Population level

#---------------------------------------#-------------------------------
# Population level
plot(conditional_effects(fit_weib), points = TRUE)

pp_check(fit_weib,ndraws = 100)

#---------------------------------------#-------------------------------
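
Beyond the population-level curve, the group-level deviations (pond and pond:cycle) and their standard deviations can be inspected directly; a quick check assuming fit_weib from above:

ranef(fit_weib)    # deviations of beta and delta for each pond and pond:cycle
VarCorr(fit_weib)  # posterior summaries of the group-level standard deviations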

Production cycle level

# Cycle level
set.seed(1007) # lower-accuracy estimates
# Test data for the cycle level
# unique(bio$cycle) # Ponds to be evaluated (3 final biometrics)
####### Generating the plot for the article
newdf <- subset(bio,
                cycle %in% unique(bio$cycle),
                c("time","weight","cycle","pond"))
# Posterior predictive of the model for the unseen data (cycle)
suppressMessages(suppressWarnings(library(rstantools)))
ppdf2 <- posterior_predict(fit_weib, newdata = newdf, 
                           re_formula = NULL,
                           allow_new_levels=TRUE)
# head(ppdf2)
# dim(ppdf2)

# Helper functions for building the data frame
stderror <- function(x) sd(x)/sqrt(length(x))
Q2.5 <- function(x) quantile(x, prob=0.025)
Q97.5 <- function(x) quantile(x, prob=0.975)

# Data frame of posterior predictive estimates
ppdf2 <- cbind(newdf,
               data.frame(
                 Estimate = apply(ppdf2, 2, median),
                 Est.Error = apply(ppdf2, 2, stderror),
                 Q2.5 = apply(ppdf2, 2, Q2.5),
                 Q97.5 = apply(ppdf2, 2, Q97.5)
               )
)

# ppdf2
newdf <- ppdf2

ggplot(newdf, aes(time, Estimate)) +
  geom_smooth(data=ppdf2,aes(ymin = Q2.5, ymax = Q97.5),
              stat = "identity") +
  facet_wrap(vars(cycle))+
  geom_point(data=bio[which(bio$cycle %in% unique(newdf$cycle)),],aes(time, weight))+
  geom_point(data=bio[which(bio$cycle %in% unique(newdf$cycle)),],aes(time, weight),col="red")+
  ylab("Estimates weight (grams)")+xlab("Days")

#---------------------------------------#-------------------------------

Pond (tank) level

#---------------------------------------#---------------------------------------
####### Generating the plot for the article
newdf <- subset(bio,
                pond %in% unique(bio$pond),
                c("time","weight","pond"))
#---------------------------------------
# Posterior predictive of the model for the unseen data (pond)
ppdf2 <- posterior_predict(fit_weib, newdata = newdf, 
                           re_formula = NULL,
                           allow_new_levels=TRUE)
# head(ppdf2)
# dim(ppdf2)

# Helper functions for building the data frame
stderror <- function(x) sd(x)/sqrt(length(x))
Q2.5 <- function(x) quantile(x, prob=0.025)
Q97.5 <- function(x) quantile(x, prob=0.975)
# Data frame of posterior predictive estimates
ppdf2 <- cbind(newdf,
               data.frame(
                 Estimate = apply(ppdf2, 2, median),
                 Est.Error = apply(ppdf2, 2, stderror),
                 Q2.5 = apply(ppdf2, 2, Q2.5),
                 Q97.5 = apply(ppdf2, 2, Q97.5)
               )
)

# ppdf2
newdf <- ppdf2
# head(newdf)

ggplot(newdf, aes(time, Estimate)) +
  geom_smooth(data=newdf,aes(ymin = Q2.5, ymax = Q97.5),
              stat = "identity") + 
  facet_wrap(vars(pond)) +
  geom_point(data=bio[which(bio$pond %in% unique(bio$pond)),],aes(time, weight),col="red")+
  ylab("Estimates weight (grams)")+xlab("Days")

Conclusion

In conclusion, our exploration of nonlinear growth modeling and Bayesian hierarchical approaches using Stan sheds light on crucial aspects of aquaculture growth analysis. By acknowledging the challenges posed by incomplete or limited data and addressing parameter underestimation biases, we pave the way for more accurate and better-informed decision-making in aquaculture management. Through the integration of advanced statistical methodologies, such as hierarchical modeling and Bayesian inference, we can unlock deeper insights into growth dynamics and support more sustainable production.

I hope this post has added something to your knowledge. Below you will find good references for delving deeper into all of the modeling covered here.

References

Zarzar, C. A., Silva, E. M., Fernandes, T. J., & De Oliveira, I. R. C. (2022). Evidence of parameters underestimation from nonlinear growth models for data classified as limited. Computers and Electronics in Agriculture, 200, 107196.

Zarzar, C. A., Fernandes, T. J., & de Oliveira, I. R. C. (2023). Modeling the growth of Pacific white shrimp (Litopenaeus vannamei) using the new Bayesian hierarchical approach based on correcting bias caused by incomplete or limited data. Ecological Informatics, 77, 102271.

Gelman, A., Hwang, J., & Vehtari, A. (2014). Understanding predictive information criteria for Bayesian models. Statistics and Computing, 24, 997-1016.

Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., & Gelman, A. (2019). Visualization in Bayesian workflow. Journal of the Royal Statistical Society Series A: Statistics in Society, 182(2), 389-402.

Gelman, A., Vehtari, A., Simpson, D., Margossian, C. C., Carpenter, B., Yao, Y., … & Modrák, M. (2020). Bayesian workflow. arXiv preprint arXiv:2011.01808.

Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 11(2), 5018-5051.

Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27, 1413-1432.

Sivula, T., Magnusson, M., Matamoros, A. A., & Vehtari, A. (2020). Uncertainty in Bayesian leave-one-out cross-validation based model comparison. arXiv preprint arXiv:2008.10296.