Original data: \(Y_{ij}\) is a random variable associated with the PCB concentration of fish \(j\) at age \(i\), for \(i = 1,\ldots, 12\) and \(j = 1,2,3\). Assume that \(Y_{ij}\) follows the true model \[\begin{equation} \label{eq:true_model} Y_{ij} = g(i,\beta) + \sigma_i \epsilon_{ij},\quad g(x,\beta) = \beta_0 x^{\beta_1},\quad \sigma_i = 0.2\, g(i,\beta)^{1.25} \end{equation}\]
# Mean function g(i, beta) = beta0 * i^beta1
g <- function(i, beta){
  beta[1] * i^beta[2]
}
# Standard deviation sigma_i = 0.2 * g(i, beta)^theta
sig <- function(i, beta, theta){
  .2 * g(i, beta)^theta
}
# Simulate an ngroup x nobs matrix of Y_ij from the true model
simdf <- function(beta, theta, ngroup, nobs){
  simdf <- rep(sig(1:ngroup, beta, theta), each = nobs) * rnorm(ngroup * nobs) +
    rep(g(1:ngroup, beta), each = nobs)
  simdf <- matrix(simdf, nrow = ngroup, ncol = nobs, byrow = TRUE)
  rownames(simdf) <- paste("Age", 1:ngroup)
  colnames(simdf) <- paste("Sim", 1:nobs)
  simdf
}
beta <- c(.9, 1.3)
theta <- 1.25
ngroup <- 12
nobs <- 3
df <-simdf(beta, theta, ngroup, nobs)
df
#> Sim 1 Sim 2 Sim 3
#> Age 1 0.9033467 0.8507356 0.9984951
#> Age 2 1.2923074 1.6851009 2.2794258
#> Age 3 2.7040723 1.4707111 5.3118565
#> Age 4 6.5374094 5.7034016 3.6468092
#> Age 5 9.3370048 6.5639641 10.0828591
#> Age 6 11.6529399 10.7226128 11.3606863
#> Age 7 3.0011910 12.3686324 9.4635935
#> Age 8 9.6740168 22.4326982 11.1432323
#> Age 9 19.4977778 25.4934411 11.9498960
#> Age 10 16.5652002 20.1134084 7.3869584
#> Age 11 30.1968810 20.2685602 22.2894937
#> Age 12 19.0482249 4.4001714 38.7836018
# Construct a scatter plot of concentration against age
matplot(df,type = 'p',pch = 16, col = 1, xlab = "Age i", ylab = "Simulated Yij")
\[\begin{align} MSE (\hat\beta_{0,U} )& = E(\hat\beta_{0, U} - \beta_0)^2 \\ MSE (\hat\beta_{1,U} )& = E(\hat\beta_{1, U} - \beta_1)^2 \end{align}\]
\[ b_M(\beta_{0,U}) = M^{-1} \sum_{m =1}^M (\hat \beta_{0,U}^{(m)} - \beta_0)\] where m denotes the Monte Carlo sample, and \(\hat \beta_{0,U}^{(m)}\) is the estimator of \(\beta_0\) based on Monte Carlo sample \(m\). Give a formula for the Monte Carlo approximation for the mean squared errors that you defined in 2 above.
Suppose there are in total \(M\) Monte Carlo samples, indexed by \(m = 1,\ldots, M\). From each sample we obtain an estimate \(\hat\beta_{0,U}^{(m)}\), and the MSE can be approximated by \[\begin{align} \widehat {MSE} (\hat\beta_{0,U} )& = \frac{1}{M} \sum_{m=1}^M (\hat\beta_{0, U}^{(m)} - \bar\beta_0)^2 \\ \widehat {MSE} (\hat\beta_{1,U} )& = \frac{1}{M} \sum_{m=1}^M (\hat\beta_{1, U}^{(m)} - \bar\beta_1)^2 \end{align}\] where \(\bar \beta_i = \frac{1}{M} \sum_{m=1}^M \hat\beta^{(m)}_{i, U},\ i = 0,1\). In our simulation study the true parameters are known, so we plug in the true values, \(\bar\beta_i = \beta_i,\ i = 0,1\).
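With the true values plugged in, the Monte Carlo MSE decomposes into the squared Monte Carlo bias plus the Monte Carlo variance (the code below uses the \(1/(M-1)\) sample variance, so the identity holds only approximately there); this gives a quick consistency check on the Bias, Variance, and MSE columns of the summary tables: \[\widehat{MSE}(\hat\beta_{i,U}) = \frac{1}{M}\sum_{m=1}^{M}\left(\hat\beta_{i,U}^{(m)} - \beta_i\right)^2 = \left\{b_M(\hat\beta_{i,U})\right\}^2 + \frac{1}{M}\sum_{m=1}^{M}\left(\hat\beta_{i,U}^{(m)} - \bar\beta_i\right)^2, \quad i = 0,1,\] where \(\bar\beta_i\) on the right-hand side denotes the Monte Carlo mean of the estimates.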
# Weighted squared-error criterion Q(beta)
Qb <- function(df, beta, weight){
  if (length(weight) != nrow(df)) {
    stop("length of weight does not match the number of groups")
  }
  sum(t(weight) %*% ((df - rep(g(1:nrow(df), beta), ncol(df)))^2))
}
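Written out, the criterion minimized to obtain \(\hat\beta_U\) is \[Q(\beta) = \sum_{i=1}^{12} w_i \sum_{j=1}^{3}\left(Y_{ij} - g(i,\beta)\right)^2, \qquad w_i = \frac{1}{s_i^2},\] where \(s_i^2\) is the sample variance of the three observations at age \(i\); the loop below supplies these weights and minimizes \(Q\) with `optim()`.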
# In each simulation, estimate beta by minimizing Qb and store the estimates
M <- 50
beta0.sim <- c(); beta1.sim <- c()
for (i in 1:M){
  df <- simdf(beta, theta, ngroup, nobs)
  s <- apply(df, 1, sd)          # within-age sample standard deviations
  weight <- 1/s^2                # weights w_i = 1/s_i^2
  beta.m <- optim(c(0, 0), function(x) Qb(df, x, weight))
  beta0.sim <- c(beta0.sim, beta.m$par[1])
  beta1.sim <- c(beta1.sim, beta.m$par[2])
}
# calculate bias, variance, and MSE
bias <- c(mean(beta0.sim - beta[1]), mean(beta1.sim - beta[2]))
var <- c(var(beta0.sim), var(beta1.sim))
mse <- c(mean((beta0.sim - beta[1])^2), mean((beta1.sim - beta[2])^2))
summary.U.50 <- round(data.frame("Bias" = bias, "Variance" = var, "MSE" = mse), 4)
rownames(summary.U.50) <- c("beta(0,U)", "beta(1,U)")
summary.U.50
#> Bias Variance MSE
#> beta(0,U) -0.0284 0.0146 0.0151
#> beta(1,U) 0.0290 0.0088 0.1927
Note the true model setting is equivalent to \[Y_{ij} = g_1(x_{ij},\beta) +\sigma g_2(x_{ij}, \beta, \theta)\epsilon_{ij}\] where \[g_1(x_{ij}, \beta) = \beta_0 x_{ij}^{\beta_1}, \quad g_2(x_{ij}, \beta, \theta) = g_1(x_{ij}, \beta)^\theta,\] with \(x_{ij} = i\) (the age), \(\theta = 1.25\), and \(\sigma = 0.2\). The objective is to estimate \(\beta_0, \beta_1\) in this model.
To obtain an initial estimate of \(\beta\), note that taking logarithms on both sides of \(g_1\) gives \(\log(g_1(x,\beta)) = \log(\beta_0) + \beta_1 \log(x)\). Since \(E(Y_{ij}) = g_1(i, \beta)\), we can regress \(\log(Y_{ij})\) on \(\log(i)\) by OLS to obtain starting values.
# Specify g1 (mean function), g2 (variance function), the derivative of g1
# with respect to beta, and the delta-method variance of the fitted mean
g1fun <- function(par, X){
  par[1] * X^par[2]
}
g2fun <- function(par, X, theta){
  g1fun(par, X)^theta
}
derfun <- function(par, X){
  cbind(X^par[2], par[1] * X^par[2] * log(X))
}
vhatmufun <- function(par, derfun, X, vcov){
  diag(derfun(par, X) %*% vcov %*% t(derfun(par, X)))
}
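The estimation below calls `glsiter`, the iterative generalized least squares routine defined earlier in this report. For reference, a minimal sketch of such an iteration with the same calling convention is given here; the name `glsiter_sketch`, the argument `niter`, and the internals are assumptions, not the actual implementation.
# Hypothetical sketch of an iterative GLS fitter with the calling convention
# glsiter(parinit, g1fun, g2fun, derfun, theta, Y, X, niter); the glsiter
# defined earlier in the report is the one actually used below.
glsiter_sketch <- function(parinit, g1fun, g2fun, derfun, theta, Y, X, niter){
  par <- parinit
  n <- length(Y); p <- length(par)
  for (k in 1:niter){
    w <- 1/g2fun(par, X, theta)^2                      # weights from the current variance function
    wrss <- function(b) sum(w * (Y - g1fun(b, X))^2)   # weighted residual sum of squares
    par <- optim(par, wrss)$par                        # update beta
  }
  res <- Y - g1fun(par, X)
  w <- 1/g2fun(par, X, theta)^2
  sig2hat <- sum(w * res^2)/(n - p)                    # estimate of sigma^2
  D <- derfun(par, X)                                  # n x p derivative matrix of g1
  Vhat <- sig2hat * solve(t(D) %*% (w * D))            # approximate covariance of beta-hat
  list(par = matrix(par, ncol = 1), sig2hat = sig2hat, Vhat = Vhat)
}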
# Monte Carlo and parameter estimation, fitting with theta = 1.25
beta <- c(.9, 1.3)
theta <- 1.25
ngroup <- 12
nobs <- 3
M <- 50
beta0.sim <- c(); beta1.sim <- c()
for (i in 1:M){
  df <- simdf(beta, theta, ngroup, nobs)
  # reshape the simulated matrix into two vectors (column-major, ages repeated per column)
  Y <- as.vector(df)
  X <- rep(1:ngroup, nobs)
  ## parameter initialization by OLS of log(Y) on log(X)
  ## NOTE: the log transform can produce NaN when a negative Y is generated,
  ## so Y is truncated below at 0.001
  res.ols <- lm(log(pmax(Y, .001)) ~ log(X))
  alphainit <- exp(summary(res.ols)$coeff[1])
  betainit <- summary(res.ols)$coeff[2]
  # parameter estimation by iterative GLS
  glsoutput <- glsiter(c(alphainit, betainit), g1fun, g2fun, derfun, theta = 1.25, Y, X, 3)
  beta0.sim <- c(beta0.sim, glsoutput$par[1])
  beta1.sim <- c(beta1.sim, glsoutput$par[2])
}
bias <- c(mean (beta0.sim - beta[1]), mean (beta1.sim - beta[2]))
var <- c(var(beta0.sim),var(beta1.sim))
mse <- c(mean((beta0.sim - beta[1])^2) ,mean((beta1.sim - beta[2])^2) )
summary.T.50 <- round(data.frame("Bias" = bias, "Variance" = var, "MSE" = mse),4)
rownames(summary.T.50) <- c("beta(0,T)", "beta(1,T)")
summary.T.50
#> Bias Variance MSE
#> beta(0,T) 0.0109 0.0053 0.0053
#> beta(1,T) -0.0046 0.0032 0.0031
# g1fun, g2fun, derfun, and vhatmufun are as defined above
# Monte Carlo and parameter estimation, now fitting with theta = 1
beta <- c(.9, 1.3)
theta <- 1.25
ngroup <- 12
nobs <- 3
M <- 50
beta0.sim <- c(); beta1.sim <- c()
for (i in 1:M){
  df <- simdf(beta, theta, ngroup, nobs)
  # reshape the simulated matrix into two vectors
  Y <- as.vector(df)
  X <- rep(1:ngroup, nobs)
  # parameter initialization by OLS of log(Y) on log(X), with Y truncated below at 0.001
  res.ols <- lm(log(pmax(Y, .001)) ~ log(X))
  alphainit <- exp(summary(res.ols)$coeff[1])
  betainit <- summary(res.ols)$coeff[2]
  # parameter estimation by iterative GLS with theta fixed at 1
  glsoutput <- glsiter(c(alphainit, betainit), g1fun, g2fun, derfun, theta = 1, Y, X, 3)
  beta0.sim <- c(beta0.sim, glsoutput$par[1])
  beta1.sim <- c(beta1.sim, glsoutput$par[2])
}
bias <- c(mean (beta0.sim - beta[1]), mean (beta1.sim - beta[2]))
var <- c(var(beta0.sim),var(beta1.sim))
mse <- c(mean((beta0.sim - beta[1])^2) ,mean((beta1.sim - beta[2])^2) )
summary.G.50 <- round(data.frame("Bias" = bias, "Variance" = var, "MSE" = mse),4)
rownames(summary.G.50) <- c("beta(0,G)","beta(1,G)")
summary.G.50
#> Bias Variance MSE
#> beta(0,G) 0.0065 0.0071 0.0070
#> beta(1,G) -0.0005 0.0038 0.0038
In summary, when \(M = 50\),
#> Bias Variance MSE
#> beta(0,U) -0.0284 0.0146 0.0151
#> beta(1,U) 0.0290 0.0088 0.1927
#> beta(0,G) 0.0065 0.0071 0.0070
#> beta(1,G) -0.0005 0.0038 0.0038
#> beta(0,T) 0.0109 0.0053 0.0053
#> beta(1,T) -0.0046 0.0032 0.0031
Similarly, we can perform the Monte Carlo study with \(M = 500\).
#> Bias Variance MSE
#> beta(0,U) 0.0126 0.0185 0.0187
#> beta(1,U) -0.0053 0.0107 0.0107
#> beta(0,G) 0.0023 0.0080 0.0079
#> beta(1,G) -0.0002 0.0048 0.0048
#> beta(0,T) 0.0003 0.0074 0.0074
#> beta(1,T) 0.0027 0.0041 0.0041
For ease of computation, Monte Carlo studies of size 50 and 500 are used here. Note that the MSEs of \(\hat\beta_G\) and \(\hat\beta_T\) are consistently smaller than those of \(\hat\beta_U\). Even though \(\hat\beta_G\) does not perfectly reflect the true model, modelling \(g_2\) as a function of \(g_1\) improves the efficiency of estimation significantly.
Let \(Y_i\) denote the \(i\)-th observation of OXY, and let \(X_i\) denote the \(i\)-th observation of body mass. Since our interest is in relating OXY concentration to body mass, we propose that the conditional distribution of \(Y_i\) given \(\theta_i\) follows an exponential dispersion family:
\[ f(y_i \mid \theta_i, \phi) = \exp\left(\phi\{y_i \theta_i - b(\theta_i)\} + c(y_i, \phi)\right), \quad y_i \in \Omega_y \]
Recall the moments of \(Y_i\): \[\begin{align} E(Y_i|\theta_i) & = b'(\theta_i) = \mu_i\\ V(Y_i|\theta_i) & =\frac{1}{\phi} b''(\theta_i) = \frac{1}{\phi}V(\mu_i) \end{align}\]
Since we apply a log link, we have \(\mu_i \equiv \exp(x_i^\top \beta) = \exp(\beta_0+ \beta_1 x_i)\). We also assume a gamma random component with shape \(\alpha\) and rate \(\lambda_i\), so that \[ \begin{aligned} f\left(y_{i} \mid \theta_{i}, \phi\right) &=\exp \left[\phi\left\{y_i \theta_{i}-b\left(\theta_{i}\right)\right\}+c\left(y_{i}, \phi\right)\right] \\ \phi &=\alpha, \quad \theta_{i}=-\frac{\lambda_{i}}{\alpha} \\ b\left(\theta_{i}\right) &=\log \left(-\frac{1}{\theta_{i}}\right) \\ c(y_i, \phi) &=-\log (\Gamma(\phi))+(\phi-1) \log \left(y_{i}\right)+\phi \log (\phi)\\ b^{\prime}\left(\theta_{i}\right) &=-\frac{1}{\theta_{i}} = \mu_i \\ \implies \theta_i& = -\frac{1}{\mu_i} = -\frac{1}{\exp(\beta_0+ \beta_1 x_i)} \end{aligned} \]
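For the gamma component, \(b''(\theta_i) = 1/\theta_i^2 = \mu_i^2\), so the variance function is quadratic in the mean, \[ V(\mu_i) = \mu_i^2 \quad\Longrightarrow\quad V(Y_i \mid \theta_i) = \frac{\mu_i^2}{\phi}, \] which is why the shape parameter used in the code below is recovered as the reciprocal of the dispersion reported by `summary()`.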
Note that the objective is to estimate \(P(Y> 6 \mid X = 1700)\), which can be written as \[\begin{align} q := P(Y> 6\mid X = 1700) & = \int_6^\infty f(y\mid x = 1700, \theta, \phi)\, dy\\ & = 1 - \int_{0}^6 f(y\mid x = 1700, \theta, \phi)\, dy \end{align} \]
# Fit a gamma GLM with log link and store it as `gammafit`
library(MASS) # in order to call boxcox
df <- read.csv("gullsdata2019.csv")
X <- df$bm
Y <- df$oxy
gammafit <- glm(Y~X, family = Gamma(link = "log"))
summary(gammafit)
#>
#> Call:
#> glm(formula = Y ~ X, family = Gamma(link = "log"))
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -1.7413 -0.7697 -0.2940 0.2609 2.3873
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.9483048 0.7724320 -1.228 0.222
#> X 0.0024149 0.0004656 5.187 1.02e-06 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for Gamma family taken to be 0.8545051)
#>
#> Null deviance: 96.639 on 108 degrees of freedom
#> Residual deviance: 76.478 on 107 degrees of freedom
#> AIC: 873.71
#>
#> Number of Fisher Scoring iterations: 6
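Before bootstrapping, the plug-in point estimate of \(q\) can be read off the fitted model directly; a minimal sketch (not part of the original output), using the same shape/scale convention as the bootstrap loop below:
# Plug-in estimate of q = P(Y > 6 | X = 1700) from the fitted gamma GLM
shapehat <- 1/summary(gammafit)$dispersion                   # estimated shape = 1/dispersion
muhat <- exp(gammafit$coef[1] + gammafit$coef[2] * 1700)     # fitted mean at body mass 1700
qhat <- 1 - pgamma(6, shape = shapehat, scale = muhat/shapehat)
qhat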
# Parametric bootstrap for qhat
B <- 500
b <- 0
qhatstarBs <- c()
repeat{
  b <- b + 1
  Ystar <- simulate(gammafit)$sim_1                          # simulate responses from the fitted model
  gammafitstar <- glm(Ystar ~ X, family = Gamma(link = "log"))
  shapestar <- 1/summary(gammafitstar)$dispersion            # bootstrap shape estimate
  mustar <- exp(gammafitstar$coef[1] + gammafitstar$coef[2]*1700)
  scalestar <- mustar/shapestar
  qhatstarBs <- c(qhatstarBs, 1 - pgamma(6, shape = shapestar, scale = scalestar))
  if(b == B){break}
}
# Percentile Bootstrap confidence interval:
c(quantile(qhatstarBs, prob = 0.025), quantile(qhatstarBs, prob = 0.975))
#> 2.5% 97.5%
#> 0.7956354 0.9229331
First, consider an additive error model of the form \[ Y_{i}=\mu\left(x_{i}, \boldsymbol{\beta}\right)+\sigma \mu\left(x_{i}, \boldsymbol{\beta}\right)^{\theta} \epsilon_{i} \] where \(\epsilon_{i} \stackrel{\text { iid }}{\sim}(0,1)\) for \(i=1, \ldots, n\), \(\boldsymbol{\beta}=\left(\beta_{0}, \beta_{1}\right)\), and the mean function is \[\mu\left(x_{i}, \boldsymbol{\beta}\right)=\frac{1}{\beta_{0}+\beta_{1} x_{i}}.\]
In this exercise, \(\theta =1\).
# read data
df <- read.csv("salmon.csv")
s <- df$spawners; X<- 1/s
R <- df$recruits; Y<- R
res.ols <- lm((1/Y)~X); res.ols$coefficients
#> (Intercept) X
#> 0.0005278397 0.2080778421
alphainit <-summary(res.ols)$coeff[1]
betainit <- summary(res.ols)$coeff[2]
The estimate of the intercept is \(5.28 \times 10^{-4}\) and the estimate of the slope is 0.2081; these serve as starting values, since the model implies \(1/\mu(x_i, \boldsymbol\beta) = \beta_0 + \beta_1 x_i\) with \(x_i = 1/s_i\).
# Mean function g1, variance function g2, derivative of g1 with respect to beta,
# and delta-method variance of the fitted mean
g1fun <- function(par, X){
  1/(par[1] + par[2]*X)
}
g2fun <- function(par, X, theta){
  g1fun(par, X)^theta
}
derfun <- function(par, X){
  cbind(-1/((par[1] + par[2]*X)^2), -X/((par[1] + par[2]*X)^2))
}
vhatmufun <- function(par, derfun, X, vcov){
  diag(derfun(par, X) %*% vcov %*% t(derfun(par, X)))
}
glsoutput <- glsiter(c(alphainit, betainit), g1fun, g2fun, derfun, theta = 1, Y, X, 3)
sehatmu <- sqrt(vhatmufun(glsoutput$par, derfun, X, glsoutput$Vhat))
plot(X, Y)
lines(X[order(X)], g1fun(glsoutput$par, X)[order(X)])
lines(X[order(X)], g1fun(glsoutput$par, X)[order(X)] + 1.96*sehatmu[order(X)], lty = 2)
lines(X[order(X)], g1fun(glsoutput$par, X)[order(X)] - 1.96*sehatmu[order(X)], lty = 2)
glsoutput
#> $par
#> [,1]
#> [1,] 0.0003174447
#> [2,] 0.2422520821
#>
#> $sig2hat
#> [1] 0.1548461
#>
#> $Vhat
#> [,1] [,2]
#> [1,] 1.563763e-08 -7.428111e-06
#> [2,] -7.428111e-06 4.448430e-03
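The dashed curves in the plot above are pointwise 95% intervals for the fitted mean; `vhatmufun` computes their variances by the delta method, \[\widehat{Var}\left\{g_1(x_i,\hat\beta)\right\} \approx \nabla_\beta g_1(x_i,\hat\beta)^\top \, \widehat V \, \nabla_\beta g_1(x_i,\hat\beta),\] i.e., the \(i\)-th diagonal element of \(D\widehat V D^\top\), where \(D\) is the derivative matrix returned by `derfun`.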
The parameter \(h(\beta_0) = 1/\beta_0\) can be interpreted as the maximum carrying capacity of the fishery. Applying the delta method, a 95% confidence interval for \(h(\beta_0)\) is \[\begin{align} &h(\hat\beta_0) \pm 1.96 \sqrt{\widehat{Var}(\hat\beta_0) \left(\left.\frac{\partial h(\beta_0)}{\partial \beta_0} \right|_{\beta_0 =\hat\beta_0}\right)^2}\\ & =h(\hat\beta_0) \pm 1.96 \sqrt{\widehat{Var}(\hat\beta_0) \left(\frac{-1}{\hat\beta_0^2}\right)^2}\\ & = \frac{1}{ 0.0003174447} \pm 1.96 \times \sqrt{1.563763\times 10^{-8} \times \frac{1}{0.0003174447^4}}\\ & =( 717.9209,\ 5582.389) \end{align}\]
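The same interval can be computed directly from the GLS output; a short sketch reproducing the hand calculation above:
# Delta-method 95% CI for h(beta0) = 1/beta0 using the GLS estimates
b0hat <- glsoutput$par[1]
v0hat <- glsoutput$Vhat[1, 1]
se.h <- sqrt(v0hat)/b0hat^2                      # |dh/dbeta0| = 1/beta0^2 at beta0-hat
c(lower = 1/b0hat - 1.96*se.h, upper = 1/b0hat + 1.96*se.h)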
Fit a generalized linear model to these data using a gamma random component and an inverse link function. Compare the estimate of β obtained from the gamma GLM to the GLS estimate of β obtained in (b) above.
# read data
df <- read.csv("salmon.csv")
s <- df$spawners; X<- 1/s
R <- df$recruits; Y<- R
gammafit <- glm(Y~X, family = Gamma(link = "inverse"))
summary(gammafit)$coef
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.0003174622 0.0001250536 2.538610 0.017732154
#> X 0.2422474710 0.0666976881 3.632022 0.001266152
State one reason that GLM might be preferred for this study. State one reason that an additive error model might be preferred for this study. Which of the two models would you recommend for this study? (This question has multiple correct answers.)
Compare the variances of the estimators from the two methods:
# Additive Error Model
sd.1 <- sqrt(diag(glsoutput$Vhat))
mu.1 <- as.vector(glsoutput$par)
#GLM
sd.2 <- summary(gammafit)$coeff[,2]
mu.2 <- as.vector(summary(gammafit)$coeff[,1])
rbind(mu.1, mu.2)
#> [,1] [,2]
#> mu.1 0.0003174447 0.2422521
#> mu.2 0.0003174622 0.2422475
rbind(sd.1, sd.2)
#> (Intercept) X
#> sd.1 0.0001250505 0.06669655
#> sd.2 0.0001250536 0.06669769
Note that the point estimates and standard errors of the two estimators are very close. Tracing back through the estimation process, the additive error model requires the derivatives of \(g_1\) with respect to \(\beta\), which can become complicated when \(g_1\) is not of a simple form. In contrast, the GLM restricts the distribution to the exponential family, which is much easier to apply because of the nice properties of that family. However, when the forms of \(g_1\) and \(g_2\) are well justified theoretically, the additive error model may be preferred.