Course: MSc Fisheries — Gear Selectivity
Goal: Fit a logistic selectivity curve using codend–cover data (binomial GLM; ML)

Learning outcomes

By the end of this practical, students should be able to:

  1. Explain the codend–cover experimental design and the data structure.
  2. Fit a logistic selectivity curve using a binomial GLM (maximum likelihood).
  3. Derive and interpret L25, L50, L75 and SR = L75 − L25.
  4. Critically assess experimental design (sampling around the selection range, extrapolation risks).
  5. Compare selectivity between two gears (e.g., different mesh sizes).

Packages

library(ggplot2)

1. Data

In a codend–cover experiment, a small-meshed cover is mounted around the codend so that every fish entering the codend is accounted for. Fish that enter the codend are either:

    • retained in the codend, or
    • able to escape through the codend meshes, in which case they are caught in the cover.

For each length class l we therefore record codend_l (the number retained in the codend) and cover_l (the number caught in the cover); the observed proportion codend_l / (codend_l + cover_l) estimates the retention probability at length l.

# Codend–cover experimental data (example values)
dat <- data.frame(
  length = c(19.5,20.5,21.5,22.5,23.5,24.5,
             25.5,26.5,27.5,28.5,29.5,30.5),
  codend = c(1,1,2,10,13,10,14,7,1,1,2,1),
  cover  = c(4,25,69,151,190,192,146,102,62,24,4,8)
)

dat$total   <- dat$codend + dat$cover
dat$sel_obs <- dat$codend / dat$total

print(dat)
##    length codend cover total    sel_obs
## 1    19.5      1     4     5 0.20000000
## 2    20.5      1    25    26 0.03846154
## 3    21.5      2    69    71 0.02816901
## 4    22.5     10   151   161 0.06211180
## 5    23.5     13   190   203 0.06403941
## 6    24.5     10   192   202 0.04950495
## 7    25.5     14   146   160 0.08750000
## 8    26.5      7   102   109 0.06422018
## 9    27.5      1    62    63 0.01587302
## 10   28.5      1    24    25 0.04000000
## 11   29.5      2     4     6 0.33333333
## 12   30.5      1     8     9 0.11111111

Plot observed selectivity (raw proportions)

ggplot(dat, aes(x = length, y = sel_obs)) +
  geom_point(size = 2) +
  labs(
    x = "Length (cm)",
    y = "Observed selectivity (proportion in codend)",
    title = "Codend–cover experiment: observed selectivity"
  ) +
  theme_minimal(base_size = 13)

2. Conceptual logistic selectivity curve (example curve, not fitted to data)

Before fitting a model, we visualize a typical logistic curve:

\[ p(L) = \frac{1}{1 + \exp(-(a + bL))} \]

More interpretable parameters are:

    • L50, the length at which the retention probability equals 0.5, and
    • SR = L75 − L25, the selection range, which describes how steep the curve is.

For a logistic curve these relate to the intercept a and slope b as follows:

\[ SR = \frac{2\ln(3)}{b} \quad \Rightarrow \quad b = \frac{2\ln(3)}{SR} \]
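To see where this comes from: at p = 0.5 the logit is zero, and \(\text{logit}(0.75) = \ln(3)\) while \(\text{logit}(0.25) = -\ln(3)\), so

\[ L_{50} = -\frac{a}{b}, \qquad L_{75} - L_{25} = \frac{\text{logit}(0.75) - \text{logit}(0.25)}{b} = \frac{2\ln(3)}{b} \]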

L50_concept <- 30
SR_concept  <- 6

b_concept <- 2 * log(3) / SR_concept
a_concept <- -b_concept * L50_concept

x_concept <- seq(L50_concept - 4*SR_concept,
                 L50_concept + 4*SR_concept,
                 length.out = 200)

p_concept <- 1 / (1 + exp(-(a_concept + b_concept * x_concept)))
concept_df <- data.frame(length = x_concept, p = p_concept)

ggplot(concept_df, aes(x = length, y = p)) +
  geom_line(linewidth = 1.2) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +
  geom_vline(xintercept = L50_concept, linetype = "dotted") +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    title = "Typical logistic selectivity curve (general example)"
  ) +
  theme_minimal(base_size = 13)

3. Fit a logistic (binomial) GLM to the observed data by maximum likelihood

Data model:

\[ \text{codend}_l \sim \text{Binomial}(\text{total}_l, p_l) \]

Link function:

\[ \text{logit}(p_l) = \ln\left(\frac{p_l}{1-p_l}\right) = a + b\,L_l \]

fit_glm <- glm(
  cbind(codend, cover) ~ length,
  data   = dat,
  family = binomial(link = "logit")
)

summary(fit_glm)
## 
## Call:
## glm(formula = cbind(codend, cover) ~ length, family = binomial(link = "logit"), 
##     data = dat)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -3.79258    1.60242  -2.367   0.0179 *
## length       0.04297    0.06509   0.660   0.5091  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 13.019  on 11  degrees of freedom
## Residual deviance: 12.587  on 10  degrees of freedom
## AIC: 51.656
## 
## Number of Fisher Scoring iterations: 4
a_glm <- coef(fit_glm)[1]; a_glm
## (Intercept) 
##   -3.792585
b_glm <- coef(fit_glm)[2]; b_glm
##     length 
## 0.04297322

4. Derive L25, L50, L75 and SR

From \(\text{logit}(p) = a + bL\), the length at which the retention probability equals \(p\) is \(L_p = (\text{logit}(p) - a)/b\); in particular, \(L_{50} = -a/b\):

L50_glm <- -a_glm / b_glm

logit25 <- log(0.25 / 0.75)
logit75 <- log(0.75 / 0.25)

L25_glm <- (logit25 - a_glm) / b_glm
L75_glm <- (logit75 - a_glm) / b_glm
SR_glm  <- L75_glm - L25_glm

cat("\nLogistic GLM (binomial, maximum likelihood):\n")
## 
## Logistic GLM (binomial, maximum likelihood):
cat("  L25 =", round(L25_glm, 2), "cm\n")
##   L25 = 62.69 cm
cat("  L50 =", round(L50_glm, 2), "cm\n")
##   L50 = 88.25 cm
cat("  L75 =", round(L75_glm, 2), "cm\n")
##   L75 = 113.82 cm
cat("  SR  =", round(SR_glm,  2), "cm\n\n")
##   SR  = 51.13 cm
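Since the exercise in Section 6 repeats exactly these steps for a second gear, it can be convenient to wrap them in a small helper function. This is a sketch; the name sel_params is introduced here and is not part of the original practical:

sel_params <- function(fit) {
  # Extract intercept (a) and slope (b) from a fitted binomial GLM
  a <- unname(coef(fit)[1])
  b <- unname(coef(fit)[2])
  c(L25 = (log(0.25 / 0.75) - a) / b,
    L50 = -a / b,
    L75 = (log(0.75 / 0.25) - a) / b,
    SR  = 2 * log(3) / b)
}

sel_params(fit_glm)  # should reproduce the values printed above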

5. Plot the fitted curve vs observed points

x_grid <- seq(min(dat$length), max(dat$length), length.out = 200)
p_grid <- 1 / (1 + exp(-(a_glm + b_glm * x_grid)))

pred_df <- data.frame(length = x_grid, sel = p_grid)

ggplot() +
  geom_point(data = dat, aes(x = length, y = sel_obs), size = 2) +
  geom_line(data = pred_df, aes(x = length, y = sel), linewidth = 1.1) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +
  geom_vline(xintercept = L50_glm, linetype = "dotted") +
  annotate("text", x = L50_glm, y = 0.05,
           label = paste0("L50 = ", round(L50_glm, 1), " cm"),
           angle = 90, vjust = -0.5, hjust = 0) +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    title = "Codend selectivity – logistic GLM fit"
  ) +
  theme_minimal(base_size = 13)
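
The manual inverse-logit used above is equivalent to calling predict() on the fitted model with type = "response"; a quick check:

# Fitted probabilities via predict(); should match p_grid up to numerical error
p_check <- predict(fit_glm, newdata = data.frame(length = x_grid), type = "response")
max(abs(unname(p_check) - p_grid))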

Questions for class discussion

  1. Can you explain the plot and the fitted curve? What seems to be the problem here?
  2. Discuss:
    • experimental design and sampling across the selection range
    • consequences of extrapolation
    • what additional data would improve the fit?
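
One way to make the extrapolation issue concrete is to extend the prediction grid far beyond the sampled lengths and mark the observed length range. This is an illustrative sketch, not part of the original practical; the extended range of 15–120 cm is arbitrary:

x_ext  <- seq(15, 120, length.out = 400)   # far beyond the sampled 19.5–30.5 cm
p_ext  <- 1 / (1 + exp(-(a_glm + b_glm * x_ext)))
ext_df <- data.frame(length = x_ext, sel = p_ext)

ggplot(ext_df, aes(x = length, y = sel)) +
  geom_line(linewidth = 1) +
  # Shaded band: the length range actually covered by the experiment
  annotate("rect", xmin = min(dat$length), xmax = max(dat$length),
           ymin = 0, ymax = 1, alpha = 0.15) +
  geom_vline(xintercept = L50_glm, linetype = "dotted") +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    title = "Fitted curve extrapolated beyond the sampled length range"
  ) +
  theme_minimal(base_size = 13)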

6. Exercise: repeat with a smaller mesh size

Expectation: with a smaller mesh, fewer small fish escape through the meshes, so retention at small sizes increases and L50 should, in principle, be lower.

dat2 <- data.frame(
  length = c(19.5,20.5,21.5,22.5,23.5,24.5,
             25.5,26.5,27.5,28.5,29.5,30.5),
  codend = c( 4,  3,  9, 14, 22, 30,
              44, 55, 63, 58, 41, 30),
  cover  = c(76, 72, 66, 64, 58, 50,
             46, 35, 27, 22, 17, 10)
)
print(dat2)
##    length codend cover
## 1    19.5      4    76
## 2    20.5      3    72
## 3    21.5      9    66
## 4    22.5     14    64
## 5    23.5     22    58
## 6    24.5     30    50
## 7    25.5     44    46
## 8    26.5     55    35
## 9    27.5     63    27
## 10   28.5     58    22
## 11   29.5     41    17
## 12   30.5     30    10

For reference, fitting the same binomial GLM to this second dataset gives approximately L50 = 26.04 cm and SR = 5.38 cm.
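
One possible way to reproduce these reference values, mirroring the fit in Section 3 (a sketch; the object names fit_glm2, a2 and b2 are introduced here):

fit_glm2 <- glm(
  cbind(codend, cover) ~ length,
  data   = dat2,
  family = binomial(link = "logit")
)

a2 <- coef(fit_glm2)[1]
b2 <- coef(fit_glm2)[2]

L50_2 <- -a2 / b2         # should be close to 26.04 cm
SR_2  <- 2 * log(3) / b2  # should be close to 5.38 cm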

Tasks (a possible solution sketch follows the list):

  1. Make predictions for the smaller mesh over its observed length range.
  2. Plot the fitted curve for the smaller mesh together with the observed proportions.
  3. Overlay the fitted curves for both meshes in one plot and compare L50 and SR.
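
A sketch for these tasks, assuming fit_glm2, a2 and b2 from above and x_grid, p_grid from Section 5:

# Predictions for the smaller mesh over its observed length range
x_grid2 <- seq(min(dat2$length), max(dat2$length), length.out = 200)
p_grid2 <- 1 / (1 + exp(-(a2 + b2 * x_grid2)))

# Combine both fitted curves for an overlay plot
overlay_df <- rbind(
  data.frame(length = x_grid,  sel = p_grid,  gear = "Original mesh"),
  data.frame(length = x_grid2, sel = p_grid2, gear = "Smaller mesh")
)

ggplot(overlay_df, aes(x = length, y = sel, colour = gear)) +
  geom_line(linewidth = 1.1) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    colour = "Gear",
    title = "Selectivity curves: original vs smaller mesh"
  ) +
  theme_minimal(base_size = 13)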

7. Wrap-up discussion points


cat("\nWELL DONE! You have now fitted and understand your first size selectivity model! \n")
## 
## WELL DONE! You have now fitted and understand your first size selectivity model!