Course: MSc Fisheries — Gear Selectivity
Goal: Fit a logistic selectivity curve using codend–cover data (binomial GLM; ML)

Learning outcomes

By the end of this practical, students should be able to:

  1. Explain the codend–cover experimental design and the data structure.
  2. Fit a logistic selectivity curve using a binomial GLM (maximum likelihood).
  3. Derive and interpret L25, L50, L75 and SR = L75 − L25.
  4. Critically assess experimental design (sampling around the selection range, extrapolation risks).
  5. Compare selectivity between two gears (e.g., different mesh sizes).

Packages

library(ggplot2)

1. Data

In a codend–cover experiment, a small-meshed cover is mounted around the codend so that every fish entering the codend is accounted for. Fish that enter the codend are either:

    • retained in the codend, or
    • able to escape through the codend meshes, in which case they are caught in the cover.

For each length class l we therefore record codend_l (the number retained in the codend) and cover_l (the number caught in the cover); the observed proportion codend_l / (codend_l + cover_l) estimates the retention probability at length l.

# Codend–cover experimental data (example values)
dat <- data.frame(
  length = c(19.5,20.5,21.5,22.5,23.5,24.5,
             25.5,26.5,27.5,28.5,29.5,30.5),
  codend = c(1,1,2,10,13,10,14,7,1,1,2,1),
  cover  = c(4,25,69,151,190,192,146,102,62,24,4,8)
)

dat$total   <- dat$codend + dat$cover
dat$sel_obs <- dat$codend / dat$total

print(dat)
##    length codend cover total    sel_obs
## 1    19.5      1     4     5 0.20000000
## 2    20.5      1    25    26 0.03846154
## 3    21.5      2    69    71 0.02816901
## 4    22.5     10   151   161 0.06211180
## 5    23.5     13   190   203 0.06403941
## 6    24.5     10   192   202 0.04950495
## 7    25.5     14   146   160 0.08750000
## 8    26.5      7   102   109 0.06422018
## 9    27.5      1    62    63 0.01587302
## 10   28.5      1    24    25 0.04000000
## 11   29.5      2     4     6 0.33333333
## 12   30.5      1     8     9 0.11111111

Plot observed selectivity (raw proportions)

ggplot(dat, aes(x = length, y = sel_obs)) +
  geom_point(size = 2) +
  labs(
    x = "Length (cm)",
    y = "Observed selectivity (proportion in codend)",
    title = "Codend–cover experiment: observed selectivity"
  ) +
  theme_minimal(base_size = 13)

2. Conceptual logistic selectivity curve (example curve, not fitted to data)

Before fitting a model, we visualize a typical logistic curve:

\[ p(L) = \frac{1}{1 + \exp(-(a + bL))} \]

More interpretable parameters are:

    • L50, the length at which the retention probability equals 0.5, and
    • SR = L75 − L25, the selection range, which describes how steep the curve is.

For a logistic curve these relate to the intercept a and slope b as follows:

\[ SR = \frac{2\ln(3)}{b} \quad \Rightarrow \quad b = \frac{2\ln(3)}{SR} \]
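To see where this comes from: at p = 0.5 the logit is zero, and \(\text{logit}(0.75) = \ln(3)\) while \(\text{logit}(0.25) = -\ln(3)\), so

\[ L_{50} = -\frac{a}{b}, \qquad L_{75} - L_{25} = \frac{\text{logit}(0.75) - \text{logit}(0.25)}{b} = \frac{2\ln(3)}{b} \]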

L50_concept <- 30
SR_concept  <- 6

b_concept <- 2 * log(3) / SR_concept
a_concept <- -b_concept * L50_concept

x_concept <- seq(L50_concept - 4*SR_concept,
                 L50_concept + 4*SR_concept,
                 length.out = 200)

p_concept <- 1 / (1 + exp(-(a_concept + b_concept * x_concept)))
concept_df <- data.frame(length = x_concept, p = p_concept)

ggplot(concept_df, aes(x = length, y = p)) +
  geom_line(linewidth = 1.2) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +
  geom_vline(xintercept = L50_concept, linetype = "dotted") +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    title = "Typical logistic selectivity curve (general example)"
  ) +
  theme_minimal(base_size = 13)

3. Fit a logistic (binomial) GLM to the observed data by maximum likelihood

Data model:

\[ \text{codend}_l \sim \text{Binomial}(\text{total}_l, p_l) \]

Link function:

\[ \text{logit}(p_l) = \ln\left(\frac{p_l}{1-p_l}\right) = a + b\,L_l \]

fit_glm <- glm(
  cbind(codend, cover) ~ length,
  data   = dat,
  family = binomial(link = "logit")
)

summary(fit_glm)
## 
## Call:
## glm(formula = cbind(codend, cover) ~ length, family = binomial(link = "logit"), 
##     data = dat)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -3.79258    1.60242  -2.367   0.0179 *
## length       0.04297    0.06509   0.660   0.5091  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 13.019  on 11  degrees of freedom
## Residual deviance: 12.587  on 10  degrees of freedom
## AIC: 51.656
## 
## Number of Fisher Scoring iterations: 4
a_glm <- coef(fit_glm)[1]; a_glm
## (Intercept) 
##   -3.792585
b_glm <- coef(fit_glm)[2]; b_glm
##     length 
## 0.04297322

4. Derive L25, L50, L75 and SR

From \(\text{logit}(p) = a + bL\), the length at which the retention probability equals \(p\) is \(L_p = (\text{logit}(p) - a)/b\); in particular, \(L_{50} = -a/b\):

L50_glm <- -a_glm / b_glm

logit25 <- log(0.25 / 0.75)
logit75 <- log(0.75 / 0.25)

L25_glm <- (logit25 - a_glm) / b_glm
L75_glm <- (logit75 - a_glm) / b_glm
SR_glm  <- L75_glm - L25_glm

cat("\nLogistic GLM (binomial, maximum likelihood):\n")
## 
## Logistic GLM (binomial, maximum likelihood):
cat("  L25 =", round(L25_glm, 2), "cm\n")
##   L25 = 62.69 cm
cat("  L50 =", round(L50_glm, 2), "cm\n")
##   L50 = 88.25 cm
cat("  L75 =", round(L75_glm, 2), "cm\n")
##   L75 = 113.82 cm
cat("  SR  =", round(SR_glm,  2), "cm\n\n")
##   SR  = 51.13 cm
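Since the exercise in Section 6 repeats exactly these steps for a second gear, it can be convenient to wrap them in a small helper function. This is a sketch; the name sel_params is introduced here and is not part of the original practical:

sel_params <- function(fit) {
  # Extract intercept (a) and slope (b) from a fitted binomial GLM
  a <- unname(coef(fit)[1])
  b <- unname(coef(fit)[2])
  c(L25 = (log(0.25 / 0.75) - a) / b,
    L50 = -a / b,
    L75 = (log(0.75 / 0.25) - a) / b,
    SR  = 2 * log(3) / b)
}

sel_params(fit_glm)  # should reproduce the values printed above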

5. Plot the fitted curve vs observed points

x_grid <- seq(min(dat$length), max(dat$length), length.out = 200)
p_grid <- 1 / (1 + exp(-(a_glm + b_glm * x_grid)))

pred_df <- data.frame(length = x_grid, sel = p_grid)

ggplot() +
  geom_point(data = dat, aes(x = length, y = sel_obs), size = 2) +
  geom_line(data = pred_df, aes(x = length, y = sel), linewidth = 1.1) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +
  geom_vline(xintercept = L50_glm, linetype = "dotted") +
  annotate("text", x = L50_glm, y = 0.05,
           label = paste0("L50 = ", round(L50_glm, 1), " cm"),
           angle = 90, vjust = -0.5, hjust = 0) +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    title = "Codend selectivity – logistic GLM fit"
  ) +
  theme_minimal(base_size = 13)
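
The manual inverse-logit used above is equivalent to calling predict() on the fitted model with type = "response"; a quick check:

# Fitted probabilities via predict(); should match p_grid up to numerical error
p_check <- predict(fit_glm, newdata = data.frame(length = x_grid), type = "response")
max(abs(unname(p_check) - p_grid))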

Questions for class discussion

  1. Can you explain the plot and the fitted curve? What seems to be the problem here?
  2. Discuss:
    • experimental design and sampling across the selection range
    • consequences of extrapolation
    • what additional data would improve the fit?
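
One way to make the extrapolation issue concrete is to extend the prediction grid far beyond the sampled lengths and mark the observed length range. This is an illustrative sketch, not part of the original practical; the extended range of 15–120 cm is arbitrary:

x_ext  <- seq(15, 120, length.out = 400)   # far beyond the sampled 19.5–30.5 cm
p_ext  <- 1 / (1 + exp(-(a_glm + b_glm * x_ext)))
ext_df <- data.frame(length = x_ext, sel = p_ext)

ggplot(ext_df, aes(x = length, y = sel)) +
  geom_line(linewidth = 1) +
  # Shaded band: the length range actually covered by the experiment
  annotate("rect", xmin = min(dat$length), xmax = max(dat$length),
           ymin = 0, ymax = 1, alpha = 0.15) +
  geom_vline(xintercept = L50_glm, linetype = "dotted") +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    title = "Fitted curve extrapolated beyond the sampled length range"
  ) +
  theme_minimal(base_size = 13)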

6. Exercise: repeat with a smaller mesh size

Expectation: with a smaller mesh, fewer small fish escape through the meshes, so retention at small sizes increases and L50 should, in principle, be lower.

dat2 <- data.frame(
  length = c(19.5,20.5,21.5,22.5,23.5,24.5,
             25.5,26.5,27.5,28.5,29.5,30.5),
  codend = c( 4,  3,  9, 14, 22, 30,
              44, 55, 63, 58, 41, 30),
  cover  = c(76, 72, 66, 64, 58, 50,
             46, 35, 27, 22, 17, 10)
)
print(dat2)
##    length codend cover
## 1    19.5      4    76
## 2    20.5      3    72
## 3    21.5      9    66
## 4    22.5     14    64
## 5    23.5     22    58
## 6    24.5     30    50
## 7    25.5     44    46
## 8    26.5     55    35
## 9    27.5     63    27
## 10   28.5     58    22
## 11   29.5     41    17
## 12   30.5     30    10

For reference, fitting the same binomial GLM to this second dataset gives approximately L50 = 26.04 cm and SR = 5.38 cm.
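
One possible way to reproduce these reference values, mirroring the fit in Section 3 (a sketch; the object names fit_glm2, a2 and b2 are introduced here):

fit_glm2 <- glm(
  cbind(codend, cover) ~ length,
  data   = dat2,
  family = binomial(link = "logit")
)

a2 <- coef(fit_glm2)[1]
b2 <- coef(fit_glm2)[2]

L50_2 <- -a2 / b2         # should be close to 26.04 cm
SR_2  <- 2 * log(3) / b2  # should be close to 5.38 cm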

Tasks (a possible solution sketch follows the list):

  1. Make predictions for the smaller mesh over its observed length range.
  2. Plot the fitted curve for the smaller mesh together with the observed proportions.
  3. Overlay the fitted curves for both meshes in one plot and compare L50 and SR.
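
A sketch for these tasks, assuming fit_glm2, a2 and b2 from above and x_grid, p_grid from Section 5:

# Predictions for the smaller mesh over its observed length range
x_grid2 <- seq(min(dat2$length), max(dat2$length), length.out = 200)
p_grid2 <- 1 / (1 + exp(-(a2 + b2 * x_grid2)))

# Combine both fitted curves for an overlay plot
overlay_df <- rbind(
  data.frame(length = x_grid,  sel = p_grid,  gear = "Original mesh"),
  data.frame(length = x_grid2, sel = p_grid2, gear = "Smaller mesh")
)

ggplot(overlay_df, aes(x = length, y = sel, colour = gear)) +
  geom_line(linewidth = 1.1) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    x = "Length (cm)",
    y = "Retention probability",
    colour = "Gear",
    title = "Selectivity curves: original vs smaller mesh"
  ) +
  theme_minimal(base_size = 13)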

7. Wrap-up discussion points


cat("\nWELL DONE! You have now fitted and understand your first size selectivity model! \n")
## 
## WELL DONE! You have now fitted and understand your first size selectivity model!