Introduction

Learning Objectives

By the end of this practical, you will be able to:

  1. Process and visualize catch-at-length data from two different mesh sizes
  2. Apply the Holt (1963) catch-ratio method to estimate the cross-over length
  3. Understand and implement the SELECT framework (Millar & Fryer, 1999) using maximum likelihood
  4. Fit and compare different selectivity curve models (normal vs. logistic)
  5. Interpret selectivity parameters and their implications for fisheries management

Background: Why Selectivity Matters

Fishing gears do not capture all sizes of fish equally. Selectivity describes the probability that a fish of a given size is retained by the gear once it encounters it. Understanding selectivity is crucial for:

  • Setting appropriate mesh size regulations
  • Protecting juvenile fish below maturity
  • Estimating fishing mortality by size class
  • Designing sustainable fishing practices

The Two-Mesh Experiment

A classic experimental design compares catches between two mesh sizes fished simultaneously (or alternating) in the same area. This controls for differences in fish availability, allowing us to isolate the effect of mesh size on selectivity.

Key assumption: The only difference between the catches is due to selectivity and relative fishing power—the fish population available to both gears is identical.


Part 1: Setup and Data Preparation

Load Required Packages

First, let’s load the packages we’ll need. If you haven’t installed them, uncomment the install.packages() line.

# Install packages if needed (uncomment the line below if first time)
# install.packages(c("ggplot2", "dplyr", "tidyr"))

library(ggplot2)
library(dplyr)
library(tidyr)

# Set a clean theme for all plots
theme_set(theme_bw(base_size = 12))

Load the Data

We have length measurements from fish caught with two different mesh sizes. Load your data file and define the mesh sizes used in the experiment.

# Load the catch data
# IMPORTANT: Update the path below to match your file location
# setwd("YOUR/WORKING/DIRECTORY")
df <- read.csv("TP3_TwoMesh_LengthMeasurements.csv")

# Define mesh sizes (in mm)
mesh1 <- 80   # Smaller mesh
mesh2 <- 100  # Larger mesh

# Quick look at the data structure
str(df)
## 'data.frame':    880 obs. of  3 variables:
##  $ fish_id  : chr  "M80_0001" "M80_0002" "M80_0003" "M80_0004" ...
##  $ mesh     : int  80 80 80 80 80 80 80 80 80 80 ...
##  $ length_cm: num  84.9 87.1 82.1 97.5 87.9 75.7 86.9 95.7 80.5 83.1 ...
head(df)

Task: Examine the data frame. What variables does it contain? How many fish were measured in total?
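
A quick way to check (a minimal sketch; the column names are those shown by str(df) above):

nrow(df)             # total number of fish measured
table(df$mesh)       # number of fish measured per mesh size
range(df$length_cm)  # smallest and largest fish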

Process Data: Create Catch-at-Length Table

Raw length measurements need to be binned into length classes and counted. This creates the catch-at-length matrix that forms the basis of our analysis.

# Bin width (cm) - this groups fish into size classes
# Adjust based on your data resolution (2 cm is common)
bin_width <- 2

# Create length bins (midpoint of each bin)
df_bins <- df %>%
  mutate(length_bin = floor(length_cm / bin_width) * bin_width + bin_width/2)

# Count catches per length-bin and mesh
dat_long <- df_bins %>%
  group_by(length = length_bin, mesh) %>%
  summarise(catch = n(), .groups = "drop")

# Convert to wide format: one column per mesh
dat <- dat_long %>%
  pivot_wider(names_from = mesh, values_from = catch, values_fill = 0) %>%
  rename(C1 = !!as.character(mesh1), C2 = !!as.character(mesh2)) %>%
  arrange(length) %>%
  filter(C1 + C2 > 0)  # Keep only length classes with catches

# View the catch-at-length table
print(dat, n = 25)
## # A tibble: 32 × 3
##    length    C1    C2
##     <dbl> <int> <int>
##  1     65     1     0
##  2     67     6     0
##  3     69    10     0
##  4     71     4     0
##  5     73    12     1
##  6     75    20     0
##  7     77    27     2
##  8     79    29     3
##  9     81    23     3
## 10     83    42     5
## 11     85    38     8
## 12     87    45     9
## 13     89    46    17
## 14     91    42    15
## 15     93    27    22
## 16     95    19    33
## 17     97    24    32
## 18     99    13    28
## 19    101     8    33
## 20    103     5    31
## 21    105     6    27
## 22    107     2    34
## 23    109     1    25
## 24    111     0    20
## 25    113     0    24
## # ℹ 7 more rows

Understanding the output: C1 is the count caught by the smaller mesh (80mm) and C2 is the count caught by the larger mesh (100mm). Why do you think we removed rows where both catches are zero?


Part 2: Visualizing the Data

Before fitting any models, always visualize your data. This helps identify patterns and potential issues.

Catch Distributions by Mesh Size

# Prepare data for plotting
plot_df <- bind_rows(
  data.frame(length = dat$length, mesh = factor(mesh1), catch = dat$C1),
  data.frame(length = dat$length, mesh = factor(mesh2), catch = dat$C2)
)

# Plot catch-at-length distributions
p1 <- ggplot(plot_df, aes(x = length, y = catch, color = mesh, fill = mesh)) +
  geom_area(alpha = 0.3, position = "identity") +
  geom_line(linewidth = 1) +
  scale_color_manual(values = c("#065A82", "#00A896"), name = "Mesh (mm)") +
  scale_fill_manual(values = c("#065A82", "#00A896"), name = "Mesh (mm)") +
  labs(
    title = "Catch-at-Length by Mesh Size",
    x = "Length (cm)",
    y = "Number caught"
  )

print(p1)

Discussion Question: Why is the 100mm mesh distribution shifted to the right compared to the 80mm mesh? What does this tell us about selectivity?


Part 3: Holt (1963) Catch-Ratio Method

The Theory

The Holt method is a classic approach based on the ratio of catches between two mesh sizes. The key insight is:

\[\ln\left(\frac{C_1}{C_2}\right) = a + b \cdot L\]

Where:

  • \(C_1\) = catch in mesh 1 at length \(L\)
  • \(C_2\) = catch in mesh 2 at length \(L\)
  • \(a\) = intercept
  • \(b\) = slope

This relationship is linear in length—simple but powerful!
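
Where does this linearity come from? A short derivation, under Holt's original assumptions: each mesh \(m_i\) has a normal selectivity curve with mode \(k \cdot m_i\) and a common spread \(\sigma\), and \(P_1/P_2\) is the relative fishing power (these symbols reappear in Parts 3 and 4). Then

\[\ln\left(\frac{C_1}{C_2}\right) = \ln\left(\frac{P_1}{P_2}\right) + \frac{(L - k\,m_2)^2 - (L - k\,m_1)^2}{2\sigma^2}\]

and the quadratic terms in \(L\) cancel, leaving an expression of the form \(a + b \cdot L\).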

The cross-over length (\(L^*\)) is where both meshes catch equally (\(C_1 = C_2\)), meaning \(\ln(C_1/C_2) = 0\):

\[L^* = -\frac{a}{b}\]

Fitting the Holt Model

We use a weighted linear regression, giving more weight to length classes with more fish, because their catch ratios are estimated more reliably.

# Prepare data for Holt analysis
holt_df <- dat %>%
  mutate(total = C1 + C2) %>%
  # Avoid log(0) and require minimum sample size
  filter(C1 > 0, C2 > 0, total >= 5) %>%
  mutate(
    log_ratio = log(C1 / C2),
    weight = total  # More fish = more reliable ratio
  )

# Fit weighted linear regression
holt_fit <- lm(log_ratio ~ length, data = holt_df, weights = weight)

# View regression summary
summary(holt_fit)
## 
## Call:
## lm(formula = log_ratio ~ length, data = holt_df, weights = weight)
## 
## Weighted Residuals:
##    Min     1Q Median     3Q    Max 
## -4.289 -1.976 -0.518  1.384  3.971 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 16.533557   0.914797   18.07 4.53e-12 ***
## length      -0.176153   0.009852  -17.88 5.34e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.423 on 16 degrees of freedom
## Multiple R-squared:  0.9523, Adjusted R-squared:  0.9494 
## F-statistic: 319.7 on 1 and 16 DF,  p-value: 5.345e-12

Extract and Interpret Parameters

# Extract coefficients
a_holt <- coef(holt_fit)[1]
b_holt <- coef(holt_fit)[2]

# Calculate cross-over length
L_star <- -a_holt / b_holt

cat(sprintf("  Intercept (a) = %.4f\n", a_holt))
##   Intercept (a) = 16.5336
cat(sprintf("  Slope (b)     = %.5f\n", b_holt))
##   Slope (b)     = -0.17615
cat(sprintf("\n  Cross-over length L* (where both meshes catch equally) = %.1f cm\n", L_star))
## 
##   Cross-over length L* (where both meshes catch equally) = 93.9 cm

Visualize the Holt Regression

# Generate predictions for plotting
pred_holt <- data.frame(length = seq(min(holt_df$length), max(holt_df$length), 1))
pred_holt$log_ratio <- predict(holt_fit, newdata = pred_holt)

# Plot
p2 <- ggplot(holt_df, aes(x = length, y = log_ratio)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  geom_point(aes(size = weight), alpha = 0.6, color = "#065A82") +
  geom_line(data = pred_holt, color = "#F96167", linewidth = 1.2) +
  geom_vline(xintercept = L_star, linetype = "dotted", color = "#00A896", linewidth = 1) +
  annotate("text", x = L_star + 2, y = max(holt_df$log_ratio) * 0.8,
           label = paste0("L* = ", round(L_star, 1), " cm"),
           hjust = 0, color = "#00A896", fontface = "bold") +
  scale_size_continuous(name = "Sample size", range = c(1, 5)) +
  labs(
    title = "Holt (1963) Catch-Ratio Analysis",
    subtitle = "Weighted linear regression of ln(C1/C2) on length",
    x = "Length (cm)",
    y = expression(ln(C[1]/C[2]))
  )

print(p2)

Interpreting L* Under Geometric Similarity

Important: \(L^*\) is NOT the \(L_{50}\) of either mesh! It’s the length where both meshes catch equally: \(r_1(L^*) \cdot P_1 = r_2(L^*) \cdot P_2\)

Under the geometric similarity assumption (selectivity parameters scale proportionally with mesh size), we can estimate the selectivity coefficient \(k\):

# Under geometric similarity: L50 = k × mesh
# L* is approximately the geometric mean of the two optimal lengths (k*mesh1, k*mesh2)
k_holt <- L_star / sqrt(mesh1 * mesh2)

cat(sprintf("Estimated k from Holt method: %.4f\n\n", k_holt))
## Estimated k from Holt method: 1.0494
cat(sprintf("  L50 for %dmm mesh ≈ %.1f cm\n", mesh1, k_holt * mesh1))
##   L50 for 80mm mesh ≈ 84.0 cm
cat(sprintf("  L50 for %dmm mesh ≈ %.1f cm\n", mesh2, k_holt * mesh2))
##   L50 for 100mm mesh ≈ 104.9 cm

Limitations of the Holt method: While quick and intuitive, this approach assumes a linear log-ratio relationship and lacks a proper statistical error structure. The SELECT method addresses these limitations.


Part 4: SELECT Framework (Maximum Likelihood)

Introduction to SELECT

The SELECT method (Millar & Fryer, 1999) is the modern standard for selectivity estimation. Instead of working with catch ratios, it models the probability of capture directly.

Key point: Given that a fish of length \(L\) was caught by one of the two meshes, what is the probability it was caught by mesh 2?

\[P(\text{mesh 2} | \text{caught}) = \frac{P_2 \cdot r_2(L)}{P_1 \cdot r_1(L) + P_2 \cdot r_2(L)}\]

Where:

  • \(r_i(L)\) = selectivity of mesh \(i\) at length \(L\)
  • \(P_i\) = relative fishing power of mesh \(i\)

Given this probability, the number of fish caught by mesh 2 in each length class (out of all fish of that length caught by either mesh) follows a binomial distribution.

What about mesh 1? Remember that the probabilities are complementary: \(P(\text{mesh 1} | \text{caught}) = 1 - P(\text{mesh 2} | \text{caught})\). The binomial likelihood handles both automatically — when we count y fish caught by mesh 2 out of n total, the remaining n - y caught by mesh 1 are implicit. The choice of mesh 2 as the “success” is arbitrary; using mesh 1 would give identical results with only the sign of alpha flipped.
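
As a concrete illustration (a minimal sketch with made-up numbers, not values from our data): suppose that at one length class mesh 1 has selectivity 0.9, mesh 2 has selectivity 0.3, and the two gears fish with equal power. Then:

r1 <- 0.9; r2 <- 0.3; P2_P1 <- 1            # hypothetical values, for illustration only
p  <- (P2_P1 * r2) / (r1 + P2_P1 * r2)      # P(mesh 2 | caught) = 0.25
dbinom(3, size = 12, prob = p, log = TRUE)  # log-likelihood of seeing 3 of 12 fish in mesh 2

The SELECT log-likelihood simply sums such terms over all length classes, which is what the nll_select() function defined below does.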

Define Selectivity Curve Functions

Two common selectivity curves:

  1. Normal (dome-shaped) – typical for gillnets where very large fish can also escape
  2. Logistic (S-shaped) – typical for trawls with a fixed mesh size

# Normal (dome-shaped) selectivity with geometric similarity
sel_normal_geom <- function(L, mesh, k, sig_ratio) {
  # k = L50/mesh ratio (constant across mesh sizes)
  # sig_ratio = sigma/mesh ratio (constant across mesh sizes)
  L50 <- k * mesh
  sigma <- sig_ratio * mesh
  exp(-0.5 * ((L - L50) / sigma)^2)
}

# Logistic (S-shaped) selectivity with geometric similarity
sel_logistic_geom <- function(L, mesh, k, SR_ratio) {
  # k = L50/mesh ratio (constant across mesh sizes)
  # SR_ratio = selection range/mesh ratio (constant across mesh sizes)
  L50 <- k * mesh
  SR <- SR_ratio * mesh
  1 / (1 + exp(-log(3) * 2 * (L - L50) / SR))
}

Prepare Data for SELECT Analysis

select_df <- dat %>%
  mutate(
    n = C1 + C2,           # Total caught at this length
    y = C2,                # Number caught by mesh2
    p_obs = y / n          # Observed proportion
  ) %>%
  filter(n >= 3)           # Minimum sample size for reliable proportions

cat(sprintf("  Length classes: %d\n", nrow(select_df)))
##   Length classes: 31
cat(sprintf("  Total fish: %d\n", sum(select_df$n)))
##   Total fish: 879

Visualize Observed Proportions

Before fitting, let’s look at what we’re trying to model:

p3 <- ggplot(select_df, aes(x = length, y = p_obs)) +
  geom_point(aes(size = n), alpha = 0.6, color = "#065A82") +
  geom_hline(yintercept = 0.5, linetype = "dashed", color = "gray50") +
  scale_size_continuous(name = "Sample\nsize", range = c(1, 5)) +
  coord_cartesian(ylim = c(0, 1)) +
  labs(
    title = "Observed Conditional Proportions",
    subtitle = paste0("P(caught by ", mesh2, "mm | caught by either mesh)"),
    x = "Length (cm)",
    y = "Proportion caught by larger mesh"
  )

print(p3)

Task: Describe the pattern you see. At what length does the proportion cross 0.5? How does this relate to \(L^*\) from the Holt method?
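
One quick way to estimate the crossing length from the observed proportions (a rough sketch using linear interpolation; it ignores sample sizes, so treat the result as approximate):

# Interpolate the observed proportions and find where they come closest to 0.5
L_grid   <- seq(min(select_df$length), max(select_df$length), by = 0.1)
p_interp <- approx(select_df$length, select_df$p_obs, xout = L_grid)$y
L_grid[which.min(abs(p_interp - 0.5))]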

Define the Negative Log-Likelihood Function

This is the heart of the SELECT method. We maximize the likelihood of observing our data given the selectivity parameters.

nll_select <- function(par, data, mesh1, mesh2, sel_type = "normal") {
  
  # Extract parameters (on transformed scale for optimization)
  k <- exp(par[1])          # Ensures k > 0
  spread <- exp(par[2])     # sig_ratio or SR_ratio > 0
  alpha <- par[3]           # log(P2/P1) - can be any real number
  
  L <- data$length
  y <- data$y
  n <- data$n
  
  # Calculate selectivity for each mesh
  if (sel_type == "normal") {
    r1 <- sel_normal_geom(L, mesh1, k, spread)
    r2 <- sel_normal_geom(L, mesh2, k, spread)
  } else if (sel_type == "logistic") {
    r1 <- sel_logistic_geom(L, mesh1, k, spread)
    r2 <- sel_logistic_geom(L, mesh2, k, spread)
  }
  
  # Relative fishing power
  P2_P1 <- exp(alpha)
  
  # Conditional probability of capture by mesh2
  p <- (P2_P1 * r2) / (r1 + P2_P1 * r2)
  
  # Avoid numerical issues (log of 0 or 1)
  eps <- 1e-10
  p <- pmin(pmax(p, eps), 1 - eps)
  
  # Binomial negative log-likelihood
  -sum(dbinom(y, n, p, log = TRUE))
}

Fit the Normal (Dome-Shaped) Model

This model is appropriate for gillnets, where fish can be too small to be caught (mesh passes through) or too large (mesh doesn’t stretch enough).

# Starting values from Holt estimate
k_start <- L_star / sqrt(mesh1 * mesh2)
start_normal <- c(log(k_start), log(0.12), 0)

# Optimize
fit_normal <- optim(
  par = start_normal,
  fn = nll_select,
  data = select_df,
  mesh1 = mesh1,
  mesh2 = mesh2,
  sel_type = "normal",
  method = "BFGS",
  hessian = TRUE
)

# Check convergence
if (fit_normal$convergence != 0) {
  warning("Normal model did not converge!")
} else {
  cat("Normal model converged successfully.\n\n")
}
## Normal model converged successfully.
# Extract estimates (back-transform from log scale)
k_normal <- exp(fit_normal$par[1])
sig_ratio_normal <- exp(fit_normal$par[2])
alpha_normal <- fit_normal$par[3]

# Standard errors of the transformed parameters from the inverse Hessian
# (back-transformed with a delta-method approximation in the output below)
se_normal <- sqrt(diag(solve(fit_normal$hessian)))

# Display model results
cat(sprintf("  k (L50/mesh)      = %.4f (SE: %.4f)\n", k_normal, se_normal[1] * k_normal))
##   k (L50/mesh)      = 1.0392 (SE: 0.5866)
cat(sprintf("  sig_ratio         = %.4f (SE: %.4f)\n", sig_ratio_normal, se_normal[2] * sig_ratio_normal))
##   sig_ratio         = 0.1180 (SE: 0.0330)
cat(sprintf("  alpha (log P2/P1) = %.4f (SE: %.4f)\n\n", alpha_normal, se_normal[3]))
##   alpha (log P2/P1) = -0.4443 (SE: 9.6768)
# Derived parameters
cat(sprintf("  L50 (mesh1=%dmm) = %.1f cm\n", mesh1, k_normal * mesh1))
##   L50 (mesh1=80mm) = 83.1 cm
cat(sprintf("  L50 (mesh2=%dmm) = %.1f cm\n", mesh2, k_normal * mesh2))
##   L50 (mesh2=100mm) = 103.9 cm
cat(sprintf("  Selection range (mesh1) ≈ %.1f cm\n", 2 * sig_ratio_normal * mesh1))
##   Selection range (mesh1) ≈ 18.9 cm
cat(sprintf("  Selection range (mesh2) ≈ %.1f cm\n", 2 * sig_ratio_normal * mesh2))
##   Selection range (mesh2) ≈ 23.6 cm
cat(sprintf("  Relative fishing power P2/P1 = %.3f\n\n", exp(alpha_normal)))
##   Relative fishing power P2/P1 = 0.641
# AIC for model comparison
aic_normal <- 2 * fit_normal$value + 2 * length(fit_normal$par)
cat(sprintf("AIC (Normal): %.2f\n", aic_normal))
## AIC (Normal): 81.88

Fit the Logistic (S-Shaped) Model

This model is appropriate for trawls, where fish above a certain size are always retained.

# Starting values
start_logistic <- c(log(k_start), log(0.2), 0)

# Optimize
fit_logistic <- optim(
  par = start_logistic,
  fn = nll_select,
  data = select_df,
  mesh1 = mesh1,
  mesh2 = mesh2,
  sel_type = "logistic",
  method = "BFGS",
  hessian = TRUE
)

# Extract estimates
k_logistic <- exp(fit_logistic$par[1])
SR_ratio_logistic <- exp(fit_logistic$par[2])
alpha_logistic <- fit_logistic$par[3]

# Standard errors from Hessian
se_logistic <- sqrt(diag(solve(fit_logistic$hessian)))

# Display results
cat(sprintf("  k (L50/mesh)      = %.4f (SE: %.4f)\n", k_logistic, se_logistic[1] * k_logistic))
##   k (L50/mesh)      = 1.0493 (SE: 0.0127)
cat(sprintf("  SR_ratio          = %.4f (SE: %.4f)\n", SR_ratio_logistic, se_logistic[2] * SR_ratio_logistic))
##   SR_ratio          = 0.0800 (SE: 0.0039)
cat(sprintf("  alpha (log P2/P1) = %.4f (SE: %.4f)\n\n", alpha_logistic, se_logistic[3]))
##   alpha (log P2/P1) = 2.9385 (SE: 0.3057)
# Derived parameters
cat("Derived selectivity parameters:\n")
## Derived selectivity parameters:
cat(sprintf("  L50 (mesh1=%dmm) = %.1f cm\n", mesh1, k_logistic * mesh1))
##   L50 (mesh1=80mm) = 83.9 cm
cat(sprintf("  L50 (mesh2=%dmm) = %.1f cm\n", mesh2, k_logistic * mesh2))
##   L50 (mesh2=100mm) = 104.9 cm
cat(sprintf("  Selection range (mesh1) = %.1f cm\n", SR_ratio_logistic * mesh1))
##   Selection range (mesh1) = 6.4 cm
cat(sprintf("  Selection range (mesh2) = %.1f cm\n", SR_ratio_logistic * mesh2))
##   Selection range (mesh2) = 8.0 cm
cat(sprintf("  Relative fishing power P2/P1 = %.3f\n\n", exp(alpha_logistic)))
##   Relative fishing power P2/P1 = 18.888
# AIC
aic_logistic <- 2 * fit_logistic$value + 2 * length(fit_logistic$par)
cat(sprintf("AIC (Logistic): %.2f\n", aic_logistic))
## AIC (Logistic): 101.66

Model Comparison

We use AIC (Akaike Information Criterion) to compare models. Lower AIC indicates a better balance between fit and complexity.

cat(sprintf("  Normal (dome-shaped):  AIC = %.2f\n", aic_normal))
##   Normal (dome-shaped):  AIC = 81.88
cat(sprintf("  Logistic (S-shaped):   AIC = %.2f\n", aic_logistic))
##   Logistic (S-shaped):   AIC = 101.66
cat(sprintf("  ΔAIC = %.2f\n\n", abs(aic_normal - aic_logistic)))
##   ΔAIC = 19.78
if (aic_normal < aic_logistic) {
  cat(">> Best model: NORMAL (dome-shaped)\n")
  cat("   This is expected for gillnet-like selectivity.\n")
  best_type <- "normal"
} else {
  cat(">> Best model: LOGISTIC (S-shaped)\n")
  cat("   This suggests trawl-like selectivity.\n")
  best_type <- "logistic"
}
## >> Best model: NORMAL (dome-shaped)
##    This is expected for gillnet-like selectivity.

Task: Why would you expect gillnets to have dome-shaped selectivity while trawls have S-shaped selectivity? Think about the physical mechanisms of fish capture.


Part 5: Visualization and Diagnostics

Plot Fitted Selectivity Curves

# Generate length sequence for plotting
L_seq <- seq(min(dat$length), max(dat$length), by = 0.5)

# Calculate selectivity curves for best model
if (best_type == "normal") {
  sel1 <- sel_normal_geom(L_seq, mesh1, k_normal, sig_ratio_normal)
  sel2 <- sel_normal_geom(L_seq, mesh2, k_normal, sig_ratio_normal)
} else {
  sel1 <- sel_logistic_geom(L_seq, mesh1, k_logistic, SR_ratio_logistic)
  sel2 <- sel_logistic_geom(L_seq, mesh2, k_logistic, SR_ratio_logistic)
}

# Combine for plotting
sel_curves <- bind_rows(
  data.frame(length = L_seq, selectivity = sel1, mesh = factor(mesh1)),
  data.frame(length = L_seq, selectivity = sel2, mesh = factor(mesh2))
)

# Plot
p4 <- ggplot(sel_curves, aes(x = length, y = selectivity, color = mesh)) +
  geom_line(linewidth = 1.2) +
  scale_color_manual(values = c("#065A82", "#00A896"), name = "Mesh (mm)") +
  coord_cartesian(ylim = c(0, 1.05)) +
  labs(
    title = paste0("Fitted Selectivity Curves (", best_type, " model)"),
    x = "Length (cm)",
    y = "Relative selectivity r(L)"
  )

print(p4)

Residual Analysis

Good models should have residuals randomly scattered around zero with no systematic patterns.

# Calculate predicted probabilities for the best model
if (best_type == "normal") {
  r1_pred <- sel_normal_geom(select_df$length, mesh1, k_normal, sig_ratio_normal)
  r2_pred <- sel_normal_geom(select_df$length, mesh2, k_normal, sig_ratio_normal)
  P2_P1 <- exp(alpha_normal)
} else {
  r1_pred <- sel_logistic_geom(select_df$length, mesh1, k_logistic, SR_ratio_logistic)
  r2_pred <- sel_logistic_geom(select_df$length, mesh2, k_logistic, SR_ratio_logistic)
  P2_P1 <- exp(alpha_logistic)
}

# Calculate deviance residuals
select_df <- select_df %>%
  mutate(
    p_pred = (P2_P1 * r2_pred) / (r1_pred + P2_P1 * r2_pred),
    y_exp = n * p_pred,
    dev_resid = sign(y - y_exp) * sqrt(2 * (
      ifelse(y > 0, y * log(y / y_exp), 0) +
      ifelse(n - y > 0, (n - y) * log((n - y) / (n - y_exp)), 0)
    ))
  )

# Plot residuals
p5 <- ggplot(select_df, aes(x = length, y = dev_resid)) +
  geom_hline(yintercept = 0, color = "gray50") +
  geom_hline(yintercept = c(-2, 2), linetype = "dashed", color = "gray70") +
  geom_point(aes(size = n), alpha = 0.6, color = "#065A82") +
  scale_size_continuous(name = "Sample\nsize", range = c(1, 4)) +
  labs(
    title = "Deviance Residuals",
    subtitle = "Points should be randomly scattered between ±2",
    x = "Length (cm)",
    y = "Deviance residual"
  )

print(p5)

Interpreting residuals: Points outside the dashed lines (±2) suggest potential lack of fit at those lengths. Systematic patterns (e.g., all positive at small sizes, all negative at large sizes) would indicate model misspecification.
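
To list the length classes the model fits poorly (a small convenience sketch using the columns created above):

# Length classes with large deviance residuals (potential lack of fit)
select_df %>%
  filter(abs(dev_resid) > 2) %>%
  select(length, n, y, p_obs, p_pred, dev_resid)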


Part 6: Exercises and Discussion

Exercise 1: Interpreting L50

Based on our analysis:

  • The 80mm mesh has \(L_{50}\) = 83.1 cm
  • The 100mm mesh has \(L_{50}\) = 103.9 cm

Question: What do these values mean for:

  1. A fishery manager setting regulations?
  2. A fisher choosing gear?

Exercise 2: Minimum Landing Size

Suppose the length at 50% maturity (\(L_m\)) for this species is 90 cm.

Question: Which mesh size would better protect juveniles? Why?

Hint: Compare \(L_{50}\) of each mesh to \(L_m\). What proportion of immature fish would be retained by each mesh?
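
A starting point for the calculation (a sketch using the normal-model parameters fitted in Part 4; note it gives the relative retention at a single length, not the overall proportion of immature fish in the catch):

Lm <- 90  # length at 50% maturity (cm)

# Relative retention of a fish of length Lm by each mesh (normal model)
sel_normal_geom(Lm, mesh1, k_normal, sig_ratio_normal)   # 80 mm mesh
sel_normal_geom(Lm, mesh2, k_normal, sig_ratio_normal)   # 100 mm mesh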

Exercise 3: Your Turn

Try modifying the analysis:

  1. Change the bin_width parameter to 1 cm or 4 cm. How does this affect your estimates?
  2. If you have your own data, replace the data file and rerun the analysis
  3. Add confidence intervals to the selectivity curves using the Hessian standard errors (one way to get started is sketched below)
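
One way to get started with point 3 (a minimal sketch: a Wald-type interval for \(k\) from the normal model, back-transformed from the log scale used during optimization; extending this to a full confidence band around the curves is left to you):

# Approximate 95% CI for k, using the standard error computed earlier from the Hessian
log_k_hat <- fit_normal$par[1]
log_k_se  <- se_normal[1]
exp(log_k_hat + c(-1.96, 1.96) * log_k_se)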

Summary

Key Concepts

1. Holt Method (1963)

  • Quick, intuitive linear regression approach
  • Good for initial exploration and starting values
  • Limitations: Assumes linear log-ratio, no proper error model

2. SELECT Method (Millar & Fryer 1999)

  • Gold standard for selectivity estimation
  • Proper statistical likelihood framework (binomial)
  • Allows different curve shapes (normal, logistic, etc.)
  • Provides standard errors and enables model comparison

3. Key Parameters

  • \(L_{50}\): Length at 50% (or maximum) selectivity
  • Selection range: Width of the selectivity curve (~\(2\sigma\) for the normal curve)
  • Relative fishing power (\(P_2/P_1\)): Efficiency ratio between the two gears

4. Geometric Similarity

Under this assumption, selectivity scales with mesh size:

  • \(L_{50} = k \times \text{mesh}\)
  • Spread of the curve (\(\sigma\) for the normal, \(SR\) for the logistic) = spread ratio \(\times\) mesh

This allows us to estimate a single set of parameters (\(k\), \(\sigma_{ratio}\)) that apply to any mesh size.
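
For example, to predict the selectivity of a mesh size that was never fished (a sketch; the 90 mm mesh below is hypothetical and simply illustrates how the parameters generalise):

# Predicted L50 and selectivity curve for a hypothetical 90 mm mesh (normal model)
mesh_new <- 90
k_normal * mesh_new                                           # predicted L50 (cm)
sel_normal_geom(L_seq, mesh_new, k_normal, sig_ratio_normal)  # predicted curve over L_seq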


Great work! You have fitted your first size selectivity models.