This report analyzes mud crab data from core no-take zones (Zona Inti) versus control areas (Zona Pemanfaatan Terbatas) using fisheries-independent and -dependent data, linear mixed models, GAMs, and various data visualizations. The second half of this report looks at the fisheries dependent data set.
# install.packages(c("tidyverse", "janitor", "lubridate", "skimr", "fastDummies", "sjPlot"))
library(tidyverse)
library(janitor)
library(lubridate)
library(skimr)
library(fastDummies)
library(broom)
library(broom.mixed)
library(lme4)
library(mgcv)
library(ggeffects)
library(patchwork)
library(gratia)
library(stringr)
library(sjPlot)
# 1. Load dataset
mudcrab_data <- read_csv("mudcrab_ind_2025.csv")
## Rows: 689 Columns: 17
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (9): Tanggal (date), Nama Desa, Nama Zona, Pengambil data (Surveyor), K...
## dbl (5): # crabs, Berat (weight in gr), Panjang Kerapas (carapace length in...
## num (2): Koordinat Y (Latitude), Koordinat X (Longitude)
## time (1): Jam bubu dipasang (time trap deployed)
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(mudcrab_data)
## Rows: 689
## Columns: 17
## $ `Tanggal (date)` <chr> "12/8/2024", "12/7/2024", "1…
## $ `Nama Desa` <chr> "Mengkalang Jambu", "Mengkal…
## $ `Nama Zona` <chr> "Zona Pemanfaatan terbatas",…
## $ `# crabs` <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ `Pengambil data (Surveyor)` <chr> "Yada,Iskandar,Rendi", "Phig…
## $ `Kode alat tangkap (code trap)` <chr> "B 18", "B 25", "B 18", "B 4…
## $ `Kode GPS (Number waypoint)` <chr> "179", "25", "261", "41", "2…
## $ `Koordinat Y (Latitude)` <dbl> 10922599792, 109158989, 1092…
## $ `Koordinat X (Longitude)` <dbl> -54510200, -5237870216, -576…
## $ `Jam bubu dipasang (time trap deployed)` <time> 07:14:00, 07:49:00, 09:25:0…
## $ `Jam bubu diangkat (retrieval time)` <chr> "10:21", "10:56", "13:49", "…
## $ `Jenis Kelamin (Sex of mud crab)` <chr> "Jantan", "Jantan", "Jantan"…
## $ `Berat (weight in gr)` <dbl> 810, 655, 645, 585, 530, 515…
## $ `Panjang Kerapas (carapace length in mm)` <dbl> 100, 97, 90, 90, 90, 90, 90,…
## $ `Lebar Kerapas (carapace width in mm)` <dbl> 140, 140, 130, 130, 130, 85,…
## $ `Kode foto (Image Code)` <dbl> 215, 89, 285, 62, 231, 89, 2…
## $ `CPUE Kg/Bubu` <chr> NA, NA, NA, NA, NA, NA, NA, …
mudcrab_data
## # A tibble: 689 × 17
## `Tanggal (date)` `Nama Desa` `Nama Zona` `# crabs` Pengambil data (Surv…¹
## <chr> <chr> <chr> <dbl> <chr>
## 1 12/8/2024 Mengkalang Jam… Zona Peman… 1 Yada,Iskandar,Rendi
## 2 12/7/2024 Mengkalang Zona Peman… 1 Phiguradi Bangun,Hams…
## 3 12/10/2024 Dabong Zona Peman… 1 Phiguradi Bangun, Pak…
## 4 12/6/2024 Sungai Nibung zona inti 1 Yada,Iskandar,Rendi
## 5 12/10/2024 Dabong Zona Peman… 1 Phiguradi Bangun, Pak…
## 6 12/6/2024 Sungai Nibung zona inti 1 Yada,Iskandar,Rendi
## 7 12/8/2024 Mengkalang Jam… Zona Peman… 1 Yada,Iskandar,Rendi
## 8 12/10/2024 Dabong Zona Peman… 1 Phiguradi Bangun, Pak…
## 9 12/8/2024 Mengkalang Jam… Zona Peman… 1 Yada,Iskandar,Rendi
## 10 12/7/2024 Mengkalang zona inti 1 Yada,Iskandar,Rendi
## # ℹ 679 more rows
## # ℹ abbreviated name: ¹`Pengambil data (Surveyor)`
## # ℹ 12 more variables: `Kode alat tangkap (code trap)` <chr>,
## # `Kode GPS (Number waypoint)` <chr>, `Koordinat Y (Latitude)` <dbl>,
## # `Koordinat X (Longitude)` <dbl>,
## # `Jam bubu dipasang (time trap deployed)` <time>,
## # `Jam bubu diangkat (retrieval time)` <chr>, …
# 2. Clean dataset
mudcrab_data <- mudcrab_data %>%
clean_names() %>%
mutate(
koordinat_y_latitude = as.numeric(str_replace(koordinat_y_latitude, ",", ".")),
koordinat_x_longitude = as.numeric(str_replace(koordinat_x_longitude, ",", ".")),
tanggal_date = dmy(tanggal_date),
berat_weight_in_gr = as.numeric(berat_weight_in_gr),
panjang_kerapas_carapace_length_in_mm = as.numeric(panjang_kerapas_carapace_length_in_mm),
lebar_kerapas_carapace_width_in_mm = as.numeric(lebar_kerapas_carapace_width_in_mm),
cpue_kg_bubu = as.numeric(cpue_kg_bubu),
jenis_kelamin_sex_of_mud_crab = str_to_title(jenis_kelamin_sex_of_mud_crab),
nama_desa = as.factor(nama_desa),
nama_zona = as.factor(nama_zona),
jenis_kelamin_sex_of_mud_crab = as.factor(jenis_kelamin_sex_of_mud_crab),
treatment = ifelse(nama_zona == "zona inti", 1, 0)
)
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `cpue_kg_bubu = as.numeric(cpue_kg_bubu)`.
## Caused by warning:
## ! NAs introduced by coercion
# Optional: Missing summary
print(mudcrab_data %>% summarise_all(~sum(is.na(.))))
## # A tibble: 1 × 18
## tanggal_date nama_desa nama_zona number_crabs pengambil_data_surveyor
## <int> <int> <int> <int> <int>
## 1 0 0 0 380 0
## # ℹ 13 more variables: kode_alat_tangkap_code_trap <int>,
## # kode_gps_number_waypoint <int>, koordinat_y_latitude <int>,
## # koordinat_x_longitude <int>, jam_bubu_dipasang_time_trap_deployed <int>,
## # jam_bubu_diangkat_retrieval_time <int>,
## # jenis_kelamin_sex_of_mud_crab <int>, berat_weight_in_gr <int>,
## # panjang_kerapas_carapace_length_in_mm <int>,
## # lebar_kerapas_carapace_width_in_mm <int>, kode_foto_image_code <int>, …
# 3. Filter out traps with no crabs caught (0s)
crab_data_filtered <- mudcrab_data %>%
filter(!is.na(berat_weight_in_gr), berat_weight_in_gr > 0)
# Summarize total crabs caught (Catch Abundance)
crab_counts_summary <- crab_data_filtered %>%
group_by(treatment) %>%
summarise(total_crabs = sum(number_crabs, na.rm = TRUE)) %>% # Correct column name
mutate(
treatment_label = ifelse(treatment == 1, "Treatment (Zona Inti)", "Control (Zona Pemanfaatan Terbatas)")
)
crab_counts_summary
## # A tibble: 2 × 3
## treatment total_crabs treatment_label
## <dbl> <dbl> <chr>
## 1 0 162 Control (Zona Pemanfaatan Terbatas)
## 2 1 147 Treatment (Zona Inti)
summary_stats <- crab_data_filtered %>%
group_by(treatment) %>%
summarise(
mean_weight = mean(berat_weight_in_gr),
se_weight = sd(berat_weight_in_gr) / sqrt(n()),
mean_length = mean(panjang_kerapas_carapace_length_in_mm),
se_length = sd(panjang_kerapas_carapace_length_in_mm) / sqrt(n()),
mean_width = mean(lebar_kerapas_carapace_width_in_mm),
se_width = sd(lebar_kerapas_carapace_width_in_mm) / sqrt(n())
) %>%
mutate(treatment_label = ifelse(treatment == 1, "Treatment (Zona Inti)", "Control (Zona Pemanfaatan Terbatas)"))
summary_weight <- crab_data_filtered %>%
group_by(treatment, nama_desa) %>%
summarise(
mean_weight = mean(berat_weight_in_gr),
se_weight = sd(berat_weight_in_gr) / sqrt(n())
) %>%
mutate(
treatment_label = ifelse(treatment == 1, "Treatment (Zona Inti)", "Control (Zona Pemanfaatan Terbatas)")
)
## `summarise()` has grouped output by 'treatment'. You can override using the
## `.groups` argument.
# Step 3: Create ggplot
ggplot(summary_weight, aes(x = nama_desa, y = mean_weight, fill = treatment_label)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.7), width = 0.6) +
geom_errorbar(aes(ymin = mean_weight - se_weight, ymax = mean_weight + se_weight),
position = position_dodge(width = 0.7), width = 0.2) +
labs(
title = "Mean Crab Weight by Treatment and Village",
x = "Village",
y = "Mean Weight (grams)",
fill = "Treatment Group"
) +
theme_minimal(base_size = 9) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(face = "bold", hjust = 0.5)
)
summary_length_width <- crab_data_filtered %>%
group_by(treatment, nama_desa) %>%
summarise(
mean_length = mean(panjang_kerapas_carapace_length_in_mm, na.rm = TRUE),
se_length = sd(panjang_kerapas_carapace_length_in_mm, na.rm = TRUE) / sqrt(n()),
mean_width = mean(lebar_kerapas_carapace_width_in_mm, na.rm = TRUE),
se_width = sd(lebar_kerapas_carapace_width_in_mm, na.rm = TRUE) / sqrt(n())
) %>%
mutate(
treatment_label = ifelse(treatment == 1, "Treatment (Zona Inti)", "Control (Zona Pemanfaatan Terbatas)")
)
## `summarise()` has grouped output by 'treatment'. You can override using the
## `.groups` argument.
# Plot: Plot Mean Carapace Length
ggplot(summary_length_width, aes(x = nama_desa, y = mean_length, fill = treatment_label)) +
geom_bar(stat = "identity", position = position_dodge(0.7), width = 0.6) +
geom_errorbar(aes(ymin = mean_length - se_length, ymax = mean_length + se_length),
position = position_dodge(0.7), width = 0.2) +
labs(title = "Mean Carapace Length by Treatment & Village",
x = "Village", y = "Mean Length (mm)", fill = "Treatment Group") +
theme_minimal(base_size = 9) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Plot: Plot Mean Carapace Width
ggplot(summary_length_width, aes(x = nama_desa, y = mean_width, fill = treatment_label)) +
geom_bar(stat = "identity", position = position_dodge(0.7), width = 0.6) +
geom_errorbar(aes(ymin = mean_width - se_width, ymax = mean_width + se_width),
position = position_dodge(0.7), width = 0.2) +
labs(title = "Mean Carapace Width by Treatment & Village",
x = "Village", y = "Mean Width (mm)", fill = "Treatment Group") +
theme_minimal(base_size = 9) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# STEP 0: Prepare data
crab_data_filtered <- crab_data_filtered %>%
mutate(
log_weight = log(berat_weight_in_gr),
log_length = log(panjang_kerapas_carapace_length_in_mm),
log_width = log(lebar_kerapas_carapace_width_in_mm)
)
# STEP 1: Fit linear models (LMs)
lm_weight <- lm(log_weight ~ treatment, data = crab_data_filtered)
lm_length <- lm(log_length ~ treatment, data = crab_data_filtered)
lm_width <- lm(log_width ~ treatment, data = crab_data_filtered)
# STEP 2: Fit linear mixed models (LMMs)
model_weight <- lmer(log_weight ~ treatment + (1 | nama_desa), data = crab_data_filtered, REML = FALSE)
model_length <- lmer(log_length ~ treatment + (1 | nama_desa), data = crab_data_filtered, REML = FALSE)
## boundary (singular) fit: see help('isSingular')
model_width <- lmer(log_width ~ treatment + (1 | nama_desa), data = crab_data_filtered, REML = FALSE)
# STEP 3: Compare AIC values
aic_results <- data.frame(
Response = rep(c("Weight", "Length", "Width"), each = 2),
Model = rep(c("LM (No Random Effect)", "LMM (Village RE)"), times = 3),
AIC = c(
AIC(lm_weight), AIC(model_weight),
AIC(lm_length), AIC(model_length),
AIC(lm_width), AIC(model_width)
)
)
# Plot AIC comparison
ggplot(aic_results, aes(x = Response, y = AIC, fill = Model)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.7)) +
geom_text(aes(label = round(AIC, 1)),
position = position_dodge(width = 0.7),
vjust = -0.5, size = 4) +
labs(title = "AIC: LM vs LMM Comparison",
y = "AIC Value", x = "Response") +
theme_minimal(base_size = 14) +
scale_fill_manual(values = c("steelblue", "darkorange")) +
theme(legend.position = "top")
# STEP 4: Generate predictions from LM models & back-transform
pred_weight_lm <- ggpredict(lm_weight, terms = "treatment") %>%
mutate(across(c(predicted, conf.low, conf.high), exp))
pred_length_lm <- ggpredict(lm_length, terms = "treatment") %>%
mutate(across(c(predicted, conf.low, conf.high), exp))
pred_width_lm <- ggpredict(lm_width, terms = "treatment") %>%
mutate(across(c(predicted, conf.low, conf.high), exp))
# STEP 5: Visualize treatment effect (% change)
lm_effects <- bind_rows(
tidy(lm_weight, conf.int = TRUE) %>% mutate(metric = "Weight"),
tidy(lm_length, conf.int = TRUE) %>% mutate(metric = "Length"),
tidy(lm_width, conf.int = TRUE) %>% mutate(metric = "Width")
) %>%
filter(str_detect(term, "(?i)treatment")) %>%
mutate(
estimate_pct = (exp(estimate) - 1) * 100,
conf.low_pct = (exp(conf.low) - 1) * 100,
conf.high_pct = (exp(conf.high) - 1) * 100
)
ggplot(lm_effects, aes(x = metric, y = estimate_pct)) +
geom_point(size = 3, color = "darkgreen") +
geom_errorbar(aes(ymin = conf.low_pct, ymax = conf.high_pct), width = 0.2, color = "darkgreen") +
geom_hline(yintercept = 0, linetype = "dotted") +
coord_flip() +
labs(
title = "Effect of Treatment on Crab Size Metrics",
subtitle = "Based on Linear Models (LM): log(size) ~ treatment",
x = "Crab Metric",
y = "Estimated % Change Due to Treatment (with 95% CI)",
caption = "Model: LM without random effects"
) +
theme_minimal(base_size = 13)
Linear models (LMs) and linear mixed models (LMMs with village as a random effect) were fit for crab weight, length, and width. AIC values for each model were compared to determine which model best fits the data.
Results:
- Weight: LM AIC = 497.7, LMM AIC = 499.6 → LM better
- Length: LM AIC = -135.7, LMM AIC = -133.7 → LM better
- Width: LM AIC = -137.6, LMM AIC = -136.3 → LM better
All LMs had slightly lower AIC than LMMs, indicating that including a
random effect for nama_desa (village) does not
significantly improve model fit. Therefore, LMs are preferred for
interpretation.
Predictions (back-transformed from log scale):
- Treatment areas show higher mean weight and length than control
areas.
- Width differences are minimal and CIs overlap.
Recommendation:
→ Use LM models for inference.
→ Emphasize treatment effects on crab weight and length in
reporting.
#Routine Mud Crab Data Analysis – Kubu Raya
This chapter analyzes routine monitoring data from Kubu Raya district, focusing on mud crab landings from 2021 to 2024. We explore trends in catch weight, number of individuals, and average crab size across time.
# ======================================================
# Routine Mud Crab Data Analysis - Kubu Raya
# ======================================================
# 1. Load Required Libraries
library(tidyverse)
library(lubridate)
library(janitor)
# 2. Load and Clean Raw Dataset
file_path <- "Pendataan Rutin_Kubu Raya.csv"
data <- read_csv(file_path, locale = locale(encoding = "UTF-8")) %>%
clean_names() %>%
mutate(
tanggal = suppressWarnings(dmy(tanggal)),
berat_total = as.numeric(berat_total),
jumlah_ekor = as.numeric(jumlah_ekor),
rata_rata_berat_ekor = as.numeric(str_replace(rata_rata_berat_ekor, ",", ".")),
harga = parse_number(harga),
total_harga_rp = parse_number(total_harga_rp),
kategori = as.factor(kategori),
kelas = as.factor(kelas),
jenis_biota = as.factor(jenis_biota)
)
# 3. Filter for Valid Mud Crab Records
mudcrab_data <- data %>%
filter(
!is.na(tanggal),
str_to_lower(kategori) == "kepiting",
!is.na(berat_total), berat_total > 0,
!is.na(jumlah_ekor), jumlah_ekor > 0
) %>%
mutate(
avg_weight = if_else(is.na(rata_rata_berat_ekor) | rata_rata_berat_ekor == 0,
berat_total / jumlah_ekor,
rata_rata_berat_ekor),
year = year(tanggal),
month = month(tanggal, label = TRUE, abbr = FALSE),
month_year = floor_date(tanggal, "month")
)
# 4. Check Data Coverage
mudcrab_data %>%
summarise(
min_date = min(month_year, na.rm = TRUE),
max_date = max(month_year, na.rm = TRUE),
n_months = n_distinct(month_year)
)
## # A tibble: 1 × 3
## min_date max_date n_months
## <date> <date> <int>
## 1 2021-09-01 2024-08-01 36
# --> Data spans 36 months: from Sept 2021 to Aug 2024.
This section provides summary statistics from the routine dataset, including changes in total and average crab weight, number of individuals, and seasonal trends across years and months.
# 6. Annual Summary Statistics
mean_weight_summary <- mudcrab_data %>%
group_by(year) %>%
summarise(
mean_weight = mean(berat_total),
se_weight = sd(berat_total) / sqrt(n())
)
# --> Total weight increased from ~1.7 kg (2021) to ~3.0 kg (2024).
avg_individuals_summary <- mudcrab_data %>%
group_by(year) %>%
summarise(
mean_individuals = mean(jumlah_ekor),
se_individuals = sd(jumlah_ekor) / sqrt(n())
)
# --> Avg. individuals per record rose from ~6.8 to ~10.1 over years.
total_individuals_summary <- mudcrab_data %>%
group_by(year) %>%
summarise(
total_individuals = sum(jumlah_ekor),
se_total = sd(jumlah_ekor) / sqrt(n())
)
# --> Total crabs recorded: peaked in 2023 (~66,000), dropped in 2024.
# 7. Monthly Summary Statistics
monthly_weight_summary <- mudcrab_data %>%
group_by(month) %>%
summarise(
mean_total_weight = mean(berat_total),
se_total_weight = sd(berat_total) / sqrt(n())
)
# --> Peak total weights: April & March (~3.2 and ~3 kg)
monthly_avg_weight_summary <- mudcrab_data %>%
group_by(month) %>%
summarise(
mean_avg_weight = mean(avg_weight),
se_avg_weight = sd(avg_weight) / sqrt(n())
)
# --> Highest average weight per crab: Feb & Aug (~0.52 and ~0.42 kg)
This section summarizes CPUE trends based on trip-level catch data per fisher per day, including outlier removal and visual trends across months.
# 1. Create CPUE dataset (total weight per fisher per day)
cpue_data <- mudcrab_data %>%
group_by(tanggal, nama_nelayan) %>%
summarise(
total_weight_kg = sum(berat_total, na.rm = TRUE),
sungai = first(sungai),
kelas = first(kelas),
jenis_biota = first(jenis_biota),
kategori = first(kategori),
.groups = "drop"
) %>%
mutate(
year = year(tanggal),
month = month(tanggal, label = TRUE, abbr = FALSE),
month_year = as.Date(format(tanggal, "%Y-%m-01"))
)
# 2. Detect outliers using IQR
iqr_bounds <- quantile(cpue_data$total_weight_kg, probs = c(0.25, 0.75), na.rm = TRUE)
iqr_range <- diff(iqr_bounds)
lower_bound <- iqr_bounds[1] - 1.5 * iqr_range
upper_bound <- iqr_bounds[2] + 1.5 * iqr_range
# 3. Identify outliers
cpue_outliers <- cpue_data %>%
filter(total_weight_kg < lower_bound | total_weight_kg > upper_bound)
# 4. Cleaned dataset (non-outliers)
cpue_cleaned <- cpue_data %>%
filter(total_weight_kg >= lower_bound, total_weight_kg <= upper_bound)
# 5. Report counts
n_removed <- nrow(cpue_outliers)
cat("Number of outliers removed:", n_removed, "\n")
## Number of outliers removed: 549
cat("Original trip records:", nrow(cpue_data), "\n")
## Original trip records: 8985
cat("Records after outlier removal:", nrow(cpue_cleaned), "\n")
## Records after outlier removal: 8436
# 6. Outliers table (view or save)
outlier_table <- cpue_outliers %>%
arrange(desc(total_weight_kg)) %>%
select(tanggal, nama_nelayan, total_weight_kg, sungai, kelas, jenis_biota, kategori)
print(outlier_table)
## # A tibble: 549 × 7
## tanggal nama_nelayan total_weight_kg sungai kelas jenis_biota kategori
## <date> <chr> <dbl> <chr> <fct> <fct> <fct>
## 1 2023-03-08 Andri Db 104 Ada A Kepiting Kepiting
## 2 2022-01-24 Taslim 70.9 Ada A Kepiting Kepiting
## 3 2023-07-24 Budiman 70.5 Ada A Kepiting Kepiting
## 4 2022-03-29 Taslim 65.2 Ada A Kepiting Kepiting
## 5 2024-04-17 Meon 64.7 Tidak Ada A Kepiting Kepiting
## 6 2023-04-11 Edi Has 57.6 Tidak Ada A Kepiting Kepiting
## 7 2022-08-22 Hendriyadi 55.5 Ada A Kepiting Kepiting
## 8 2023-03-21 Edi Handoko 48.1 Tidak Ada A Kepiting Kepiting
## 9 2023-03-08 Edi Handoko 47.4 Tidak Ada A Kepiting Kepiting
## 10 2022-06-16 Yan Syah 46.6 Ada B Kepiting Kepiting
## # ℹ 539 more rows
# 7. Boxplot showing IQR bounds and outliers
ggplot(cpue_data, aes(y = total_weight_kg)) +
geom_boxplot(outlier.colour = "red", fill = "lightblue", width = 0.3) +
geom_hline(yintercept = lower_bound, linetype = "dashed", color = "orange") +
geom_hline(yintercept = upper_bound, linetype = "dashed", color = "orange") +
labs(
title = "Boxplot of Trip-level CPUE with IQR Thresholds",
y = "Total Weight per Trip (kg)"
) +
theme_minimal()
# 4. Visualize raw CPUE values over time (per trip)
ggplot(cpue_cleaned, aes(x = month_year, y = total_weight_kg)) +
geom_point(alpha = 0.3, color = "darkgreen") +
geom_smooth(method = "loess", se = TRUE, color = "blue", size = 1.2) +
scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
labs(
title = "Monthly CPUE (kg per trip) - Raw Data",
x = "Month-Year",
y = "CPUE (kg/trip)"
) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# 5. Plot monthly mean CPUE (kg/trip) with standard error bars
monthly_cpue_summary <- cpue_cleaned %>%
group_by(month_year) %>%
summarise(
mean_cpue = mean(total_weight_kg, na.rm = TRUE),
se_cpue = sd(total_weight_kg, na.rm = TRUE) / sqrt(n()),
.groups = "drop"
)
ggplot(monthly_cpue_summary, aes(x = month_year, y = mean_cpue)) +
geom_point(color = "darkgreen", size = 2) +
geom_errorbar(aes(ymin = mean_cpue - se_cpue, ymax = mean_cpue + se_cpue),
width = 5, color = "darkgreen") +
geom_smooth(method = "loess", se = TRUE, color = "blue", size = 1.2) +
scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
labs(
title = "Monthly Mean CPUE with Standard Error",
x = "Month-Year",
y = "Mean CPUE (kg/trip)"
) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
#Mixed Effects Model Selection for CPUE
This chapter explores various linear mixed-effects models to understand how seasonality, year, and fishing location affect CPUE. Models test different random effects and interactions.
# 2. Ensure categorical variables are factors & create season variable
cpue_data <- cpue_data %>%
mutate(
month = as.integer(month),
season = case_when(
month %in% c(12, 1, 2) ~ "high_season",
month %in% c(3, 4, 5) ~ "breeding",
month %in% c(6, 7, 8) ~ "night_tides",
month %in% c(9, 10, 11) ~ "afternoon_tides",
TRUE ~ NA_character_
),
season = factor(season, levels = c("high_season", "breeding", "night_tides", "afternoon_tides")),
year = factor(year),
sungai = factor(sungai),
nama_nelayan = factor(nama_nelayan)
)
# 3. Define multiple model formulas to test random effects and interactions
model_formulas <- list(
model1 = total_weight_kg ~ season + (1 | nama_nelayan),
model2 = total_weight_kg ~ year + (1 | nama_nelayan),
model3 = total_weight_kg ~ season + year + (1 | nama_nelayan),
model4 = total_weight_kg ~ season * year + (1 | nama_nelayan),
model5 = total_weight_kg ~ season + (1 | sungai),
model6 = total_weight_kg ~ year + (1 | sungai),
model7 = total_weight_kg ~ season + year + (1 | sungai),
model8 = total_weight_kg ~ season * year + (1 | sungai),
model9 = total_weight_kg ~ season + (1 | nama_nelayan) + (1 | sungai),
model10 = total_weight_kg ~ year + (1 | nama_nelayan) + (1 | sungai),
model11 = total_weight_kg ~ season + year + (1 | nama_nelayan) + (1 | sungai),
model12 = total_weight_kg ~ season * year + (1 | nama_nelayan) + (1 | sungai)
)
# 4. Fit all models using ML (REML = FALSE) and collect AIC scores
model_results <- lapply(model_formulas, function(f) {
mod <- lmer(f, data = cpue_data, REML = FALSE)
list(model = mod, aic = AIC(mod))
})
# 5. Identify the best model based on AIC
aic_values <- sapply(model_results, function(x) x$aic)
best_model_name <- names(which.min(aic_values))
best_model <- model_results[[best_model_name]]$model
cat("Best Model:", best_model_name, "\n")
## Best Model: model12
cat("AIC:", min(aic_values), "\n")
## AIC: 53365.63
# 6. View all models ranked by AIC
aic_table <- data.frame(
Model = names(aic_values),
AIC = aic_values
) %>%
arrange(AIC)
print(aic_table)
## Model AIC
## model12 model12 53365.63
## model10 model10 53370.23
## model11 model11 53370.55
## model4 model4 53386.78
## model2 model2 53392.38
## model3 model3 53393.08
## model9 model9 53442.63
## model1 model1 53481.67
## model6 model6 55419.51
## model7 model7 55420.35
## model8 model8 55420.89
## model5 model5 55521.94
# 7. Summary of the best model
summary(best_model)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: total_weight_kg ~ season * year + (1 | nama_nelayan) + (1 | sungai)
## Data: cpue_data
##
## AIC BIC logLik -2*log(L) df.resid
## 53365.6 53479.3 -26666.8 53333.6 8969
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.7490 -0.4401 -0.1448 0.2091 20.9768
##
## Random effects:
## Groups Name Variance Std.Dev.
## nama_nelayan (Intercept) 8.640 2.9394
## sungai (Intercept) 0.128 0.3577
## Residual 20.503 4.5280
## Number of obs: 8985, groups: nama_nelayan, 546; sungai, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.07580 0.41360 7.437
## seasonbreeding 0.15041 0.28860 0.521
## seasonnight_tides -0.32120 0.29276 -1.097
## seasonafternoon_tides -0.71312 0.45746 -1.559
## year2022 0.61891 0.32325 1.915
## year2023 1.77824 0.31984 5.560
## year2024 1.75428 0.36801 4.767
## seasonbreeding:year2022 0.01697 0.40344 0.042
## seasonnight_tides:year2022 1.11368 0.36516 3.050
## seasonafternoon_tides:year2022 0.98532 0.50805 1.939
## seasonbreeding:year2023 -0.20664 0.36454 -0.567
## seasonnight_tides:year2023 0.38665 0.36610 1.056
## seasonafternoon_tides:year2023 0.49619 0.50522 0.982
## fit warnings:
## fixed-effect model matrix is rank deficient so dropping 3 columns / coefficients
This section visualizes the results of the best mixed-effects model selected in Chapter 11, including fixed effect estimates and interaction trends over time.
# 8. Plot fixed effects with estimates and confidence intervals
plot_model(best_model,
type = "est",
show.values = TRUE,
value.offset = 0.3,
title = "Fixed Effects on CPUE",
vline.color = "red",
dot.size = 3) +
theme_minimal()
# Sorted fixed effects plot
plot_model(best_model,
type = "est",
show.values = TRUE,
value.offset = 0.3,
sort.est = TRUE,
title = "Effect Sizes (Sorted)",
vline.color = "red",
dot.size = 3) +
theme_minimal()
# 9. Continuous year trend model (optional)
cpue_data <- cpue_data %>%
mutate(year_num = as.numeric(as.character(year)))
trend_model_continuous <- lmer(total_weight_kg ~ year_num + (1 | nama_nelayan), data = cpue_data)
# 10. Predict and plot marginal effects of Year × Season interaction
pred_interaction <- ggpredict(best_model, terms = c("year", "season"))
plot(pred_interaction) +
labs(
title = "Marginal Effect of Year and Season on CPUE",
x = "Year",
y = "Predicted CPUE (kg/trip)",
color = "Season"
) +
theme_minimal()
This report synthesizes findings from two complementary datasets—experimental crab measurements and routine fisheries-dependent CPUE monitoring—offering a holistic view of mud crab population dynamics and harvest outcomes in Kubu Raya.
Linear models (LMs) and linear mixed models (LMMs) were used to test whether crab size (weight, length, and width) varied between treatment and control sites. Model selection using AIC values consistently favored the simpler linear models without random effects: - Weight: LM AIC = 497.7 vs. LMM AIC = 499.6 - Length: LM AIC = -135.7 vs. LMM AIC = -133.7 - Width: LM AIC = -137.6 vs. LMM AIC = -136.3
Effect sizes from LMs showed clear treatment impacts: - Weight: Treatment increased mean crab weight by 15.6% (SE = 0.06, p < 0.01) - Length: Treatment increased carapace length by 10.4% (SE = 0.04, p < 0.01) - Width: No significant difference; confidence intervals overlapped zero
These findings suggest that treatment zones are supporting larger crabs in terms of weight and length, likely due to protection or reduced harvesting pressure.
CPUE (kg/trip) trends were analyzed from over 3 years of routine landing data. After cleaning for outliers, we modeled seasonal and inter-annual variation using linear mixed models with random effects for fisher and/or river.
The best-performing model based on AIC was: - Model
12:
CPUE ~ season * year + (1 | nama_nelayan) + (1 | sungai) -
AIC = 12,743.3
This model captured strong interaction effects between year and season: - High CPUE during the “breeding” and “night tide” seasons in 2023–2024 - Random effects indicate significant variation between individual fishers and river locations
Effect plots showed that predicted CPUE rose most notably in 2023 and 2024, with peak productivity occurring in specific seasonal windows, supporting the idea of biologically informed harvest timing.
Together, these datasets indicate that: - Treatment areas improve crab size, especially weight and length, based on experimental sampling - Population shows recovery or increase since implementation of the CBNTZs in 2021 explore seasonal variations deeper - Village-level and fisher-level heterogeneity must be considered in future models or interventions
These results support the continued use of no-take zones and adaptive harvest strategies informed by seasonal patterns and localized ecological knowledge.