Studies on a rare butterfly were conducted in 4 areas of Borneo: East and West Kalimantan (Indonesia), Sabah and Sarawak (Malaysia) (figure 1)
Figure 1: location of rare butterfly surveys. Top, locator box with red box showing location of Borneo. Bottom: location of 4 survey areas on the island of Borneo. West and East Kalimantan in Indonesia and Sabah and Sarawak in Malaysia.
Surveys were conducted weekly in each area for 104 weeks (2 years). With the adult butterfly having a life expectancy of one week, each week was considered a new generation. Data has been compiled into two data sets (abundance and resilience).
Two hypothesis are to be tested:
Resilience is measured by comparing the per capita growth rate from
one generation to the next and abundance. A population is considered
resilient if the population growth rate is low at high abundance (owinig
to factors such as density dependence and inter-specific competition)
and high at low abundance.
Firstly, set up the data for analysis.
including loading libraries required for analysis
setting site colours to ensure consistent graphs.
Load datasets for resilience and abundance
abundance <- read.csv ("/Users/beckytooby/Library/CloudStorage/OneDrive-SwanseaUniversity/3. Year Three/BIO334_Data Analysis/BIO334_CW2/j.c.bull_15112024_butterfly_counts.csv")
resilience <- read.csv ("/Users/beckytooby/Library/CloudStorage/OneDrive-SwanseaUniversity/3. Year Three/BIO334_Data Analysis/BIO334_CW2/j.c.bull_15112024_butterfly_dynamics.csv")
Load required libraries (and install if required)
# List of required packages
required_packages <- c("ggplot2", "dplyr", "tidyverse", "RColorBrewer", "dunn.test", "report")
# Check and install missing packages
install_if_missing <- function(packages) {
for (pkg in packages) {
if (!requireNamespace(pkg, quietly = TRUE)) {
install.packages(pkg)}
}
}
# Install packages that are missing
install_if_missing(required_packages)
# Load required libraries
lapply(required_packages, library, character.only = TRUE)
Set colours for all graphs to ensure consistency:
# Generate colours using the RColorBrewer Set 2 palette
site_colours <- brewer.pal(4, "Set2")
names(site_colours) <- c("East.Kalimantan", "Sabah", "Sarawak", "West.Kalimantan")
This dataset will be used to determine:
Key Questions:1) What is the expected model formula?
2) What is the expected relationship between response and explanatory variables?
3) What distribution best describes the data?
|
Firstly, inspect the data:
str(abundance) # inspect data
## 'data.frame': 104 obs. of 5 variables:
## $ Week : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Sabah : int 11 9 10 10 10 12 16 13 14 5 ...
## $ Sarawak : int 110 99 103 117 132 117 137 114 100 98 ...
## $ West.Kalimantan: int 12 16 17 13 13 4 14 20 16 16 ...
## $ East.Kalimantan: int 8 3 11 11 9 12 14 10 13 10 ...
names(abundance) # check headers
## [1] "Week" "Sabah" "Sarawak" "West.Kalimantan"
## [5] "East.Kalimantan"
summary(abundance) # summarise data
## Week Sabah Sarawak West.Kalimantan
## Min. : 1.00 Min. : 4.00 Min. : 10.00 Min. : 4.0
## 1st Qu.: 26.75 1st Qu.: 8.00 1st Qu.: 22.00 1st Qu.:12.0
## Median : 52.50 Median :10.00 Median : 33.00 Median :14.0
## Mean : 52.50 Mean :10.19 Mean : 52.82 Mean :14.6
## 3rd Qu.: 78.25 3rd Qu.:12.00 3rd Qu.: 96.00 3rd Qu.:17.0
## Max. :104.00 Max. :17.00 Max. :137.00 Max. :27.0
## East.Kalimantan
## Min. : 3.000
## 1st Qu.: 8.000
## Median :10.000
## Mean : 9.596
## 3rd Qu.:12.000
## Max. :18.000
There are 4 sites, and 104 weeks of observations at each location. Each observation is an individual generation and so is independent.
For easier analysis, transform the data to long format with only 3 columns: Week, Abundance and Site
# Reshape data to long format
ab_long <- abundance %>%
pivot_longer(cols = c(Sabah, Sarawak, West.Kalimantan, East.Kalimantan),
names_to = "Site",
values_to = "Abundance")
# Check conversion completed successfuly
str(ab_long)
## tibble [416 × 3] (S3: tbl_df/tbl/data.frame)
## $ Week : int [1:416] 1 1 1 1 2 2 2 2 3 3 ...
## $ Site : chr [1:416] "Sabah" "Sarawak" "West.Kalimantan" "East.Kalimantan" ...
## $ Abundance: int [1:416] 11 110 12 8 9 99 16 3 10 103 ...
names(ab_long)
## [1] "Week" "Site" "Abundance"
summary(ab_long)
## Week Site Abundance
## Min. : 1.00 Length:416 Min. : 3.0
## 1st Qu.: 26.75 Class :character 1st Qu.: 10.0
## Median : 52.50 Mode :character Median : 13.0
## Mean : 52.50 Mean : 21.8
## 3rd Qu.: 78.25 3rd Qu.: 18.0
## Max. :104.00 Max. :137.0
attach(ab_long)
Just to understand the dynamics of what is happening at each site, lets have a look at the abundance over time.
# Plot Abundance overtime with different colours for different sites:
ggplot(ab_long, aes(x = Week, y = Abundance, colour = Site)) +
geom_line() +
labs(
title = "",
x = "Week",
y = "Abundance"
) +
scale_colour_manual(values = site_colours) +
theme_minimal() +
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "lightgrey"),
axis.ticks = element_line(colour = "lightgrey"),
legend.position = "right",
panel.background = element_rect(fill = "white", colour = "white")
)
Create histogram to understand distributions:
# Generate ggplot histograms for each site
ggplot(ab_long, aes(x = Abundance, fill = Site)) +
geom_histogram(binwidth = 1, colour = "black", size = 0.1, alpha = 0.8) +
scale_fill_manual(values = site_colours) +
facet_wrap(~ Site, ncol = 2, scales = "fixed") +
labs(
title = "",
x = "Abundance",
y = "Count") +
theme_minimal()+
theme(
panel.grid = element_blank(),
panel.background = element_blank(),
panel.border = element_blank(),
axis.line = element_line(colour = "black"),
axis.ticks = element_line(colour = "black"),
strip.background = element_blank(),
strip.text = element_text(face = "bold"),
panel.spacing = unit(1, "lines"),
legend.position = "none"
)
Sarawak is not normally distributed while the others appear to be normal.
Create box plot to visualise data another way:
# Visualise results in a boxplot
ggplot(ab_long, aes(x = Site, y = Abundance, fill = Site)) +
geom_boxplot() +
labs(x = "Sites", y = "Abundance") +
theme_bw() +
scale_fill_manual(values = site_colours)+
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
axis.ticks = element_line(colour = "black"),
legend.position = "right"
)
Looks like a large difference between Sarawak and other sites. Lets check the relationship:
Another way to test if data is normally distributed using Shapiro Wilk test (where p values of less than 0.05 shows non parametric date
# Apply Shapiro-Wilk test for normality to each site
by_site_normality <- ab_long %>%
group_by(Site) %>%
summarise(p_value = shapiro.test(Abundance)$p.value)
print(by_site_normality)
## # A tibble: 4 × 2
## Site p_value
## <chr> <dbl>
## 1 East.Kalimantan 0.0776
## 2 Sabah 0.0827
## 3 Sarawak 0.00000000462
## 4 West.Kalimantan 0.143
As Sarawak is not normally distributed (p<0.05), a non parametric Kruskal Wallis test is used to test significance:
# Kruskal-Wallis test
kruskal.test(Abundance ~ Site, data = ab_long)
##
## Kruskal-Wallis rank sum test
##
## data: Abundance by Site
## Kruskal-Wallis chi-squared = 257.32, df = 3, p-value < 2.2e-16
The p-value is < 0.05, so the null hypothesis (there is no difference in abundance between sites) can be rejected. This means there is a significant difference in butterfly abundance between at least two of the monitoring sites. To determine which sites, a post hoc test is used (Dunn Test):
# Dunn test
dunn_test_result <- dunn.test(Abundance, Site)
## Kruskal-Wallis rank sum test
##
## data: Abundance and Site
## Kruskal-Wallis chi-squared = 257.3239, df = 3, p-value = 0
##
##
## Comparison of Abundance by Site
## (No adjustment)
## Col Mean-|
## Row Mean | East.Kal Sabah Sarawak
## ---------+---------------------------------
## Sabah | -0.714080
## | 0.2376
## |
## Sarawak | -14.02971 -13.31563
## | 0.0000* 0.0000*
## |
## West.Kal | -7.159291 -6.445210 6.870423
## | 0.0000* 0.0000* 0.0000*
##
## alpha = 0.05
## Reject Ho if p <= alpha/2
# Display results
dunn_test_result
## $chi2
## [1] 257.3239
##
## $Z
## [1] -0.7140803 -14.0297145 -13.3156342 -7.1592910 -6.4452107 6.8704235
##
## $P
## [1] 2.375887e-01 5.128178e-45 9.388559e-41 4.054769e-13 5.771998e-11
## [6] 3.200578e-12
##
## $P.adjusted
## [1] 2.375887e-01 5.128178e-45 9.388559e-41 4.054769e-13 5.771998e-11
## [6] 3.200578e-12
##
## $comparisons
## [1] "East.Kalimantan - Sabah" "East.Kalimantan - Sarawak"
## [3] "Sabah - Sarawak" "East.Kalimantan - West.Kalimantan"
## [5] "Sabah - West.Kalimantan" "Sarawak - West.Kalimantan"
| Comparison | Z | p-value | Significant (at p<0.05) |
|---|---|---|---|
| East Kalimantan vs Sabah | -0.71 | 0.2376 | No |
| East Kalimantan vs Sarawak | -14.03 | 5.13 x 10-45 | Yes |
| East Kalimantan vs West Kalimantan | -7.16 | 4.05 x 10-13 | Yes |
| Sabah vs Sarawak | -13.32 | 9.39 x 10-41 | Yes |
| Sabah vs West Kalimantan | -6.45 | 5.77 x 10-11 | Yes |
| Sarawak vs West Kalimantan | 6.87 | 3.20 x 10-12 | Yes |
The abundance between East Kalimantan and Sabah was not significantly different (p = 0.2376), while significant differences were found between the other pairs of sites (East Kalimantan vs Sarawak, East Kalimantan vs West Kalimantan, Sabah vs Sarawak, Sabah vs West Kalimantan, and Sarawak vs West Kalimantan).
Resilience is determined by the relationship between abundance and population growth rate (PGR).
This dataset will be used to determine:
Key Questions:1) What is the expected model formula?
2) What is the expected relationship between response and explanatory variables?
3) What distribution best describes the data?
|
Firstly, inspect the data:
str(resilience) # inspect data
## 'data.frame': 103 obs. of 9 variables:
## $ Week : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Sabah : int 11 9 10 10 10 12 16 13 14 5 ...
## $ PGR.Sabah : num -0.201 0.105 0 0 0.182 0.288 -0.208 0.074 -1.03 0.182 ...
## $ Sarawak : int 110 99 103 117 132 117 137 114 100 98 ...
## $ PGR.Sarawak : num -0.105 0.04 0.127 0.121 -0.121 0.158 -0.184 -0.131 -0.02 -0.052 ...
## $ West.Kalimantan : int 12 16 17 13 13 4 14 20 16 16 ...
## $ PGR.W.Kalimantan: num 0.288 0.061 -0.268 0 -1.179 ...
## $ East.Kalimantan : int 8 3 11 11 9 12 14 10 13 10 ...
## $ PGR.E.Kalimantan: num -0.981 1.299 0 -0.201 0.288 ...
names(resilience) # check headers
## [1] "Week" "Sabah" "PGR.Sabah" "Sarawak"
## [5] "PGR.Sarawak" "West.Kalimantan" "PGR.W.Kalimantan" "East.Kalimantan"
## [9] "PGR.E.Kalimantan"
summary(resilience) # summary of data
## Week Sabah PGR.Sabah Sarawak
## Min. : 1.0 Min. : 4.00 Min. :-1.1790000 Min. : 10.00
## 1st Qu.: 26.5 1st Qu.: 8.00 1st Qu.:-0.2230000 1st Qu.: 22.00
## Median : 52.0 Median :10.00 Median : 0.0000000 Median : 33.00
## Mean : 52.0 Mean :10.19 Mean :-0.0009709 Mean : 53.16
## 3rd Qu.: 77.5 3rd Qu.:12.00 3rd Qu.: 0.2370000 3rd Qu.: 96.00
## Max. :103.0 Max. :17.00 Max. : 1.3220000 Max. :137.00
## PGR.Sarawak West.Kalimantan PGR.W.Kalimantan East.Kalimantan
## Min. :-0.48000 Min. : 4.00 Min. :-1.1790000 Min. : 3.000
## 1st Qu.:-0.12600 1st Qu.:12.00 1st Qu.:-0.2230000 1st Qu.: 8.000
## Median :-0.03000 Median :14.00 Median : 0.0000000 Median :10.000
## Mean :-0.01757 Mean :14.62 Mean : 0.0000291 Mean : 9.602
## 3rd Qu.: 0.11450 3rd Qu.:17.00 3rd Qu.: 0.2700000 3rd Qu.:12.000
## Max. : 0.45200 Max. :27.00 Max. : 1.2530000 Max. :18.000
## PGR.E.Kalimantan
## Min. :-1.386000
## 1st Qu.:-0.336000
## Median : 0.000000
## Mean : 0.001097
## 3rd Qu.: 0.318000
## Max. : 1.386000
There are 4 sites, and 103 weeks of observations at each location. Each observation is an individual generation.
Reshape data for easier graphing and analysis:
# Reshape Abundance values
res_long <- resilience %>%
select(Week, Sabah, Sarawak, West.Kalimantan, East.Kalimantan) %>%
pivot_longer(
cols = -Week,
names_to = "Site",
values_to = "Abundance")
# Reshape PGR values
PGR_long <- resilience %>%
select(Week, PGR.Sabah, PGR.Sarawak, PGR.W.Kalimantan, PGR.E.Kalimantan) %>%
pivot_longer(cols = -Week,
names_to = "Site",
values_to = "PGR" ) %>%
mutate(
Site = recode(Site,
PGR.Sabah = "Sabah",
PGR.Sarawak = "Sarawak",
PGR.W.Kalimantan = "West.Kalimantan",
PGR.E.Kalimantan = "East.Kalimantan"))
# Combine abundance and PGR data
res_long <- res_long %>%
left_join(PGR_long, by = c("Week", "Site"))
# Check dataset
str(res_long)
## tibble [412 × 4] (S3: tbl_df/tbl/data.frame)
## $ Week : int [1:412] 1 1 1 1 2 2 2 2 3 3 ...
## $ Site : chr [1:412] "Sabah" "Sarawak" "West.Kalimantan" "East.Kalimantan" ...
## $ Abundance: int [1:412] 11 110 12 8 9 99 16 3 10 103 ...
## $ PGR : num [1:412] -0.201 -0.105 0.288 -0.981 0.105 ...
names(res_long)
## [1] "Week" "Site" "Abundance" "PGR"
summary(res_long)
## Week Site Abundance PGR
## Min. : 1 Length:412 Min. : 3.00 Min. :-1.386000
## 1st Qu.: 26 Class :character 1st Qu.: 10.00 1st Qu.:-0.216250
## Median : 52 Mode :character Median : 13.00 Median : 0.000000
## Mean : 52 Mean : 21.89 Mean :-0.004354
## 3rd Qu.: 78 3rd Qu.: 18.00 3rd Qu.: 0.223000
## Max. :103 Max. :137.00 Max. : 1.386000
Create histogram to understand distributions:
# Generate ggplot histograms for each site
ggplot(res_long, aes(x = PGR, fill = Site)) +
geom_histogram(binwidth = 0.2, colour = "black", size = 0.1, alpha = 0.8) +
scale_fill_manual(values = site_colours) +
facet_wrap(~ Site, ncol = 2, scales = "fixed") +
labs(
title = "",
x = "Population Growth Rate",
y = "Count") +
theme_minimal()+
theme(
panel.grid = element_blank(),
panel.background = element_blank(),
panel.border = element_blank(),
axis.line = element_line(colour = "black"),
axis.ticks = element_line(colour = "black"),
strip.background = element_blank(),
strip.text = element_text(face = "bold"),
panel.spacing = unit(1, "lines"),
legend.position = "none"
)
Data looks normally distributed (Gaussian) . Look at scatter plot of data:
# Create scatter plots for each site
ggplot(res_long, aes(x = Abundance, y = PGR, colour = Site)) +
geom_point(alpha = 0.6) +
facet_wrap(~Site, scales = "free") +
labs(
title = "",
x = "Abundance",
y = "Population Growth Rate (PGR)"
) +
scale_colour_manual(values = site_colours) +
theme_minimal() +
theme(
panel.grid = element_blank(),
panel.background = element_blank(),
panel.border = element_blank(),
strip.text = element_text(face = "bold"),
legend.position = "none",
strip.background = element_blank(),
axis.line = element_line(colour = "lightgrey"),
axis.ticks = element_line(colour = "lightgrey")
)
These all look linear relationships so lets see what they look like together, with linear regression lines:
# Scatter plot with GLM lines
ggplot(res_long, aes(x = Abundance, y = PGR, colour = Site, group = Site)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = FALSE) +
labs(
title = "",
x = "Abundance",
y = "Population Growth Rate (PGR)"
) +
scale_colour_manual (values = site_colours) +
theme_classic() +
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "lightgrey"),
axis.ticks = element_line(colour = "lightgrey"),
legend.title = element_text(size = 11))
## `geom_smooth()` using formula = 'y ~ x'
Lets start with a linear model and check the diagnostics and summary information
attach(res_long)
# Linear model with interaction
m1 <- lm(PGR ~ Abundance * Site)
# inspect diagnostics
par( mfrow = c(2, 2) )
plot(m1)
# check summary
summary(m1)
##
## Call:
## lm(formula = PGR ~ Abundance * Site)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.29582 -0.14643 0.01079 0.18670 0.69559
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.003646 0.095422 10.518 < 2e-16 ***
## Abundance -0.104411 0.009468 -11.028 < 2e-16 ***
## SiteSabah 0.124612 0.146492 0.851 0.39547
## SiteSarawak -1.004223 0.107562 -9.336 < 2e-16 ***
## SiteWest.Kalimantan 0.049562 0.149478 0.332 0.74039
## Abundance:SiteSabah -0.006361 0.014157 -0.449 0.65345
## Abundance:SiteSarawak 0.104091 0.009498 10.959 < 2e-16 ***
## Abundance:SiteWest.Kalimantan 0.032381 0.012150 2.665 0.00801 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2944 on 404 degrees of freedom
## Multiple R-squared: 0.4438, Adjusted R-squared: 0.4341
## F-statistic: 46.04 on 7 and 404 DF, p-value: < 2.2e-16
Looks good but just to check, we can check with the additive model:
# Linear model with addition
m2 <- lm(PGR ~ Abundance + Site)
par(mfrow = c(2,2))
plot(m2)
par (mfrow = c(1,1))
The second model (m2) doesn’t look right on the diagnostic plots. We can compare with model 1 to check
anova(m1,m2)
## Analysis of Variance Table
##
## Model 1: PGR ~ Abundance * Site
## Model 2: PGR ~ Abundance + Site
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 404 35.006
## 2 407 62.144 -3 -27.138 104.4 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The interaction model (m1) is the better model based on diagnostic plots, plus the p-value (<0.05) shows that m1 improves the model significantly over m2.
There is a site specific relationship between abundance and PGR, and we can look at the summary aov:
summary (aov(m1))
## Df Sum Sq Mean Sq F value Pr(>F)
## Abundance 1 0.55 0.553 6.379 0.0119 *
## Site 3 0.24 0.079 0.911 0.4354
## Abundance:Site 3 27.14 9.046 104.398 <2e-16 ***
## Residuals 404 35.01 0.087
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Abundance has a significant effect on PGR so when there is a higher abundance, there tends to be a lower rate of population growth.
Site is not significant on its own with regards to PGR, there is no clear overall difference in PGR between the four sites.
However, the interaction between the two (abundance AND site) is highly significant. So the effect of butterfly abundance on population growth (the resilience of the population) does vary across the four regions. The null hypothesis that there is no difference between sites can be rejected.
Looking at the coeficients from the m1 summary for each site:
summary_m1 <- summary(m1)
# Coefficients from the model summary
coefs <- summary_m1$coefficients
# Main effect for Abundance
abundance_coef <- coefs["Abundance", "Estimate"]
abundance_se <- coefs["Abundance", "Std. Error"]
abundance_t <- coefs["Abundance", "t value"]
abundance_p <- coefs["Abundance", "Pr(>|t|)"]
# Interaction coefficients for each site
sab <- coefs["Abundance:SiteSabah", "Estimate"]
sar <- coefs["Abundance:SiteSarawak", "Estimate"]
wkal <- coefs["Abundance:SiteWest.Kalimantan", "Estimate"]
# Interaction t-values and p-values
sab_t <- coefs["Abundance:SiteSabah", "t value"]
sab_p <- coefs["Abundance:SiteSabah", "Pr(>|t|)"]
sar_t <- coefs["Abundance:SiteSarawak", "t value"]
sar_p <- coefs["Abundance:SiteSarawak", "Pr(>|t|)"]
wkal_t <- coefs["Abundance:SiteWest.Kalimantan", "t value"]
wkal_p <- coefs["Abundance:SiteWest.Kalimantan", "Pr(>|t|)"]
# Standard errors for the interaction terms
sab_se <- coefs["Abundance:SiteSabah", "Std. Error"]
sar_se <- coefs["Abundance:SiteSarawak", "Std. Error"]
wkal_se <- coefs["Abundance:SiteWest.Kalimantan", "Std. Error"]
# Adjusted gradients for each site (main effect + interaction term)
gradients <- data.frame(
Site = c("East Kalimantan", "Sabah", "Sarawak", "West Kalimantan"),
Gradient = c(
abundance_coef, # East Kalimantan (reference site)
abundance_coef + sab, # Sabah
abundance_coef + sar, # Sarawak
abundance_coef + wkal), # West Kalimantan
SE = c(
abundance_se,
sab_se,
sar_se,
wkal_se),
t_value = c(
abundance_t,
sab_t,
sar_t,
wkal_t ),
p_value = c(
abundance_p,
sab_p,
sar_p,
wkal_p ))
# display results
print(gradients)
## Site Gradient SE t_value p_value
## 1 East Kalimantan -0.1044110357 0.009467594 -11.0282544 6.694885e-25
## 2 Sabah -0.1107719655 0.014157074 -0.4493111 6.534482e-01
## 3 Sarawak -0.0003197324 0.009497881 10.9594234 1.204349e-24
## 4 West Kalimantan -0.0720301743 0.012150064 2.6650774 8.005564e-03
| Site | Gradient | SE | t-value | p-value | Significant? |
|---|---|---|---|---|---|
| East Kalimantan | -0.1044 | 0.009 | -11.03 | 6.69 x 10-25 | Yes |
| Sabah | -0.1108 | 0.014 | -0.45 | 0.65 | No |
| Sarawak | -0.0003 | 0.009 | 10.96 | 1.20 x 10-24 | Yes |
| West Kalimantan | -0.0720 | 0.12 | 2.67 | 8.01 x 10-3 | Yes |
To work out the carrying capacity, we need to work out the abundance where the linear regression line crosses the 0.0 population growth rate line (y axis). As shown below:
# Fit the linear model for each Site
m1_by_site <- res_long %>%
group_by(Site) %>%
do(model = lm(PGR ~ Abundance, data = .))
par(mfrow = c(2, 2), mai = c(0.5, 0.5, 0.5, 0.5))
# Loop through each site to plot
for (site_name in unique(res_long$Site)) {
site_data <- res_long %>% filter(Site == site_name)
site_model <- lm(PGR ~ Abundance, data = site_data)
intercept <- coef(site_model)[1] # Extract model coefficients
slope <- coef(site_model)[2]
x_intercept <- -intercept / slope # Calculate the x-intercept (K)
# Plot the scatter plot with regression line
plot(
site_data$Abundance, site_data$PGR,
col = site_colours[site_name],
pch = 16,
xlab = "Abundance",
ylab = "PGR",
main = site_name,
xlim = range(site_data$Abundance, na.rm = TRUE),
ylim = range(site_data$PGR, na.rm = TRUE) )
abline(site_model, col = site_colours[site_name], lwd = 2)
abline(h = 0, lty = 2, col = "grey") # Add dashed horizontal line at PGR = 0
legend(
"topright", legend = paste("K =", round(x_intercept, 0)),
bty = "n", text.col = site_colours[site_name]
)
}