library(tidyverse)
library(tidymodels)
coral_raw <- read_csv("merged_cyclone_coral_v3.csv")
We want to test whether coral cover actually differs before and after a cyclone. Because reefs vary in their pre-cyclone coral cover, I calculated the change in coral cover for each reef by subtracting the pre-cyclone value from the post-cyclone value. A negative change means decline, whilst a positive change means recovery. I also ran t-tests on dead coral cover to see whether the proportion of dead coral was higher after a cyclone.
For LIVE CORAL:
H0: mu = 0 (no change in live coral cover)
H1: mu < 0 (decrease in live coral cover after a cyclone)
For DEAD CORAL:
H0: mu = 0 (no change in dead coral cover)
H1: mu > 0 (increase in dead cover after a cyclone)
Note that the initial t-tests were run on the entire dataset, which violates the independence assumption because neighbouring reefs affect each other.
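Because the change is defined per reef, a one-sample t-test on the change is the same test as a paired t-test on the post and pre values. The sketch below illustrates this with made-up cover values for six hypothetical reefs; `pre`, `post` and `change` are illustrative names, not columns from the real dataset.

```r
# Hypothetical pre/post live coral cover for six reefs (NOT real data).
pre  <- c(32.1, 45.0, 28.4, 51.2, 39.9, 22.7)
post <- c(30.5, 47.2, 20.1, 49.8, 41.0, 18.3)

# Same sign convention as the analysis: negative = decline, positive = recovery.
change <- post - pre

# One-sample test on the change, H1: mu < 0.
one_sample <- t.test(change, mu = 0, alternative = "less")

# Paired test on the raw values; t.test() uses x - y, i.e. post - pre.
paired <- t.test(post, pre, paired = TRUE, alternative = "less")

# Both approaches give the same t statistic and p-value.
all.equal(unname(one_sample$statistic), unname(paired$statistic))
all.equal(one_sample$p.value, paired$p.value)
```

Neither form of the test addresses the spatial dependence between neighbouring reefs noted above.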
# Simplify the dataset and select the relevant columns. Calculate the change in live and dead coral by subtracting the pre data from the post data.
coral_reduced <- coral_raw |> drop_na() |> dplyr::select(
-REEF_ID, -SEASON, -Longitude, -Latitude, -Start_Date, -End_Date, -Pre_SAMPLE_DATE, -Post_SAMPLE_DATE
) |> mutate(
Change_LIVE_CORAL = Post_MEAN_LIVE_CORAL - Pre_MEAN_LIVE_CORAL,
Change_DEAD_CORAL = Post_MEAN_DEAD_CORAL - Pre_MEAN_DEAD_CORAL
)
Surprisingly, both the mean and the median change in live coral cover are POSITIVE, suggesting an increase in coral cover after a cyclone, so the one-sided t-test for a decrease is not very informative. Looking into the dataset, I noticed that many of the positive changes occur when the post-cyclone survey took place a long time after the cyclone.
paste0("Mean LIVE coral cover: ", round(mean(coral_reduced$Change_LIVE_CORAL), 3))
[1] "Mean LIVE coral cover: 2.526"
paste0("Median LIVE coral cover: ", round(median(coral_reduced$Change_LIVE_CORAL), 3))
[1] "Median LIVE coral cover: 2.893"
t.test(coral_reduced$Change_LIVE_CORAL, mu = 0, alternative = 'less')
One Sample t-test
data: coral_reduced$Change_LIVE_CORAL
t = 1.9896, df = 110, p-value = 0.9754
alternative hypothesis: true mean is less than 0
95 percent confidence interval:
-Inf 4.631261
sample estimates:
mean of x
2.525586
t.test(coral_reduced$Change_DEAD_CORAL, mu = 0, alternative = 'greater')
One Sample t-test
data: coral_reduced$Change_DEAD_CORAL
t = -0.23433, df = 110, p-value = 0.5924
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
-0.1199466 Inf
sample estimates:
mean of x
-0.01484685
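Neither test rejects the null hypothesis (p = 0.975 for live coral, p = 0.592 for dead coral), so on the full dataset there is no evidence of a decrease in live coral cover or an increase in dead coral cover between the pre-cyclone and post-cyclone surveys.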
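The one-sided p-value for live coral can be reproduced from the reported t statistic and degrees of freedom. The sketch below copies `t = 1.9896` and `df = 110` from the output above; the upper-tail value is an addition for comparison only, since that alternative was not one of the stated hypotheses.

```r
# Values copied from the live-coral t-test output on the full dataset.
t_stat <- 1.9896
df     <- 110

# H1: mu < 0 (the test that was actually run): lower-tail probability.
p_less <- pt(t_stat, df)

# H1: mu > 0 (not tested in the analysis): upper-tail probability.
p_greater <- pt(t_stat, df, lower.tail = FALSE)

round(c(less = p_less, greater = p_greater), 4)
```

The two tails sum to 1, so a p-value close to 1 for the "less" alternative simply reflects that the observed mean change points in the opposite direction to the one being tested.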
I removed all rows where the post-cyclone survey happened more than 1 year after the cyclone and re-ran the t-tests.
coral_filtered_post_survey <- coral_reduced |> filter(
Days_cyclone_end_to_post <= 365
#, Days_pre_to_cyclone_start <= 365
)
paste0("Mean LIVE coral cover: ", round(mean(coral_filtered_post_survey$Change_LIVE_CORAL), 3))
[1] "Mean LIVE coral cover: -0.606"
t.test(coral_filtered_post_survey$Change_LIVE_CORAL, mu = 0, alternative = 'less')
One Sample t-test
data: coral_filtered_post_survey$Change_LIVE_CORAL
t = -0.43685, df = 65, p-value = 0.3318
alternative hypothesis: true mean is less than 0
95 percent confidence interval:
-Inf 1.709434
sample estimates:
mean of x
-0.6062424
t.test(coral_filtered_post_survey$Change_DEAD_CORAL, mu = 0, alternative = 'greater')
One Sample t-test
data: coral_filtered_post_survey$Change_DEAD_CORAL
t = 1.8424, df = 65, p-value = 0.03499
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
0.0135022 Inf
sample estimates:
mean of x
0.143197
After filtering, both p-values are smaller. The mean change in live coral cover is now negative (-0.606) but still not significant (p = 0.332), whereas the increase in dead coral cover is significant at the 5% level (p = 0.035). This makes sense: the closer the post-cyclone survey is to the cyclone, the more of the observed change can be attributed to the cyclone.
To think about it another way: if a cyclone occurred in January 2020 but the next survey was in November 2021, there is a gap of almost 2 years during which changes in coral cover are not necessarily due to the cyclone and its wave energy.
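The 365-day cut-off is a judgment call, so it may be worth checking how sensitive the result is to that choice. The sketch below uses simulated data in which the change drifts upward as the survey gap grows; only the column names `Days_cyclone_end_to_post` and `Change_LIVE_CORAL` come from the analysis, while `toy`, `cutoffs` and the simulated values are hypothetical.

```r
set.seed(1)

# Simulated data (NOT the real dataset): change drifts upward with the survey gap.
toy <- data.frame(Days_cyclone_end_to_post = sample(30:900, 60, replace = TRUE))
toy$Change_LIVE_CORAL <- -5 + 0.02 * toy$Days_cyclone_end_to_post + rnorm(60, sd = 8)

# Candidate cut-offs in days (hypothetical choices).
cutoffs <- c(180, 365, 730)

# Re-run the one-sided t-test for each cut-off and collect the results.
sensitivity <- do.call(rbind, lapply(cutoffs, function(k) {
  kept <- toy$Change_LIVE_CORAL[toy$Days_cyclone_end_to_post <= k]
  res  <- t.test(kept, mu = 0, alternative = "less")
  data.frame(cutoff_days = k,
             n           = length(kept),
             mean_change = unname(res$estimate),
             p_value     = res$p.value)
}))

sensitivity
```

Applied to the real data, a table like this would show whether the conclusion depends heavily on the exact cut-off chosen.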
I was interested to see whether individual cyclones produced similar changes in coral cover. I filtered the data to include only Cyclone Hamish, as I thought its path looked interesting. The t-test was no longer appropriate because of the small sample size, so I used the Wilcoxon signed-rank test (DATA2002). A p-value < 0.05 for the change in LIVE coral suggests that live coral cover declined significantly after this cyclone.
For Hamish, the Wilcoxon test (run here with alternative = 'less') also showed a significant decrease in DEAD coral cover. This makes less sense, unless the dead corals were swept away by the water?
coral_Hamish <- coral_filtered_post_survey |> filter(NAME == 'HAMISH')
paste0("Mean LIVE coral cover: ", round(mean(coral_Hamish$Change_LIVE_CORAL), 3))
[1] "Mean LIVE coral cover: -16.119"
paste0("Median LIVE coral cover: ", round(median(coral_Hamish$Change_LIVE_CORAL), 3))
[1] "Median LIVE coral cover: -16.767"
#qqnorm(coral_Hamish$Change_LIVE_CORAL)
wilcox.test(coral_Hamish$Change_LIVE_CORAL, mu = 0, alternative = 'less')
Wilcoxon signed rank exact test
data: coral_Hamish$Change_LIVE_CORAL
V = 1, p-value = 0.007813
alternative hypothesis: true location is less than 0
wilcox.test(coral_Hamish$Change_DEAD_CORAL, mu = 0, alternative = 'less')
Wilcoxon signed rank exact test
data: coral_Hamish$Change_DEAD_CORAL
V = 1, p-value = 0.007813
alternative hypothesis: true location is less than 0
I chose another cyclone, Oswald, to look into, and the results were slightly different: there was no significant decrease in LIVE coral cover, but DEAD coral cover increased significantly.
coral_Oswald <- coral_filtered_post_survey |> filter(NAME == 'OSWALD')
paste0("Mean LIVE coral cover: ", round(mean(coral_Oswald$Change_LIVE_CORAL), 3))
[1] "Mean LIVE coral cover: 2.694"
paste0("Median LIVE coral cover: ", round(median(coral_Oswald$Change_LIVE_CORAL), 3))
[1] "Median LIVE coral cover: 2.127"
wilcox.test(coral_Oswald$Change_LIVE_CORAL, mu = 0, alternative = 'less')
Wilcoxon signed rank exact test
data: coral_Oswald$Change_LIVE_CORAL
V = 33, p-value = 0.9883
alternative hypothesis: true location is less than 0
wilcox.test(coral_Oswald$Change_DEAD_CORAL, mu = 0, alternative = 'greater')
Wilcoxon signed rank exact test
data: coral_Oswald$Change_DEAD_CORAL
V = 36, p-value = 0.003906
alternative hypothesis: true location is greater than 0
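The exact p-values reported for Hamish and Oswald can be reproduced from the null distribution of the signed-rank statistic. The sketch below assumes n = 8 non-zero, untied differences per cyclone; this sample size is not stated in the output and is inferred from the p-values being multiples of 1/256 = 1/2^8.

```r
# Assumed sample size per cyclone (inferred from the p-values, not stated in the output).
n <- 8

# Hamish, alternative = 'less', V = 1: P(V <= 1) under H0.
p_hamish_less <- psignrank(1, n)

# Oswald live coral, alternative = 'less', V = 33: P(V <= 33) under H0.
p_oswald_less <- psignrank(33, n)

# Oswald dead coral, alternative = 'greater', V = 36: P(V >= 36) = P(V > 35) under H0.
p_oswald_greater <- psignrank(35, n, lower.tail = FALSE)

signif(c(p_hamish_less, p_oswald_less, p_oswald_greater), 4)
```

Under this assumption the smallest one-sided p-value the test can produce is 1/256 (about 0.0039), which is the value reported for Oswald's dead coral cover, where every reef changed in the same direction.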
These results suggest that the change in coral cover across the Capricorn Bunker differs from cyclone to cyclone.
I then performed a one-way ANOVA, testing the mean change in LIVE coral cover against cyclone name as the grouping factor. The result shows significant differences between cyclones (F = 7.67 on 13 and 52 degrees of freedom, p = 3.6e-08). This justifies looking into cyclone parameters for modelling and prediction.
cyclone_anova <- aov(Change_LIVE_CORAL ~ NAME, data = coral_filtered_post_survey)
summary(cyclone_anova)
            Df Sum Sq Mean Sq F value  Pr(>F)
NAME        13   5430   417.7    7.67 3.6e-08 ***
Residuals   52   2832    54.5
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
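The F statistic and p-value can be checked directly from the sums of squares in the table, and the same numbers give a rough effect size. The sketch below copies the rounded values from the ANOVA output; eta squared is not part of the original analysis and is added here only as an illustration.

```r
# Values copied (rounded) from the ANOVA table above.
ss_name  <- 5430
ss_resid <- 2832
df_name  <- 13
df_resid <- 52

# F is the ratio of the between-cyclone to the within-cyclone mean square.
f_value <- (ss_name / df_name) / (ss_resid / df_resid)

# Upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_value, df_name, df_resid, lower.tail = FALSE)

# Eta squared: share of total variation in change associated with cyclone name.
eta_sq <- ss_name / (ss_name + ss_resid)

c(F = round(f_value, 2), eta_sq = round(eta_sq, 3))
p_value
```

On these figures, cyclone name is associated with roughly two thirds of the variation in live coral change, although with 14 cyclones and only 66 surveys the individual groups are small.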