13 March 2025
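Create a project folder (Goose Analysis or similar).
Simplify the column headers in your spreadsheet, e.g. time spent vigilant (s) becomes t.vig_s.
Save the data as CSV (Comma delimited) (*.csv).
In RStudio: File > New Project... > then choose Existing Directory, find your folder (Goose Analysis) and select it.
File > New File > R Script, and save it (my_analysis.R or whatever).
Read in the data:
Geese <- read.csv(file = "my data.csv")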
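To see why short, simple column headers are worth the effort, here is a minimal sketch of what read.csv does to a header containing spaces and brackets. The file and its contents are made up for illustration; only the header text is borrowed from the step above.

```r
# Hypothetical example file: the values are invented.
csv_path <- tempfile(fileext = ".csv")
writeLines(c('"time spent vigilant (s)","flock size"',
             '12.5,4',
             '3.0,20'),
           csv_path)

toy <- read.csv(file = csv_path)

# By default read.csv turns headers into syntactically valid R names:
# spaces and brackets are replaced by dots.
names(toy)
str(toy)
```

The resulting names are valid but awkward to type, which is one reason to rename the columns before exporting.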
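To account for the fact that the individual video segments (observation times) vary somewhat in length, we express feeding and vigilance times as proportions of the clip length, and head-ups as a rate per minute: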
Geese$feed.tf <- Geese$time.feeding / Geese$clip.length
Geese$vig.tf <- Geese$time.vigilant / Geese$clip.length
Geese$hu.freq <- Geese$n.headups / Geese$clip.length * 60 # per minute
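A small sketch of what this standardisation achieves, using two made-up clips in which a goose behaves identically but is filmed for different lengths of time. It assumes clip.length and the time columns are in seconds, which is what the "* 60 # per minute" conversion implies.

```r
# Invented data: the second clip is twice as long, with twice as much
# of everything. Times are assumed to be in seconds.
toy <- data.frame(
  clip.length   = c(60, 120),
  time.feeding  = c(30, 60),
  time.vigilant = c(15, 30),
  n.headups     = c(5, 10)
)

toy$feed.tf <- toy$time.feeding / toy$clip.length    # proportion of clip spent feeding
toy$vig.tf  <- toy$time.vigilant / toy$clip.length   # proportion of clip spent vigilant
toy$hu.freq <- toy$n.headups / toy$clip.length * 60  # head-ups per minute

toy
```

The raw totals differ between the two rows, but the standardised columns are identical, so clips of different length become comparable.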
\(H_1:\) There is a correlation between time spent feeding and group size.
\(H_0:\) There is no correlation between time spent feeding and group size.
We’ll use package ggplot2. Simples:
library(ggplot2) # load the "ggplot2" plotting package
feeding <- ggplot(Geese, aes(x=flock.size, y=feed.tf)) # set up plot
feeding + geom_point() # make scatter plot
You may feel the urge to add a regression line, which is easy enough:
feeding + geom_point() + geom_smooth(method = "lm", se = FALSE, linetype = 2)
Look at the data: it is pretty clear that if there is a relationship between feeding time and flock size, it is not a linear one.
There is no reason to assume a priori that the relationship would be linear.
Fit a smooth (LOESS) regression using "loess" instead of "lm".
The span = 1 bit just sets how bendy the line is allowed to be. A lower span means a more wobbly line (try it!).
feeding + geom_point() + geom_smooth(method = "loess", span = 1, se = FALSE)
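To get a feel for span without the goose data, here is a minimal sketch that calls stats::loess directly (the function geom_smooth uses when method = "loess") on simulated data. The sine curve, noise level and span values are arbitrary choices for illustration.

```r
set.seed(1)
x <- seq(0, 10, length.out = 40)
y <- sin(x) + rnorm(40, sd = 0.1)   # made-up wiggly data

fit_wobbly <- loess(y ~ x, span = 0.3)  # small span: follows local wiggles
fit_stiff  <- loess(y ~ x, span = 1)    # large span: one broad curve

# Residual sum of squares: how closely each line hugs the points
rss <- c(wobbly = sum(residuals(fit_wobbly)^2),
         stiff  = sum(residuals(fit_stiff)^2))
rss
```

The low-span fit tracks the points much more closely. With only a handful of observations that closeness can simply mean the line is chasing noise, which is a reason to prefer a fairly stiff line for small data sets.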
We want to test for a correlation between feed.tf and flock.size.
By default, cor.test will carry out a Pearson’s correlation test. This assumes linearity!
So we ask for Spearman’s rank correlation instead:
cor.test( ~ feed.tf + flock.size, Geese, method="spearman")
##
##  Spearman's rank correlation rho
##
## data:  feed.tf and flock.size
## S = 297.62, p-value = 0.006139
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho
## 0.6352753
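Why does Spearman's test cope with a curved relationship? Spearman's rho is simply Pearson's correlation computed on the ranks of the data, so only the ordering of the values matters. A minimal sketch with simulated data (the saturating curve and noise level are invented for illustration):

```r
set.seed(42)
x <- runif(17, 1, 30)                        # made-up "flock sizes"
y <- 1 - exp(-x / 5) + rnorm(17, sd = 0.05)  # rises, then levels off

rho_spearman <- cor(x, y, method = "spearman")
rho_on_ranks <- cor(rank(x), rank(y), method = "pearson")

c(spearman = rho_spearman, pearson_on_ranks = rho_on_ranks)
```

The two numbers are identical. Any monotonic transformation of either variable leaves the ranks, and therefore rho, unchanged.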
\(H_1:\) There is a correlation between vigilance time and group size.
\(H_0:\) There is no correlation between vigilance time and group size.
vigil <- ggplot(Geese, aes(x=flock.size, y=vig.tf)) # set up plot
vigil + geom_point() # make scatter plot
Let’s add both linear and smooth regression lines:
vigil + geom_point() +
  geom_smooth(method = "lm", se = FALSE, linetype = 2) +
  geom_smooth(method = "loess", span = 2, se = FALSE)
cor.test( ~ vig.tf + flock.size, Geese, method="spearman")
##
##  Spearman's rank correlation rho
##
## data:  vig.tf and flock.size
## S = 1268.6, p-value = 0.02085
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho
## -0.5546251
\(H_1:\) There is a correlation between head-up frequency and group size.
\(H_0:\) There is no correlation between head-up frequency and group size.
Let’s add the regression lines straight away:
ggplot(Geese, aes(x=flock.size, y=hu.freq)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, linetype = 2) +
  geom_smooth(method = "loess", span = 1, se = FALSE)
If anything, there is a very weak trend for head-ups to be more frequent in larger groups. Maybe they check more often, but only very briefly?
cor.test( ~ hu.freq + flock.size, Geese, method="spearman")
##
##  Spearman's rank correlation rho
##
## data:  hu.freq and flock.size
## S = 621.37, p-value = 0.3566
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho
## 0.2385211
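If the S statistic in these outputs looks mysterious: it is tied to rho by S = (n^3 - n)(1 - rho) / 6. The sketch below checks this against the three outputs so far. The sample size n = 17 is an assumption, inferred from the Pearson test later in this handout, which reports df = 15 (that is, n - 2).

```r
n <- 17  # assumption: inferred from df = 15 in the later Pearson output

# rho and S values copied from the three Spearman outputs above
rho <- c(feeding = 0.6352753, vigilance = -0.5546251, headups = 0.2385211)
reported_S <- c(feeding = 297.62, vigilance = 1268.6, headups = 621.37)

S <- (n^3 - n) * (1 - rho) / 6
round(cbind(computed = S, reported = reported_S), 2)
```

A small S means the two rankings agree closely (rho near 1), while a large S means they run in opposite directions (rho near -1).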
We can easily calculate the mean bout length, because we know both the time spent vigilant and the head-up frequency:
Geese$mean.vbout.length <- with(Geese, time.vigilant / hu.freq)
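One thing worth checking here is the units. time.vigilant is a total (in seconds, assuming the columns described earlier), whereas hu.freq is a rate per minute, so their ratio equals the mean bout length multiplied by the clip length in minutes. The sketch below uses two invented clips to compare it with two alternatives that do not depend on clip length; these alternatives are suggestions, not the calculation used for the results reported in this handout, so the numbers would differ.

```r
# Invented data: identical bouts (10 head-ups, 30 s vigilant in total),
# but the second clip is twice as long.
toy <- data.frame(
  clip.length   = c(60, 120),
  time.vigilant = c(30, 30),
  n.headups     = c(10, 10)
)
toy$vig.tf  <- toy$time.vigilant / toy$clip.length
toy$hu.freq <- toy$n.headups / toy$clip.length * 60

as_in_text   <- with(toy, time.vigilant / hu.freq)    # depends on clip length
total_by_n   <- with(toy, time.vigilant / n.headups)  # seconds per bout
prop_by_freq <- with(toy, vig.tf / hu.freq * 60)      # same thing, via the rates

cbind(as_in_text, total_by_n, prop_by_freq)
```

If all clips are close to one minute long the three versions nearly coincide; the more clip lengths vary, the more the first one mixes bout length with observation time.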
ggplot(Geese, aes(x=flock.size, y=mean.vbout.length)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, linetype = 2) +
  geom_smooth(method = "loess", span = 1.5, se = FALSE)
cor.test( ~ mean.vbout.length + flock.size, Geese, method="spearman")
##
##  Spearman's rank correlation rho
##
## data:  mean.vbout.length and flock.size
## S = 1146.9, p-value = 0.1064
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho
## -0.4054838
Curiously, this doesn’t come out significant at \(\alpha = 0.05\) – even though we have good evidence that total time vigilant is higher in small groups, and we found no support for claiming that the frequency of vigilance bouts depends on group size.
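But we are power-limited: our \(N\) is low, and rank-based tests typically have less power than parametric tests when the parametric assumptions hold. One option is to look at the relationship on a log-log scale, where it may be closer to linear: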
ggplot(Geese, aes(x=log(flock.size), y=log(mean.vbout.length))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, linetype = 2) +
  geom_smooth(method = "loess", span = 1.5, se = FALSE)
But be mindful that trying different tests until one comes out ‘significant’ is wrong-headed at best (if done naively), and fraudulent at worst.
Either way, it is known as p-hacking (google it…).
cor.test( ~ log(mean.vbout.length) + log(flock.size),
Geese, method="pearson")
##
##  Pearson's product-moment correlation
##
## data:  log(mean.vbout.length) and log(flock.size)
## t = -2.9325, df = 15, p-value = 0.01029
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8404458 -0.1732788
## sample estimates:
##        cor
## -0.6036483
This suggests that the individual vigilance bouts are indeed shorter in larger groups.
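To make the earlier warning about p-hacking concrete, here is a rough simulation sketch. It generates pairs of variables with no true correlation, runs the three tests one might be tempted to try (Spearman, Pearson, and Pearson on logs), and counts how often at least one of them comes out "significant". The sample size of 17 is an assumption based on df = 15 above; the lognormal data and the number of simulations are arbitrary choices.

```r
set.seed(123)
n_sims <- 2000
n <- 17  # assumption: matches df = 15 in the Pearson output

# Each simulated data set has NO true correlation between x and y
p_values <- t(replicate(n_sims, {
  x <- rlnorm(n)
  y <- rlnorm(n)
  c(spearman    = cor.test(x, y, method = "spearman")$p.value,
    pearson     = cor.test(x, y, method = "pearson")$p.value,
    pearson_log = cor.test(log(x), log(y), method = "pearson")$p.value)
}))

significant <- p_values < 0.05
false_pos <- c(colMeans(significant),
               any_test = mean(apply(significant, 1, any)))
round(false_pos, 3)
```

The any_test rate can never be lower than the rate for any single test, and in practice it is higher. That is why the choice of test should be justified by the data's structure (here, the apparent linearity on the log-log scale) rather than by which p-value looks best.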