The experiment was performed to measure the biodiversity of a river. Biodiversity was assessed by checking whether the number of fish in the pools was equal to the number in the riffles. Fish were counted in a pool and a riffle at each of 15 locations.
fish<-read.csv("fish.csv")
fish
## ï..location pool riffle
## 1 1 6 3
## 2 2 6 3
## 3 3 3 3
## 4 4 8 4
## 5 5 5 2
## 6 6 2 2
## 7 7 6 2
## 8 8 7 2
## 9 9 1 2
## 10 10 3 2
## 11 11 4 3
## 12 12 5 1
## 13 13 4 3
## 14 14 6 2
## 15 15 4 3
colnames(fish)<-c("location", "pool", "riffle")
The data has 15 rows and 3 columns. Each row represents one of the fifteen locations. The first column identifies the location, and the other two columns give the number of fish counted in the pool and in the riffle at that location.
The purpose of the study is to determine whether the number of fish in pools differs from the number of fish in riffles.
library(ggplot2)
ggplot(data=fish, mapping=aes(x=riffle, y=pool)) + geom_point() + geom_abline(slope=1, intercept=0) + annotate("text", x=1.25, y=4, label="More in Pool") + annotate("text", x=3.5, y=2, label="More in Riffle")
In the graph shown, each point represents one of the fifteen locations where a pool and a riffle were measured. The straight line through the graph shows where points would fall if the number of fish in the pool and in the riffle were equal. The plot suggests that more fish live in pools than in riffles because more points are above the line than below it. To verify this result a t-test was performed.
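The visual impression from the scatterplot can be checked by counting the locations on each side of the line. This is a small sketch rather than part of the original analysis; the vectors pool and riffle are re-typed from the table printed above so the block runs on its own, and the names above, on_line, and below are chosen here for illustration.

```r
# Fish counts copied from the printed table (one value per location)
pool   <- c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4)
riffle <- c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)

# Count locations above, on, and below the line pool = riffle
above   <- sum(pool > riffle)   # more fish in the pool
on_line <- sum(pool == riffle)  # equal counts
below   <- sum(pool < riffle)   # more fish in the riffle

c(above = above, on_line = on_line, below = below)
# above = 12, on_line = 2, below = 1
```

Counting is useful here because several locations share the same pair of counts, so their points overlap and the plot shows fewer than fifteen distinct points.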
The null hypothesis for this study is that the mean number of fish in the pools is equal to the mean number of fish in the riffles.
The alternative hypothesis is that the mean number of fish in the pools is different from the mean number of fish in the riffles.
To analyze the data a t-test was used to compare the mean fish counts in pools and riffles.
A paired t-procedure should be used because each pool is matched with the riffle measured at the same location, so the two sets of counts are not independent samples.
The Q-Q plot below helps judge the normality assumption of the test.
ggplot(data=fish) + geom_qq(mapping=aes(sample=pool))
If the data are approximately normal, the points will fall close to a straight line.
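Because the t-test used in this analysis is paired, the normality assumption concerns the pool minus riffle differences rather than the pool counts alone. The following sketch, which is an addition and not part of the original analysis, draws a Q-Q plot of those differences with a reference line. The names diffs and qq_diff are chosen here for illustration, and the counts are re-typed from the printed table so the block runs on its own.

```r
library(ggplot2)

# Fish counts copied from the printed table (one value per location)
pool   <- c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4)
riffle <- c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)

# A paired t-test works on the per-location differences
diffs <- data.frame(difference = pool - riffle)

# Q-Q plot of the differences; geom_qq_line() adds the reference line
qq_diff <- ggplot(data = diffs, mapping = aes(sample = difference)) +
  geom_qq() +
  geom_qq_line()

qq_diff
```

With only fifteen whole-number differences the points will look stepped, so the plot can only flag strong departures from normality.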
The level of significance is 0.05, which corresponds to a 95% confidence interval.
t.test(fish$pool,fish$riffle, paired=TRUE)
##
## Paired t-test
##
## data: fish$pool and fish$riffle
## t = 4.5826, df = 14, p-value = 0.0004264
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.170332 3.229668
## sample estimates:
## mean of the differences
## 2.2
The p-value is 0.0004264. For a 95 percent confidence interval the alpha value is 0.05, so the null hypothesis can be rejected because the p-value is smaller than the alpha value. This means that the mean number of fish in the pools differs from the mean number of fish in the riffles.
The 95 percent confidence interval for the mean difference runs from 1.17 to 3.23 and does not contain zero, so a mean difference of zero is not plausible given these data.
The mean of the differences is 2.2, which suggests that on average there are 2.2 more fish in a pool than in the riffle at the same location.
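To show where the numbers in the t-test output come from, the paired test can be reproduced by hand from the per-location differences. This is a sketch added for illustration; the variable names (d, se_d, t_stat, p_value, ci) are chosen here, and the counts are re-typed from the printed table so the block runs on its own.

```r
# Fish counts copied from the printed table (one value per location)
pool   <- c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4)
riffle <- c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)

d <- pool - riffle          # difference at each location
n <- length(d)              # number of pairs (15)

mean_d <- mean(d)           # mean of the differences
se_d   <- sd(d) / sqrt(n)   # standard error of the mean difference

t_stat  <- mean_d / se_d                       # test statistic, df = n - 1
p_value <- 2 * pt(-abs(t_stat), df = n - 1)    # two-sided p-value
ci      <- mean_d + c(-1, 1) * qt(0.975, df = n - 1) * se_d  # 95% interval

mean_d
t_stat
p_value
ci
```

These values should match the mean of the differences, t statistic, p-value, and confidence interval reported in the paired t-test output above.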
As specified earlier, the null hypothesis can be rejected, which suggests that the number of fish in the pools differs from the number in the riffles. On average, there are 2.2 more fish in the pools than in the riffles. Since the numbers of fish in the pools and riffles are not equal, the river has lower biodiversity by the measure used in this study.