fish <- read.csv("fish.csv")
fish
## location pool riffle
## 1 1 6 3
## 2 2 6 3
## 3 3 3 3
## 4 4 8 4
## 5 5 5 2
## 6 6 2 2
## 7 7 6 2
## 8 8 7 2
## 9 9 1 2
## 10 10 3 2
## 11 11 4 3
## 12 12 5 1
## 13 13 4 3
## 14 14 6 2
## 15 15 4 3
This data frame has 15 rows and three columns. Each row represents one sampling location, and each column represents a variable. The first column is the location number, the second column is the number of fish species captured in the pool, and the third column is the number of fish species captured in the adjacent riffle.
The purpose of the study is to determine whether riffles and pools support equal numbers of species.
library(ggplot2)
ggplot(data=fish, mapping=aes(x=riffle, y=pool)) +
  geom_point() +
  geom_abline(slope=1, intercept=0) +
  annotate("text", x=1.25, y=4, label="More in Pool") +
  annotate("text", x=3.5, y=2, label="More in Riffle")
At most locations (12 of 15), more species were captured in the pool than in the riffle. At two locations the pool and the riffle had the same number of species (the points that fall on the line), and at one location the riffle had more species than the pool.
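The reading of the scatterplot can be checked by counting locations directly. This is a minimal sketch that retypes the printed data instead of reading fish.csv, so that it runs on its own; the names `d` and `counts` are introduced here for illustration only.

```r
# Data retyped from the printed data frame above
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Difference in species count at each location (pool minus riffle)
d <- fish$pool - fish$riffle

# Count locations above, on, and below the 1:1 line
counts <- c(more_in_pool   = sum(d > 0),
            equal          = sum(d == 0),
            more_in_riffle = sum(d < 0))
counts
```

Several locations share the same (riffle, pool) pair, so the scatterplot shows fewer than 15 distinct points; the counts are not affected by that overlap.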
Null hypothesis: the mean number of species in the pools is the same as the mean number of species in the riffles.
Alternative hypothesis: the means are unequal.
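The type of test we would use is a t-test.
A paired t-test, because each location supplies both a pool measurement and a measurement from its adjacent riffle, so the two columns are matched by location rather than being independent samples.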
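Because the pool and riffle counts come from the same locations, the test used later in this report (`paired=TRUE`) can be viewed as a one-sample t-test on the per-location differences. The sketch below illustrates that equivalence under the assumption that the retyped data match fish.csv; the object names `paired`, `one_sample`, and `welch` are hypothetical.

```r
# Data retyped from the printed data frame above
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Paired t-test on the two matched columns
paired <- t.test(fish$pool, fish$riffle, paired = TRUE)

# One-sample t-test on the differences, testing a mean difference of 0
one_sample <- t.test(fish$pool - fish$riffle, mu = 0)

# The two approaches give the same t statistic and p-value
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)

# For contrast only: an unpaired (Welch) test ignores the matching by location
welch <- t.test(fish$pool, fish$riffle)
welch$parameter  # degrees of freedom differ from the paired test's 14
```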
gg <- ggplot(data=fish)
gg + geom_qq(mapping=aes(sample=pool))
gg + geom_qq(mapping=aes(sample=riffle))
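When the data are analyzed as pairs, the normality assumption applies to the per-location differences, so a Q-Q plot of the differences may also be worth a look. This is a sketch only: it uses base R's `qqnorm()` and `qqline()` instead of ggplot2 so that it has no package dependency, and it retypes the data from the printed data frame.

```r
# Data retyped from the printed data frame above
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Per-location difference (pool minus riffle)
d <- fish$pool - fish$riffle

# Normal Q-Q plot of the differences with a reference line
qqnorm(d, main = "Normal Q-Q plot of pool - riffle differences")
qqline(d)
```

With only 15 locations and whole-number counts, the points will fall in steps, so the plot gives a rough check rather than a sharp one.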
The level of significance is 0.05.
t.test(fish$pool, fish$riffle, paired=TRUE)
##
## Paired t-test
##
## data: fish$pool and fish$riffle
## t = 4.5826, df = 14, p-value = 0.0004264
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.170332 3.229668
## sample estimates:
## mean of the differences
## 2.2
The p-value (0.0004264) is less than the level of significance (0.05), so we reject the null hypothesis.
The confidence interval is a range of plausible values for the mean difference (pool minus riffle). We are 95 percent confident that the true mean difference is between 1.170332 and 3.229668. Zero is not in this interval, so it is not plausible that the two means are the same, which agrees with rejecting the null hypothesis.
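The t statistic, p-value, and confidence interval reported by `t.test()` can be reproduced by hand from the per-location differences. The sketch below assumes the retyped data match fish.csv; the variable names (`d`, `se_d`, `t_stat`, `ci`, and so on) are introduced here for illustration.

```r
# Data retyped from the printed data frame above
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

d      <- fish$pool - fish$riffle   # per-location differences
n      <- length(d)                 # number of locations (pairs)
mean_d <- mean(d)                   # mean of the differences
se_d   <- sd(d) / sqrt(n)           # standard error of the mean difference

# t statistic for a null mean difference of 0, with n - 1 degrees of freedom
t_stat <- mean_d / se_d

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# 95 percent confidence interval: mean +/- critical t value * standard error
ci <- mean_d + c(-1, 1) * qt(0.975, df = n - 1) * se_d

c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2])
```

These values should agree with the `t.test()` output above: t = 4.5826 on 14 degrees of freedom, p-value = 0.0004264, and an interval from 1.170332 to 3.229668.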
On average, 2.2 more species were captured in the pool than in the adjacent riffle at a location.
In conclusion, pools support more fish species, on average, than adjacent riffles.