fish <- read.csv("fish file.csv")
fish
##    location pool riffle
## 1         1    6      3
## 2         2    6      3
## 3         3    3      3
## 4         4    8      4
## 5         5    5      2
## 6         6    2      2
## 7         7    6      2
## 8         8    7      2
## 9         9    1      2
## 10       10    3      2
## 11       11    4      3
## 12       12    5      1
## 13       13    4      3
## 14       14    6      2
## 15       15    4      3
Describe data
The data have three columns. The “location” column identifies each of the 15 sampling locations. The “pool” column gives the number of fish species recorded in the pool (a slow-moving, deep area of water) at that location, and the “riffle” column gives the number recorded in the riffle (a fast, shallow area) at the same location.
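As a quick check on this description, the structure of the data can be inspected directly. This is a minimal sketch: the data frame is re-entered by hand from the printed table so that it runs without the CSV file.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# One row per location, three columns
str(fish)

# Range and centre of the counts in each habitat
summary(fish[c("pool", "riffle")])
```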
Identify the purpose of the study
The purpose of this study was to determine whether the two habitats (pools and riffles) support equal numbers of fish species, which is a measure of species diversity.
Visualize the data
library(ggplot2)
ggplot(data=fish, mapping=aes(x=riffle, y=pool)) +
  geom_point() +
  geom_abline(slope=1, intercept=0) +  # reference line where pool equals riffle
  annotate("text", x=1.25, y=4, label="More in Pool") +
  annotate("text", x=3.5, y=2, label="More in Riffle")
Interpret the plot
Most points lie above the line where pool equals riffle, which suggests that more fish species tend to be found in pools than in riffles at the same location. In other words, the slow-moving, deep habitat appears to hold more species than the fast, shallow one.
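Because several locations share the same pair of counts, some points in the scatterplot sit on top of each other. A simple tally of the locations on each side of the reference line gives the same picture in numbers. This is a minimal sketch, with the data re-entered by hand from the printed table.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Count locations above, on, and below the line pool = riffle
more_in_pool   <- sum(fish$pool > fish$riffle)   # 12 locations
equal_counts   <- sum(fish$pool == fish$riffle)  # 2 locations
more_in_riffle <- sum(fish$pool < fish$riffle)   # 1 location

c(pool = more_in_pool, equal = equal_counts, riffle = more_in_riffle)
```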
Formulate the null hypothesis
The mean number of fish species is the same in pools and in riffles; that is, the depth and speed of the water make no difference.
Alternative hypothesis
The population means are unequal. The test is two-sided, although the plot suggests that the mean is higher in pools.
Decide on a type of test
t-test
Choose one sample or two
Two sample - pool and riffle
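Each location contributes both a pool count and a riffle count, so the two samples are matched by location. A paired t-test on the per-location differences is therefore a possible alternative to the two-sample test. The sketch below shows how that could be run; it is offered as a comparison, not as the analysis reported here, and the data are re-entered by hand from the printed table.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Paired test: works on the 15 within-location differences (pool - riffle)
paired_result <- t.test(fish$pool, fish$riffle, paired = TRUE)
paired_result
```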
Check assumptions of the test
# Normal Q-Q plot for each habitat (one plot per sample)
ggplot(data=fish) + geom_qq(mapping=aes(sample=pool))
ggplot(data=fish) + geom_qq(mapping=aes(sample=riffle))
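As a numerical complement to a Q-Q plot, a Shapiro-Wilk test could be run on each sample. This is a sketch of an additional check that was not part of the original analysis. With only 15 observations per habitat and many tied counts, its result should be read cautiously.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Shapiro-Wilk test of normality for each habitat
# (a small p-value would suggest a departure from normality)
sw_pool   <- shapiro.test(fish$pool)
sw_riffle <- shapiro.test(fish$riffle)

c(pool = sw_pool$p.value, riffle = sw_riffle$p.value)
```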
Level of significance of the test
The level of significance is 0.05.
Perform the test
t.test(fish$riffle, fish$pool)
##
## Welch Two Sample t-test
##
## data: fish$riffle and fish$pool
## t = -4.1482, df = 18.125, p-value = 0.0005961
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.313673 -1.086327
## sample estimates:
## mean of x mean of y
## 2.466667 4.666667
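To see where the t statistic and the fractional degrees of freedom in this output come from, the Welch calculation can be reproduced by hand. This is a minimal sketch using the standard Welch-Satterthwaite formula, with the data re-entered from the printed table; the variable names are chosen for illustration.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
pool   <- c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4)
riffle <- c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)

n_r <- length(riffle)
n_p <- length(pool)

# Squared standard error contributed by each sample
v_r <- var(riffle) / n_r
v_p <- var(pool) / n_p

# Welch t statistic for mean(riffle) - mean(pool)
se     <- sqrt(v_r + v_p)
t_stat <- (mean(riffle) - mean(pool)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (v_r + v_p)^2 / (v_r^2 / (n_r - 1) + v_p^2 / (n_p - 1))

c(t = t_stat, df = welch_df)  # approximately -4.1482 and 18.125
```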
Interpret the p value
The p-value (0.0005961) is less than the level of significance (0.05), so we reject the null hypothesis that the population means are equal.
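The reported p-value is the two-sided tail area of a t distribution beyond the observed statistic. The sketch below recomputes it from the rounded t and df values printed in the test output, so the result agrees with the reported value only up to rounding.

```r
# Values copied from the printed test output (rounded)
t_stat   <- -4.1482
welch_df <- 18.125
alpha    <- 0.05

# Two-sided p-value: area in both tails beyond |t|
p_value <- 2 * pt(-abs(t_stat), df = welch_df)
p_value

# Decision rule: reject the null hypothesis when p < alpha
reject_null <- p_value < alpha
reject_null
```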
Interpret the confidence interval
The interval (-3.31 to -1.09) does not contain 0, so zero is not a plausible value for the parameter, which is the difference in means (riffle minus pool). Therefore, it is not plausible that the means are equal.
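The check that the interval excludes zero can also be done in code by extracting the interval from the test object. This is a minimal sketch, with the data re-entered by hand from the printed table.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
pool   <- c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4)
riffle <- c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)

# Same call as the reported test; the difference is riffle minus pool
welch <- t.test(riffle, pool)

# 95 percent confidence interval for the difference in means
ci <- as.numeric(welch$conf.int)
ci

# TRUE would mean that 0 is a plausible value for the difference
contains_zero <- ci[1] <= 0 && ci[2] >= 0
contains_zero
```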
Interpret the sample estimates
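The sample mean for riffles (x) is about 2.47 and the sample mean for pools (y) is about 4.67, so pools averaged about 2.2 more fish species per location than riffles.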
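The two sample estimates are simply the column means, so they can be reproduced without running the test. This is a minimal sketch, with the data re-entered by hand from the printed table.

```r
# Data re-entered from the printed table (assumption: it matches "fish file.csv")
fish <- data.frame(
  location = 1:15,
  pool     = c(6, 6, 3, 8, 5, 2, 6, 7, 1, 3, 4, 5, 4, 6, 4),
  riffle   = c(3, 3, 3, 4, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2, 3)
)

# Sample mean for each habitat
habitat_means <- colMeans(fish[c("riffle", "pool")])
habitat_means

# Difference in sample means, pool minus riffle
mean_difference <- unname(habitat_means["pool"] - habitat_means["riffle"])
mean_difference
```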
State your conclusion
I conclude that the mean number of fish species differs between the two habitats: on average, more species were found in pools than in riffles. This suggests that slow-moving, deep water supports more fish species than fast, shallow water at these locations.