Analysing your fish data
1 Assessment
This session is assessed using MCQs (the questions are highlighted below). The MCQs themselves can be found on the BS1070/MB1080 Blackboard site. The deadline is listed there and on the site's front page. This assessment contributes 5% of the module marks. You will receive feedback on this assessment after the submission deadline.
2 Getting the data into R
There are lots of ways of getting data into R. It’s one of the most annoying things about it as a beginner. But I’m assuming everyone is using RStudio, so I’ll show you how I get data in when someone gives me a csv file.
- Look at the top right-hand window in RStudio and find the Import Dataset button. Use it to import the data as a text file (From Text (base) in newer versions). Make sure that the heading option is on.
- Notice that the command RStudio actually ran is displayed in the console:
nonsense <- read.csv("~/Dropbox/Teaching/first_year_stats/sessions/7.fish/nonsense_data_2015.csv")
- That means if you typed that into the console you would get the same effect (with your file path, not mine).
- Have a look at the data: it should have 235 observations of 10 variables.
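Here is a minimal sketch of the same import done by hand, with the checks I would run straight afterwards. The file and its three rows are made up so that the code runs anywhere, and the column name Species is an assumption. With the real data you would give read.csv() your own file path and expect dim() to report 235 rows and 10 columns.

```r
# Made-up stand-in for the real csv (hypothetical values, NOT the course data)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Species,Standard.Length,Eye.Diameter",   # Species is an assumed column name
  "A,50.2,4.1",
  "B,61.0,5.3",
  "A,47.5,3.9"
), tmp)

# header = TRUE corresponds to the heading option in the Import Dataset dialog
fish <- read.csv(tmp, header = TRUE)

dim(fish)    # number of rows (observations), then columns (variables)
str(fish)    # column names and types
head(fish)   # first few rows
```

If dim() does not give the numbers you expect, the usual culprit is the heading option.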
3 Data analysis for hand-in
Overall, today is about trying to understand whether different species, which live in different environments, have different basic anatomies. I want you to ask two separate questions:
- Does species have an effect on eye diameter and which species are different?
- Does species have an effect on gape height and which species are different?
As Dr. Norton described, you first have to account for body length in all your data. The easiest way of doing this is to divide the measurement you are interested in by body length.
library("dplyr")
fish <- mutate(fish, eyeindex = Eye.Diameter/Standard.Length) # Creating an index
Always use the index data in your analysis.
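The same step for gape height might look like the sketch below. The column name Gape.Height and the new name gapeindex are assumptions, so check names() on your own data for the real spelling, and use whatever name your imported data frame actually has. The toy data frame is invented so the snippet runs without the course file, and I have used base R here rather than mutate() so it works even if dplyr will not load.

```r
# Toy data (invented values) so the example runs without the course file
fish <- data.frame(
  Standard.Length = c(50, 60, 40),
  Eye.Diameter    = c(5, 6, 5),
  Gape.Height     = c(10, 15, 6)   # assumed column name
)

# Base R equivalent of mutate(): divide each measurement by body length
fish$eyeindex  <- fish$Eye.Diameter / fish$Standard.Length
fish$gapeindex <- fish$Gape.Height  / fish$Standard.Length

fish
```

Both indices are ratios of two lengths, so they have no units.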
A good general framework for any analysis is Plot -> Model -> Check assumptions -> Interpret -> Plot again. We will follow this below.
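As a rough sketch, the framework could look like this for one variable when everything behaves itself. The data are simulated (not the course data), the column name Species is an assumption, and I have used base R plots so that nothing extra needs installing. autoplot(), from the ggfortify package as far as I am assuming here, gives a similar set of diagnostic panels.

```r
set.seed(1)  # simulated data so the sketch is reproducible (NOT the course data)
fish <- data.frame(
  Species  = factor(rep(c("A", "B", "C"), each = 20)),   # assumed column name
  eyeindex = c(rnorm(20, 0.08, 0.01),
               rnorm(20, 0.10, 0.01),
               rnorm(20, 0.10, 0.01))
)

# 1. Plot: a quick look at whether species seems to matter
boxplot(eyeindex ~ Species, data = fish)

# 2. Model: one-way ANOVA
model <- aov(eyeindex ~ Species, data = fish)

# 3. Check assumptions: base R diagnostic plots for the fitted model
par(mfrow = c(2, 2))
plot(model)
par(mfrow = c(1, 1))

# 4. Interpret: overall test, then pairwise comparisons
summary(model)    # F value, degrees of freedom, p-value
TukeyHSD(model)   # which pairs of species differ

# 5. Plot again: a tidier version for your write-up
boxplot(eyeindex ~ Species, data = fish,
        xlab = "Species", ylab = "Eye diameter / standard length")
```

Only move on to step 4 once you are happy with the diagnostic plots in step 3.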
3.1 Analysis for today
For each of the two variables, I need you to:
- Quickly explore your data (maybe with skimr, or summary if you are having problems with skimr) and draw a boxplot to check whether species appears to have an effect
- Carry out an ANOVA
- Check the assumptions of your model (autoplot)
- If the assumptions are met and the ANOVA is significant, do a Tukey test; if the assumptions are not met, carry out a Kruskal-Wallis test and then a Dunn’s test
- Make a final plot (a pretty boxplot?)
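If the assumption checks fail, the non-parametric route in the list above could be sketched like this. The data are simulated and deliberately skewed (not the course data), and the column name Species is an assumption. kruskal.test() comes with R, but Dunn’s test does not. This sketch assumes the add-on package dunn.test, which is only one of several packages offering it, and the Bonferroni correction is just my choice for illustration.

```r
set.seed(2)  # simulated, skewed data (NOT the course data)
fish <- data.frame(
  Species   = factor(rep(c("A", "B", "C"), each = 20)),   # assumed column name
  gapeindex = c(rlnorm(20, -2.0, 0.5),
                rlnorm(20, -1.5, 0.5),
                rlnorm(20, -1.5, 0.5))
)

# Kruskal-Wallis: rank-based alternative to the one-way ANOVA
kw <- kruskal.test(gapeindex ~ Species, data = fish)
kw   # chi-squared statistic, degrees of freedom, p-value

# Dunn's test for the pairwise comparisons (needs the dunn.test package)
if (requireNamespace("dunn.test", quietly = TRUE)) {
  dunn.test::dunn.test(fish$gapeindex, fish$Species, method = "bonferroni")
} else {
  message("Install the package first: install.packages(\"dunn.test\")")
}
```

When you report the Kruskal-Wallis result, give the chi-squared value, the degrees of freedom and the p-value, in the same way that you would give F, degrees of freedom and p for an ANOVA.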
4 MCQs
- Does species have an effect on eye diameter? Report the statistics correctly (either parametric or nonparametric is fine).
- If yes, which species are different from each other based on eye diameter? Report the statistics correctly (either parametric or nonparametric is fine).
- Does species have an effect on gape height? Report the statistics correctly (either parametric or nonparametric is fine).
- If yes, which species are different from each other based on gape height? Report the statistics correctly (either parametric or nonparametric is fine).