Welcome to Computer Lab 2 for the Data Analysis (DA) component of BIO2POS!
In DA Topic 2, we introduced the concept of the \(t\)-distribution, and covered how to conduct one sample \(t\)-tests, paired \(t\)-tests, and two sample \(t\)-tests. We also went over the assumptions of these different tests, and outlined the non-parametric tests we could use in situations where the \(t\)-test assumptions were violated.
In this computer lab, you will continue to learn how to use the statistical software jamovi, and conduct various \(t\)-tests and the equivalent non-parametric tests using real data sets. You will also learn how to interpret and summarise jamovi output for these tests.
These labs are designed to provide you with plenty of opportunities to practice different aspects of the statistical content covered in the lectures.
Each lab consists of core questions (with the 🌱 symbol) and extension questions (with the 🌳 symbol).
Having completed this lab, you will be able to obtain and discuss the following statistical outputs and results for a data set in jamovi:
Before you begin, please check the following:
Please complete at least step 1. first, as doing so will help you to better understand the concepts you will need for this computer lab.
To begin, we will return to the red crab data, collected by Green (1997), which we began analysing in DA Computer Lab 1. However, in this lab we will assess an extended version of this data set, which has recorded values for the following variables:
Figure 0.1: Note. From File:Christmas Island Red Crab.jpg, by ChrisBrayPhotography, 2017, Wikimedia Commons (https://commons.wikimedia.org/). CC BY-SA 4.0 DEED
This red crab data is available in the Week 4 tile on LMS, in the file crab_data_extended.omv. Download this file now, and save it on your computer. Also open up a Word document, in which you can write down your responses and save your jamovi output as you work through the lab.
You may like to save the data and Word document to your OneDrive, so you can access them easily at a later date.
Open up jamovi, and click on the burger menu (the three horizontal white bars) on the top left. You should see a side panel appear. Click on Open, and load in the crab_data_extended.omv file.
Suppose that research on a similar variety of crab has previously established that the mean (i.e. average) claw length of those crabs was 40 mm. We would like to determine if the recorded data in the crab_data_extended.omv file aligns with this result.
Over the next few steps, we will conduct a one sample \(t\)-test in jamovi, to determine whether the mean claw length of the red crabs represented by our sample is also 40 mm.
For this question you can assume that the claw lengths are known to come from a normal distribution. As you progress, copy the relevant output into your Word document.
If you would like to refresh your memory on one sample \(t\)-tests, check the Topic 2A lecture.
To begin, click on the Analyses tab, and then click on T-Tests and select One Sample T-Test.
Since we are interested in the CLAW length of the crabs, drag the CLAW variable across to the Dependent Variables box. You will see that some automatic results will already appear in the Results section.
Under the Hypothesis heading, change the test value to 40, since this is our fixed reference value in this instance.
Under the Additional Statistics heading, select Mean difference, Confidence interval and Effect size.
Interpret the mean difference value presented in the output.
Under the Additional Statistics heading, also select Descriptives and use this information to compute the Cohen’s \(d\) effect size by hand.
Confirm your result matches the jamovi output (do not worry if it is slightly different, as this may be due to rounding).
See slides 11-12 of the Topic 2A lecture for effect size details.
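The hand calculation can be sketched in a few lines of Python. This is only an illustration: the claw lengths below are made up (they are not the red crab data), and the formula assumed is the common one sample definition, Cohen's \(d\) = (sample mean − reference value) / sample standard deviation. Check the lecture slides for the definition used in this subject.

```python
import math
import statistics

# Hypothetical claw lengths in mm (NOT the red crab data).
claw_mm = [38.2, 41.5, 39.9, 42.3, 40.8, 37.6, 43.1, 40.2]
reference_mm = 40.0  # the test value entered under the Hypothesis heading

n = len(claw_mm)
sample_mean = statistics.mean(claw_mm)
sample_sd = statistics.stdev(claw_mm)  # sample SD, with n - 1 in the denominator

# Mean difference: how far the sample mean sits from the reference value.
mean_difference = sample_mean - reference_mm

# Assumed definition: Cohen's d = mean difference / sample SD.
cohens_d = mean_difference / sample_sd

# One sample t statistic: mean difference / standard error of the mean.
t_statistic = mean_difference / (sample_sd / math.sqrt(n))

print(f"mean difference = {mean_difference:.3f} mm")
print(f"Cohen's d       = {cohens_d:.3f}")
print(f"t statistic     = {t_statistic:.3f} on {n - 1} df")
```

Under this definition the \(t\) statistic is just Cohen's \(d\) multiplied by \(\sqrt{n}\), which gives a quick check on a hand calculation.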
It is important to note that for jamovi one sample \(t\)-test output, the confidence interval reported is an interval for the difference from the specified reference value.
If our specified reference value is 0 then, as we might expect, the confidence interval will simply be for the parameter itself (e.g. a range for the likely mean CLAW value). However, if we specify a non-zero reference value, as we have done above in 1.1.2, the confidence interval values shown should either be added to the reference value, or interpreted in the context of being the likely range for the mean difference.
To understand this better, try changing the test value back to 0, and compare how the results change.
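As a sketch of why these two readings agree, write \(\bar{x}\) for the sample mean, \(s\) for the sample standard deviation, \(n\) for the sample size, \(\mu_0\) for the reference value, and \(t^{*}\) for the relevant critical value of the \(t\)-distribution with \(n-1\) degrees of freedom. This notation is assumed here and may differ slightly from the lecture slides. The interval for the difference is

\[
(\bar{x} - \mu_0) \pm t^{*}\,\frac{s}{\sqrt{n}},
\]

and adding \(\mu_0\) to both limits gives the interval for the mean itself:

\[
\bar{x} \pm t^{*}\,\frac{s}{\sqrt{n}}.
\]

For example, using made-up numbers rather than the crab results, if the reported interval for the difference were \((-0.5, 1.5)\) mm with \(\mu_0 = 40\) mm, the corresponding interval for the mean would be \((39.5, 41.5)\) mm.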
Check the one sample \(t\)-test assumptions via:
CLAW observations, from the Exploration section
Checks 2. and 3. can be selected under the Assumption Checks heading in the One Sample T-Test section.
If you do find that the assumptions are violated, you can conduct the non-parametric Wilcoxon Signed Rank test by selecting Wilcoxon rank under the Tests heading in the One Sample T-Test section.
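To see what the signed rank test works with, the sketch below computes the rank sums by hand for made-up claw lengths (not the red crab data). The helper `average_ranks` is a hypothetical name introduced for this illustration, and dropping observations exactly equal to the reference value is a common convention assumed here. Software may report the sum of positive ranks, the sum of negative ranks, or the smaller of the two.

```python
# Hypothetical claw lengths in mm (NOT the red crab data).
claw_mm = [38.2, 41.5, 39.9, 42.3, 40.8, 37.6, 43.1, 40.2]
reference_mm = 40.0

# Differences from the reference value; exact zeros are dropped (assumed convention).
differences = [x - reference_mm for x in claw_mm if x != reference_mm]


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


# Rank the absolute differences, then add up the ranks separately by sign.
ranks = average_ranks([abs(d) for d in differences])
w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
w_minus = sum(r for r, d in zip(ranks, differences) if d < 0)

print(f"sum of positive ranks = {w_plus}")
print(f"sum of negative ranks = {w_minus}")
```

The two rank sums always add up to \(n(n+1)/2\), where \(n\) is the number of non-zero differences, so a large imbalance between them suggests the data sit mostly on one side of the reference value.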
Write a clear summary based on your \(t\)-test output, in the style presented in the lectures. Regardless of your findings in 1.1.6, assume that the relevant test assumptions have been satisfied.
Make sure to include the effect size and a 95% confidence interval (either of the CLAW length itself, or of the difference between the CLAW sample mean and the reference value).
Suppose we are also interested in determining whether the average lengths of red crabs’ two claws differ. Because we are comparing multiple data points from the same animal (with 2 observations per crab), we should use a paired \(t\)-test here.
To begin, click on the Analyses tab, and then click on T-Tests and select Paired Samples T-Test.
Using the CLAW and OtherClaw variables, conduct a paired \(t\)-test in jamovi to test whether there is a difference in the mean length of the red crabs’ claws.
For this question you can assume that the differences in claw lengths are known to come from a normal distribution. Copy the relevant output into your Word document.
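One way to think about the paired \(t\)-test is as a one sample \(t\)-test applied to the within-crab differences, with a reference value of 0. The sketch below illustrates this with made-up paired measurements (not the red crab data). The effect size shown assumes the definition mean difference / standard deviation of the differences, which may not match every software default.

```python
import math
import statistics

# Hypothetical paired claw lengths in mm, one pair per crab (NOT the red crab data).
claw_mm = [40.1, 38.7, 42.5, 39.8, 41.2, 43.0]
other_claw_mm = [39.5, 38.9, 41.8, 39.1, 40.6, 42.1]

# The paired test only uses the difference within each crab.
differences = [a - b for a, b in zip(claw_mm, other_claw_mm)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)  # sample SD of the differences

# One sample t statistic on the differences, tested against 0.
t_statistic = mean_diff / (sd_diff / math.sqrt(n))

# Assumed effect size definition: mean difference / SD of the differences.
cohens_d = mean_diff / sd_diff

print(f"mean difference = {mean_diff:.3f} mm")
print(f"t statistic     = {t_statistic:.3f} on {n - 1} df")
print(f"Cohen's d       = {cohens_d:.3f}")
```

This is also why the normality assumption for a paired test concerns the differences, not the two sets of claw lengths separately.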
Are the paired \(t\)-test assumptions satisfied? Explain, with reference to the appropriate results.
Write a clear summary based on your paired \(t\)-test output, in the style presented in the lectures. Make sure to include the effect size and confidence interval.
Provide a clear interpretation of the confidence interval produced as part of the paired \(t\)-test. Make sure to consider why this confidence interval supports your conclusion above.
To conclude our red crab study and \(t\)-tests overview, suppose we are interested in comparing the mean weights of male and female red crabs.
To begin, click on the Analyses tab, and then click on T-Tests and select Independent Samples T-Test.
The terms Independent Samples \(t\)-test and Two Sample \(t\)-test are synonymous.
Conduct a two sample \(t\)-test in jamovi to compare the mean weight of male and female red crabs.
For this question you can assume that the weights are known to come from a normal distribution. Copy the relevant output into your Word document.
You can separate the data by SEX by placing the SEX variable in the Grouping Variable box.
As part of your analysis, conduct a Levene’s test to check the equal variances assumption, by selecting the Homogeneity test box under the Assumption Checks heading.
Based on the test result, should you use the Student’s or Welch’s version of the two sample \(t\)-test, and why? Make sure to select the appropriate box under the Tests heading.
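To make the difference between the two versions concrete, the sketch below computes both test statistics for made-up weights (not the red crab data), using the standard textbook formulas. Student's version pools the two sample variances into one estimate, while Welch's version keeps them separate and adjusts the degrees of freedom (the Welch-Satterthwaite approximation).

```python
import math
import statistics

# Hypothetical weights in grams (NOT the red crab data).
male_g = [212.0, 198.5, 230.1, 205.4, 221.7, 240.3]
female_g = [180.2, 175.9, 190.4, 168.3, 185.0]

n1, n2 = len(male_g), len(female_g)
m1, m2 = statistics.mean(male_g), statistics.mean(female_g)
v1, v2 = statistics.variance(male_g), statistics.variance(female_g)  # sample variances

# Student's t: assumes equal variances and pools them into one estimate.
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_student = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
df_student = n1 + n2 - 2

# Welch's t: keeps the two variances separate.
t_welch = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

# Welch-Satterthwaite approximation to the degrees of freedom (usually not a whole number).
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(f"Student: t = {t_student:.3f} on {df_student} df")
print(f"Welch:   t = {t_welch:.3f} on {df_welch:.2f} df")
```

When the two sample variances and sample sizes are similar, the two versions give nearly the same answer; they diverge as the variances become more unequal.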
Write a clear summary based on your independent samples \(t\)-test output, in the style presented in the lectures. Make sure to include the effect size and confidence interval for the mean difference.
Suppose that you have concerns about the test assumptions for your two sample \(t\)-test. Repeat your comparison of the weights of the male crabs and female crabs, this time using the non-parametric Mann-Whitney U test. Produce the relevant output, save a copy in your Word document, and write a clear summary, in the style presented in the lectures.
As part of this question, you may like to produce assumption check results, such as a Normal Q-Q plot.
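The Mann-Whitney U statistic can be understood as a count over all male-female pairs. The sketch below illustrates this with made-up weights (not the red crab data), counting each tie as half, which is the usual convention. Which of the two U values a package reports varies, so treat the choice of the smaller one at the end as an assumption.

```python
# Hypothetical weights in grams (NOT the red crab data).
male_g = [212.0, 198.5, 230.1, 185.4, 221.7, 240.3]
female_g = [180.2, 199.0, 190.4, 168.3, 215.0]

# Count the male-female pairs in which the male is heavier (ties count as half).
u_male = sum(
    1.0 if m > f else 0.5 if m == f else 0.0
    for m in male_g
    for f in female_g
)

# Every pair is counted once, so the two U values add up to n1 * n2.
u_female = len(male_g) * len(female_g) - u_male

# Assumed convention: report the smaller of the two values.
u_statistic = min(u_male, u_female)

print(f"U (males heavier)   = {u_male}")
print(f"U (females heavier) = {u_female}")
print(f"smaller U           = {u_statistic}")
```

Because the test depends only on which observation in each pair is larger, it does not require the weights to be normally distributed.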
The Indo-Pacific Lionfish (Pterois volitans/miles) is an invasive species in parts of the Atlantic Ocean and the Caribbean Sea. Commercial fishing of lionfish has been proposed as a means of controlling population sizes. However, concerns have been raised about the safety of lionfish for human consumption, as they may contain potentially harmful levels of organic methylmercury (MeHg) due to bioaccumulation.
Figure 4.1: Note. From File:Common lion fish Pterois volitans.jpg, by Michael Gäbler, 2014, Wikimedia Commons (https://commons.wikimedia.org/). CC BY 3.0 DEED
Johnson et al. (2021) studied the total mercury (THg) levels in lionfish specimens taken from two locations in Florida.
Data from their study is available in the file lionfish_thg.omv in this week’s tile on LMS, and contains recorded values for the following variables:
Create a descriptives table in row format for the lionfish_thg.omv data, using the variables THG, SEX and LOCATION.
Comment on any interesting values you observe.
Create a histogram of the THG observed values with a density curve overlaid. Looking at the distribution, do you think that it is appropriate to use a \(t\)-test to analyse this data? Explain your reasoning.
Regardless of your answer to the previous question, suppose that you now would like to conduct a one sample \(t\)-test of this THG data.
Assume that the recommended limit for mercury concentration is 1 milligram per kilogram of fish (equivalent to 1 microgram per gram), in accordance with e.g. US EPA standards.
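The unit equivalence follows from scaling the numerator and denominator by the same factor, since 1 mg = 1000 micrograms and 1 kg = 1000 g:

\[
1\ \frac{\text{mg}}{\text{kg}} = \frac{1000\ \mu\text{g}}{1000\ \text{g}} = 1\ \frac{\mu\text{g}}{\text{g}}.
\]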
Suppose that it is currently believed that eating lionfish is borderline unsafe, and that the average lionfish mercury concentration (THG) is 1 microgram per gram. However, like Johnson et al. (2021), you would like to test if the mean THG levels are actually less than 1 microgram per gram.
Write out an appropriate null and alternative hypothesis for your test, using 1 microgram per gram as your reference value.
Conduct a one sample \(t\)-test of the THG data in jamovi, using your specified hypotheses. Record the test statistic, \(p\)-value, mean difference, and the 95% confidence interval for the mean difference.
Based on your results, what is your conclusion? Does it appear that lionfish are safe, or unsafe, to eat?
Compute and interpret the effect size for your one sample \(t\)-test.
Check the test assumptions of your one sample \(t\)-test via the Shapiro-Wilk test and Q-Q plot inspection. What do you conclude?
Regardless of your findings in the previous question, suppose you decide to conduct an equivalent non-parametric test of the THG levels. Note down the appropriate non-parametric test to use, and then carry this test out in jamovi, and provide a brief summary of your results.
Suppose you are interested in assessing if male and female lionfish exhibit different concentrations of mercury.
Conduct a two sample \(t\)-test to compare the mean THG levels of male and female lionfish. Make sure to check the test assumptions and compute the effect size, and write a short summary detailing your findings.
If you find that the two sample \(t\)-test assumptions are violated, conduct the appropriate non-parametric equivalent test.
Recall that in DA Computer Lab 1 we introduced a raw, messy data set on dwarf pea plant seedlings, which had been collected as part of an experiment in an LTU BIO1AP lab class in 2022. Figure 5.2 below contains this data.
Previously, we produced descriptive statistics and some initial plots of this data. In this DA computer lab, now that we have learnt how to conduct various \(t\)-tests in jamovi, we can begin to properly analyse this data, and test hypotheses.
Figure 5.1: Note. From File:Leaves of Pisum sativum (2).JPG, by Chmee2, 2011, Wikimedia Commons (https://commons.wikimedia.org/). CC BY 3.0 DEED
To recap, in this experiment dwarf pea plant (Pisum sativum) seedlings were exposed to different concentrations of gibberellic acid (GA), in order to study the effect of GA application on plant growth. These dwarf pea plants are naturally deficient in GA, due to a mutation of a gene in the pathway for biosynthesis of GA. Therefore it is of interest to determine if application of GA to the seedlings has an impact.
For the experiment, each pea plant seedling was assigned to one of three groups, and then carefully sprayed:
The height of the seedlings was then recorded at a later date. The pea plant data in Figure 5.2 has pea plant height (in mm) recordings, for the three treatments, across 7 different benches.
Note that the number of seedlings (1 to 6) in each of the three groups varied between benches, and that some recordings were crossed or scribbled out (perhaps due to the seedling being damaged or dying).
Figure 5.2: Pea Plant Raw Data
In DA Computer Lab 1, you should have created a data file in jamovi containing the cleaned pea plant data. If you have this file to hand, skip this step and proceed to 5.2. If for whatever reason you do not have this data file saved, please complete the following steps:
Data view.
If you are stuck on any value, you may like to discuss this with other students and/or your lab demonstrator.
Suppose that based on previous studies, the mean height of pea plant seedlings which have been exposed to natural conditions is known to be 280 mm. Using an appropriate \(t\)-test, test in jamovi if the mean height of the relevant pea plant seedlings in the jamovi data file you have prepared is different to 280 mm.
Write a clear summary statement, and make sure to copy the relevant jamovi output to your Word document.
Assume that this mean value is for seedlings which have been growing for the same amount of time as the seedlings in the BIO1AP experiment had been, when their data was recorded in Figure 5.2.
Suppose we would like to now compare the mean heights of the pea plant seedlings from the BIO1AP experiment, for the different treatments. Using the appropriate test(s) in jamovi, compare the mean heights of pea plant seedlings exposed to treatment C and treatment TA.
In order to conduct this test, you may need to reformat your data slightly. A simple option is to remove the rows of data for TB observations, and then once the analysis is complete, close jamovi without saving the adjustments. For the next question, you can repeat the process, but this time removing the rows of data for C observations.
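The idea behind this reformatting is simply to keep the rows for the two treatments being compared while leaving the original data untouched. The Python sketch below shows that idea outside jamovi. The treatment labels follow this lab, but the column names, the heights and the `keep_treatments` helper are all hypothetical.

```python
# Hypothetical rows: treatment labels follow this lab (C, TA, TB),
# but the column names and heights are invented for illustration.
rows = [
    {"treatment": "C", "height_mm": 152},
    {"treatment": "TA", "height_mm": 231},
    {"treatment": "TB", "height_mm": 305},
    {"treatment": "C", "height_mm": 147},
    {"treatment": "TA", "height_mm": 248},
    {"treatment": "TB", "height_mm": 290},
]


def keep_treatments(all_rows, wanted):
    """Return a new list with only the wanted treatments; the input is not modified."""
    return [row for row in all_rows if row["treatment"] in wanted]


# One subset per comparison, built from the same untouched original.
c_vs_ta = keep_treatments(rows, {"C", "TA"})
ta_vs_tb = keep_treatments(rows, {"TA", "TB"})

print(len(rows), len(c_vs_ta), len(ta_vs_tb))
```

Building each subset from the full data, rather than deleting rows in place, mirrors the advice above to close jamovi without saving your adjustments.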
As a separate test, also compare the mean heights of pea plant seedlings exposed to treatment TA and TB.
Write clear summary statements for your analyses in 5.3 and 5.4, and make sure to copy the relevant jamovi output to your Word document.
Make sure you check any relevant test assumptions before concluding your tests.
Before you finish up, make sure to save both your Word document and your pea plant jamovi file to your OneDrive, for future reference.
Green, P. T. (1997). Red crabs in rain forest on Christmas Island, Indian Ocean: activity patterns, density and biomass. Journal of Tropical Ecology, 13(1), 17-38.
Johnson, E. G., Dichiera, A., Goldberg, D., Swenarton, M., & Gelsleichter, J. (2021). Total mercury concentrations in invasive lionfish (Pterois volitans/miles) from the Atlantic coast of Florida. PLOS ONE, 16(9), e0234534. https://doi.org/10.1371/journal.pone.0234534
These notes have been prepared by Rupert Kuveke and other members of the Department of Mathematical and Physical Sciences. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematical and Physical Sciences and with the Department of Environment and Genetics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.