Topic 2: t-tests in jamovi


These are the solutions for DA Computer Lab 2.

Please make sure to go over these after the lab session, and finish off any questions you may have missed during the lab.


Preparations: Red Crab Data

No answer required.

In this question we prepared to assess data on red crabs from Christmas Island (Green, 1997).

The red crab data set contains recorded values for variables including:

  • CW: Carapace width (mm)
  • LEG: Leg length (mm)
  • CLAW: Length of dominant claw (mm)
  • WEIGHT: Weight (grams)
  • SEX: The sex of the crab (1 = female, 2 = male)
  • OtherClaw: Length of other claw (mm)

a.

No answer required.

b.

No answer required.

1 Red Crab One Sample t-test 🌱

No answer required.

1.1

No answer required.

1.1.1

No answer required.

1.1.2

No answer required.

1.1.3

The mean difference value of 8.043 suggests that the mean claw length in the sample data is 8.043 mm larger than the reference value of 40 mm.

1.1.4

The Cohen’s \(d\) effect size was 0.370 in the jamovi output.

To compute this, we have

\[d = \dfrac{\bar{X} - \mu}{SD} = \dfrac{48.043 - 40}{21.741} \approx 0.370\]

1.1.5

If we change our reference value to 0, we observe the following:

Note that the 95% confidence interval \((45.453, 50.634)\) for the mean claw length is equivalent to the 95% confidence interval \((5.453, 10.634)\) for the difference between the mean claw length and the reference value of 40 mm.
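
As a quick check, the equivalence can be seen by subtracting the reference value of 40 from each endpoint of the interval for the mean:

\[(45.453 - 40,\; 50.634 - 40) = (5.453,\; 10.634)\]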

1.1.6

The one sample \(t\)-test assumptions are discussed below:

The histogram of the claw values is clearly non-normal, and appears slightly right-skewed.

The Normal Q-Q plot appears reasonable, but we do observe some deviations from the theoretical line at the extremities.

A Shapiro-Wilk test also indicated that the data were non-normal, with \(p < .001\).

Due to the large sample size (\(N>30\)) we can still assume the distribution of the sample mean is approximately normal, via the Central Limit Theorem (CLT). However, since the data appear slightly skewed, it may be more appropriate to conduct the non-parametric Wilcoxon Signed Rank test instead. Output for this test is included below. Note that the test has a significant result \((W = 25505)\), so we can conclude that the median crab claw length is statistically significantly different from 40 mm \((p < .001)\).

1.1.7

A one sample \(t\)-test was conducted to examine if the mean claw length of red crabs differed from 40 mm. The sample mean claw length of the red crabs was greater than 40 mm \((M = 48.04, \, SD = 21.74, \, n = 273)\). This difference was statistically significant at the \(\alpha = 0.05\) level of significance, with \(t(272) = 6.113, \, p < .001\). However, the associated effect size was small, with \(d = 0.37\). With 95% confidence, the mean claw length is between 5.45 mm and 10.63 mm larger than 40 mm.
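
As a rough check, the reported test statistic and effect size can be reproduced from the summary statistics alone. The following is a minimal Python sketch, not jamovi output, and the function name `one_sample_summary` is made up for this illustration. It uses the rounded values quoted in 1.1.4, so small rounding differences from jamovi are to be expected.

```python
from math import sqrt

def one_sample_summary(mean, sd, n, mu0):
    """Return (standard error, t statistic, Cohen's d) for a one sample t-test."""
    se = sd / sqrt(n)        # standard error of the sample mean
    t = (mean - mu0) / se    # t statistic, with n - 1 degrees of freedom
    d = (mean - mu0) / sd    # Cohen's d: mean difference in SD units
    return se, t, d

# Rounded summary statistics for CLAW, reference value 40 mm
se, t, d = one_sample_summary(mean=48.043, sd=21.741, n=273, mu0=40)
print(f"SE = {se:.3f}, t(272) = {t:.3f}, d = {d:.3f}")
```

The same function can be reused for any one sample \(t\)-test in this lab by changing the four inputs.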


2 Red Crab Paired t-test 🌱

No answer required.

2.1

2.2

For conciseness, only the Shapiro-Wilk test results are shown here. Note that we cannot reject the assumption of normality, as \(p = 0.795 > 0.05\).

2.3

A paired \(t\)-test was conducted to compare the mean claw length of the left and right claws of \(N=273\) red crabs. The sample mean difference between the claw lengths was very small (\(M = 0.003,\, SE = 0.06\)). This difference was not clinically significant, with a negligible effect size of \(d = 0.003\), nor statistically significant at the \(\alpha = 0.05\) level of significance, with \(t(272) = 0.049,\, p = 0.961\).
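
As a quick check on the reported effect size: assuming the paired samples Cohen’s \(d\) is the mean difference divided by the standard deviation of the differences, it can be recovered from the test statistic and the sample size:

\[d = \dfrac{t}{\sqrt{n}} = \dfrac{0.049}{\sqrt{273}} \approx 0.003\]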

The 95% confidence interval was \((-0.114, 0.120)\).

Note that normally the confidence interval would not be included here, since the results are not significant.

2.4

The 95% confidence interval \((-0.114, 0.120)\) suggests that we can be 95% confident that the true population mean difference in claw length between red crabs’ left and right claws lies between -0.114mm and 0.120mm; that is, intervals constructed in this way capture the true mean difference in 95% of repeated samples. Since this interval contains 0, we cannot conclude that the true population mean difference is non-zero, i.e. we cannot reject our null hypothesis that there is no difference in the mean lengths of the left and right claws of red crabs.
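
The "95%" in a confidence interval describes how often the interval procedure captures the true mean over repeated sampling. The following Python sketch illustrates this by simulation. It is a simplified illustration only: it assumes normally distributed data with a known standard deviation (so it uses a \(z\)-interval rather than the \(t\)-interval from jamovi), and the function name `coverage` and all parameter values are made up for the example.

```python
import random
from statistics import NormalDist, fmean

def coverage(true_mean=0.0, sigma=1.0, n=30, reps=2000, level=0.95, seed=1):
    """Proportion of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    half_width = z * sigma / n ** 0.5           # known-sigma interval half-width
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        # The interval (mean - hw, mean + hw) contains true_mean exactly when
        # the sample mean lies within hw of true_mean.
        if abs(fmean(sample) - true_mean) <= half_width:
            hits += 1
    return hits / reps

print(coverage())  # should be close to 0.95
```

Any single interval either contains the true mean or it does not; it is the long-run proportion across repeated samples that is close to 95%.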


3 Red Crab Two Sample t-test 🌱

No answer required.

3.1

3.2

Since the Levene’s test statistic is \(F = 26.463\), with a \(p\)-value \(< 0.001\), we reject the null hypothesis that the variances of the two groups are equal. Therefore, we should use Welch’s version of the two sample \(t\)-test.

3.3

An independent samples (Welch’s) \(t\)-test was conducted to compare the mean weight of male red crabs and female red crabs. The mean weight of the male crabs \((M = 181.965,\, SD = 145.788,\, n = 129)\) was greater than the mean weight of the female crabs \((M = 124.641,\, SD = 88.345,\, n = 144)\). The test indicated that this difference was statistically significant at the \(\alpha = 0.05\) level of significance, with \(t(206.102) = -3.874,\, p < .001\). The difference was also somewhat clinically significant, with a small effect size of \(d = 0.476\).

We are 95% confident that the true population mean weight of male crabs is between 28.151 and 86.498 grams more than the true population mean weight of female crabs.
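
The Welch test statistic, its degrees of freedom and the effect size can be approximately reproduced from the group summary statistics. The following is a minimal Python sketch with a hypothetical helper `welch_summary`. It assumes that the reported Cohen’s \(d\) divides the mean difference by the square root of the average of the two group variances; this is an assumption that happens to reproduce the value above, not a statement of how jamovi computes it. The negative signs arise because group 1 (female) minus group 2 (male) is used.

```python
from math import sqrt

def welch_summary(m1, s1, n1, m2, s2, n2):
    """Welch's t, Welch-Satterthwaite df and a Cohen's d from summary statistics."""
    v1, v2 = s1**2 / n1, s2**2 / n2    # squared standard error of each group mean
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # Assumption: standardise by the root of the average of the two variances
    d = (m1 - m2) / sqrt((s1**2 + s2**2) / 2)
    return t, df, d

# Group 1 = female, group 2 = male (SEX coding: 1 = female, 2 = male)
t, df, d = welch_summary(124.641, 88.345, 144, 181.965, 145.788, 129)
print(f"t({df:.3f}) = {t:.3f}, d = {d:.3f}")
```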

3.4

A Mann-Whitney U Test was conducted to compare the weights of male red crabs and female red crabs. The male crabs tended to weigh more than female crabs and the Mann-Whitney test indicated a statistically significant difference in the distributions, with \(U = 7391, \, n_{Male} = 129,\, n_{Female} = 144,\, p = .004.\)


4 Lionfish Mercury Data 🌳

No answer required.

In this question we assessed data on lionfish specimens from a study by Johnson et al. (2021), with recorded values for the following variables:

  • SPECNUM: Specimen number
  • SL: Standard length of fish (mm)
  • TL: Total length of fish (mm)
  • WGT: Weight of fish (grams)
  • SEX: Sex of fish (male, female)
  • THG: Total dry weight mercury concentration in fish (microgram per gram)
  • LOCATION: Location in Florida where sampled

4.1

We observe that the mean THG values are similar across SEX and LOCATION, but interestingly the maximum THG value is much larger for female lionfish in Mid Florida (3.284 micrograms per gram) than the other lionfish categories.

4.2

The histogram of the THG observed values appears clearly asymmetric, so it may not be appropriate to use a \(t\)-test to analyse this data.

4.3

No answer required.

4.3.1

Let \(\mu_{THG}\) denote the population mean THG level in lionfish.

We have

\[H_0: \mu_{THG} = 1 \text{ versus } H_1: \mu_{THG} < 1\]

Note this is a directional test.

4.3.2

The one sample \(t\)-test of the mean had a test statistic \(t(140) = -27.457\), with \(p < .001\).

The mean difference was \(-0.726\) micrograms per gram, with a 95% confidence interval of \((-\infty, -0.683)\).

Based on these results, we can conclude that the lionfish are safe to eat, as their mean total mercury concentration levels are statistically significantly less than 1 microgram per gram, at the \(\alpha = 0.05\) level of significance \((p < .001)\). With 95% confidence, the population mean THG level in lionfish is no greater than \(1-0.683 = 0.317\) micrograms per gram. This result is also highly clinically significant, with a large effect size of \(d = -2.312\).

4.3.3

The effect size provided in jamovi is \(-2.312\).

To compute this by hand, we have:

\[d = \dfrac{\bar{X} - \mu}{SD} = \dfrac{0.274 - 1}{0.314} \approx -2.312\]
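
Since the one sample test statistic is \(t = (\bar{X} - \mu)/(SD/\sqrt{n})\), the test statistic and effect size are linked by \(t = d\sqrt{n}\). As a rough check using the rounded effect size and \(n = 141\) (implied by the 140 degrees of freedom in 4.3.2):

\[t = d\sqrt{n} \approx -2.312 \times \sqrt{141} \approx -27.45\]

This agrees with the reported \(t(140) = -27.457\) up to rounding.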

4.3.4

Both the Shapiro-Wilk test results \((W = 0.407, p < .001)\) and the Normal Q-Q plot inspection suggest that the assumption of normality has been violated.

4.3.5

The appropriate non-parametric test to use in this context would be the Wilcoxon Signed Rank test. The results of this test are statistically significant at the \(\alpha = 0.05\) level of significance, with \(W= 150\), \(p < .001\), so we can conclude that the median lionfish total mercury concentration levels are statistically significantly different to 1 microgram per gram.

4.4

A two sample \(t\)-test was conducted to compare the mean THG levels of male and female lionfish, with the mean difference between groups equal to \(0.031\). Female lionfish had greater sample mean THG levels \((M = 0.290,\, SD = 0.404)\) than male lionfish \((M = 0.258,\, SD = 0.195)\).

A Student’s \(t\)-test was conducted, with equal group variances assumed (Levene’s Test \(p\)-value \(= 0.278\)). Results of the Student’s \(t\)-test suggested the difference in population mean THG levels between male and female lionfish was not statistically significant, with \(t(139) = 0.592\), \(p = 0.555\). The Cohen’s \(d\) effect size was also negligible \((d = 0.1)\), suggesting the difference was also not clinically significant.

Checking the test assumptions, we observe that the normality assumption is violated (clear non-normality exhibited in Normal Q-Q plot, and Shapiro-Wilk \(p < .001\)).

While we could rely on the CLT here, due to the large sample size, the skewed histograms of the THG levels for both the male and female lionfish suggest that the mean is not the most appropriate measure of location to use, so a \(t\)-test may not be appropriate in this context.

Since the test results are not statistically significant, we do not conduct further analyses using e.g. the Mann-Whitney U test. If you choose to do so, you will observe the test results are also not statistically significant (see output below):


5 Pea Plant Data 🌳

No answer required.

Background Information

For the experiment, each pea plant seedling was assigned to one of three groups, and then carefully sprayed:

  • C: a control group, sprayed with water
  • TA: a treatment group, sprayed with a 25mg/L solution of GA
  • TB: a treatment group, sprayed with a 50mg/L solution of GA

Figure 5.1: Pea Plant Raw Data

5.1

No answer required.

5.2

The appropriate test to use is a one-sample \(t\)-test, with \(H_0: \mu = 280\) versus \(H_1: \mu \neq 280\), where \(\mu\) denotes the true average height of the dwarf pea plant seedlings at the relevant time after planting.

A one sample \(t\)-test was conducted to determine if the mean height of the dwarf pea plant seedlings, at a specified time after planting, was different from 280 mm.

The sample mean height was greater than 280 (\(M = 296.618\) mm, \(SD = 106.030\) mm, \(n=85\)). However, the one sample \(t\)-test showed that this difference was not statistically significant at the \(\alpha = 0.05\) level of significance, with \(t(84) = 1.445\), \(p = 0.152\). The effect size was also negligible, with \(d = 0.157\), suggesting the result had no real clinical significance.
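
As a check, the test statistic and effect size can be recovered by hand from the summary statistics:

\[t = \dfrac{\bar{X} - \mu}{SD/\sqrt{n}} = \dfrac{296.618 - 280}{106.030/\sqrt{85}} \approx \dfrac{16.618}{11.501} \approx 1.445, \qquad d = \dfrac{\bar{X} - \mu}{SD} = \dfrac{16.618}{106.030} \approx 0.157\]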

Checking the assumptions of the test, we observe that the data exhibit some non-normality, but overall we do not have sufficient evidence to reject the assumption of normality, although the Shapiro-Wilk result is borderline (\(p = 0.05\)).

5.3

The initial test we would use here is an independent samples \(t\)-test. If the assumptions of the test were violated, we could instead use a Mann-Whitney U test.

For the comparison of groups C and TA, we would have \(H_0: \mu_C = \mu_{TA}\) versus \(H_1: \mu_C \neq \mu_{TA}\), where \(\mu_C\) denotes the true average height of dwarf pea plant seedlings in the control group, at the relevant time after planting, and \(\mu_{TA}\) denotes the true average height of dwarf pea plant seedlings in treatment group TA, at the relevant time after planting.

5.4

For the comparison of groups TA and TB, we would have \(H_0: \mu_{TA} = \mu_{TB}\) versus \(H_1: \mu_{TA} \neq \mu_{TB}\), where \(\mu_{TB}\) denotes the true average height of dwarf pea plant seedlings in treatment group TB, at the relevant time after planting.

5.5

A two sample \(t\) test was conducted to compare the mean seedling height of dwarf pea plants given treatments C and TA. Since the equal variances assumption is violated (\(F = 15.254\), \(p\)-value \(< .001\)), we use the Welch’s \(t\)-test test statistic of \(-10.770\), with \(df = 47.107\). This has a corresponding \(p\)-value of \(< .001\). As a result, for this test we can reject \(H_0\) at the \(0.05\) level of significance.

The assumption of normality is not violated, as shown by the large Shapiro-Wilk \(p\)-value \((p = 0.939)\) and the Normal Q-Q plot.

A two sample \(t\) test was also conducted to compare the mean seedling height of dwarf pea plants given treatments TA and TB. Since the equal variances assumption is satisfied (\(F = 1.814\), \(p\)-value \(= 0.184\)) in this instance, we use the Student’s \(t\)-test test statistic of \(0.448\), with \(df = 55\). This has a corresponding \(p\)-value of \(0.656\). As a result, for this test we cannot reject \(H_0\) at the \(0.05\) level of significance.

For this analysis, the assumption of normality was also not violated, as shown by the large Shapiro-Wilk \(p\)-value \((p = 0.530)\) and the Normal Q-Q plot.
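
The rule used above to choose between the Student’s and Welch’s versions of the test can be summarised in a small Python sketch. The function name `choose_t_test` is hypothetical, and this only encodes the decision rule followed in these solutions; it is not part of jamovi.

```python
def choose_t_test(levene_p, alpha=0.05):
    """Pick the two sample t-test variant from the Levene's test p-value."""
    if levene_p < alpha:
        return "Welch"      # equal variances rejected
    return "Student"        # no evidence against equal variances

# TA vs TB: Levene's p = 0.184
print(choose_t_test(0.184))     # Student
# C vs TA: reported only as p < .001, so 0.0009 is a stand-in value
print(choose_t_test(0.0009))    # Welch
```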


References

Green, P. T. (1997). Red crabs in rain forest on Christmas Island, Indian Ocean: activity patterns, density and biomass. Journal of Tropical Ecology, 13(1), 17-38.

Johnson, E.G., Dichiera, A., Goldberg, D., Swenarton, M. and Gelsleichter, J. (2021). Total mercury concentrations in invasive lionfish (Pterois volitans/miles) from the Atlantic coast of Florida. PLOS ONE 16(9): e0234534. https://doi.org/10.1371/journal.pone.0234534


These notes have been prepared by Rupert Kuveke and other members of the Department of Mathematical and Physical Sciences. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematical and Physical Sciences and with the Department of Environment and Genetics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.