Topic 5: Hypothesis Testing


🏡 In Topic 5 we introduced an important statistical technique - hypothesis testing. In this computer lab, we will practice different aspects of hypothesis testing.

If you have time during today’s lab, you may like to work on Quiz 6. For the final question of the quiz, refer back to Computer Lab 4 if you need to.

1 Carrying out a one-sample \(t\)-test in jamovi

💻 For this question, we will be analysing the wonions data set on White Imperial Spanish onion plants from the sm R package (Bowman and Azzalini 2021). This data set consists of 84 observations, and contains two variables of interest: Yield (in grams per plant), and Density of planting (in plants per square metre). In this question, we will start by focussing our attention on the Yield variable.

Download the file wonions.csv from the LMS, and save it in a relevant location on your PC.

Once you have done so, import the wonions.csv file in jamovi. For revision on how to do this, see Computer Lab 1.

Next, watch the following video, which will assist you in answering the rest of the questions in this computer lab.

1.1

💻 Create a table of descriptive statistics for the Yield variable, and comment on any details you find noteworthy. Make sure to include the Shapiro-Wilk test in your descriptives table. 💬

1.2

💻 According to jamovi, what type of variable is each of the following?

  • Yield 💬
  • Density 💬
  • Locality 💬

1.3

💻 Create each of the following plots for the Yield variable:

  • Histogram with density overlaid
  • Boxplot 💬
  • Q-Q plot

1.4

🏡 Referring to your histogram, Q-Q plot, and Shapiro-Wilk test results, do you have any concerns about the normality assumption? Explain.

Also remember that it is the distribution of the sample mean that is assumed to be normal, rather than the data itself. If you have any questions about this, discuss with your classmates and/or computer lab demonstrator.
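If you would like to see this idea in action outside jamovi, the following is a minimal Python sketch. It assumes a hypothetical right-skewed population (an exponential distribution, not the wonions data) and compares the skewness of the raw values with the skewness of the means of many samples of size 84. The function name `skewness` and all simulated values are illustrative assumptions only.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(1)  # fixed seed so that the run is reproducible

# Hypothetical right-skewed population: exponential with mean 1 (NOT the wonions data).
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Means of 2000 samples, each of size n = 84 (the wonions sample size).
n = 84
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(2000)
]

print(f"skewness of raw data:     {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The raw values should come out strongly skewed, while the sample means should be much closer to symmetric, which is why a moderately non-normal sample does not automatically rule out a \(t\)-test when \(n\) is reasonably large.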

🎧 Online students 💬 Enter your answer at the relevant location of the shared google doc.

1.5

🏡 Suppose that previous data has suggested that the average Yield in White Imperial Spanish onions is \(115\) grams per plant. For our sample data, we know from our descriptives table that the yield sample mean is \(119.7\) grams per plant. Therefore, we would like to test if the average Yield in White Imperial Spanish onions is actually different from \(115\) grams per plant.

Clearly define \(\mu\) and the null and alternative hypotheses for this \(t\)-test, adhering to the statistical notation introduced in Topic 5.

Hint:

\(H_0: \mu = ...\) versus \(H_1: \mu \text{ } ... \text{ } ...,\)

where:

  • \(\mu\) denotes the population … .

🎧 Online students 💬 Enter your answer at the relevant location of the shared google doc.

1.6

🏡 Using the sample mean and sample standard deviation values provided in your descriptives table, along with other relevant information, calculate the test statistic for the \(t\)-test.

Hint: Take a look at Section 3.1 of the Topic 5 readings for details on how to calculate this test statistic.
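If you want to check your hand calculation, here is a minimal Python sketch. It assumes the usual one-sample formula \(t = (\bar{x} - \mu_0) / (s / \sqrt{n})\) with \(n - 1\) degrees of freedom. The function name `one_sample_t` and the summary values are hypothetical, and are deliberately not the wonions results.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    standard_error = sample_sd / math.sqrt(n)  # estimated SD of the sample mean
    t = (sample_mean - mu0) / standard_error
    return t, n - 1                            # degrees of freedom are n - 1

# Hypothetical summary values (NOT the wonions data).
t, df = one_sample_t(sample_mean=52.0, sample_sd=8.0, n=16, mu0=50.0)
print(t, df)  # 1.0 15
```

Replace the hypothetical values with those from your own descriptives table to compare against your answer.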

🎧 Online students 💬 Enter your answer at the relevant location of the shared google doc.

1.7

💻 Carry out the one-sample \(t\)-test in jamovi. 💬

1.8

🏡 What is the test statistic value? Does this value match the value you calculated by hand in 1.6? 💬

1.9

🏡 What are the degrees of freedom for this test? How are they calculated? 💬

1.10

🏡 What is the \(p\)-value, and what does this number represent? 💬
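One way to build intuition here is by simulation. The sketch below is an illustration under stated assumptions, not how jamovi computes its result: it repeatedly generates normal data for which \(H_0\) is true, computes the \(t\) statistic each time, and reports the proportion of simulated statistics at least as extreme as the observed one. The function names, the observed value \(2.131\), and the sample size \(16\) are all hypothetical.

```python
import math
import random

def t_statistic(sample, mu0):
    """One-sample t statistic for H0: mu = mu0."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu0) / (sd / math.sqrt(n))

def simulated_two_sided_p(t_obs, n, reps=20000, seed=0):
    """Proportion of simulated |t| values that are >= |t_obs| when H0 is true."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        # Data generated with H0 true: normal with mean mu0 = 0 and SD 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) >= abs(t_obs):
            extreme += 1
    return extreme / reps

# Hypothetical observed statistic: 2.131 is the tabulated 5% two-sided
# critical value for 15 degrees of freedom, so the result should be near 0.05.
print(simulated_two_sided_p(t_obs=2.131, n=16))
```

The simulated proportion approximates the two-sided \(p\)-value, and it changes slightly with the seed and the number of repetitions.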

1.11

🏡 What is the \(95\%\) confidence interval for \(\mu\)? What does the \(95\%\) value tell us about the level of significance \(\alpha\) used in this \(t\)-test? 💬

1.12

🏡 Based on the \(p\)-value you have obtained, what decision should we make for our hypothesis test? Make sure to explain your reasoning clearly. 💬

1.13

🏡 Based on the \(95\%\) confidence interval you have obtained, what decision should we make for our hypothesis test? Does this decision align with your conclusion in 1.12? 💬
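The link between a two-sided test at level \(\alpha = 0.05\) and a \(95\%\) confidence interval can be sketched in a few lines of Python. This is an illustration with hypothetical summary values, not the wonions results. The critical value \(2.131\) is the tabulated 97.5th percentile of the \(t\) distribution with 15 degrees of freedom, and the function name `confidence_interval` is an assumption made for this example.

```python
import math

def confidence_interval(sample_mean, sample_sd, n, t_crit):
    """Two-sided CI for mu: sample mean +/- t_crit * standard error."""
    margin = t_crit * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical values: n = 16, mean 52, SD 8, hypothesised mean 50.
lower, upper = confidence_interval(52.0, 8.0, 16, t_crit=2.131)
mu0 = 50.0

# Reject H0 at the 5% level exactly when mu0 falls outside the 95% interval.
reject_h0 = not (lower <= mu0 <= upper)
print(f"95% CI: ({lower:.3f}, {upper:.3f}); reject H0: {reject_h0}")
```

In this hypothetical example the interval contains \(50\), so \(H_0\) is not rejected. This agrees with the corresponding test statistic \(t = 1.0\) being smaller in absolute value than \(2.131\).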

1.14

🏡 Interpret the \(95\%\) confidence interval you have obtained, and explain in layman’s terms what the interval tells us. 💬

1.15

🏡 Write a short conclusion summarising the test and findings. 💬

2 Carrying out a one-sample \(t\)-test in jamovi (Density variable)

💻 In this question, we are interested in the planting Density variable from the wonions data set, which is the number of plants per m\(^2\).

Suppose that previous data has suggested that the average Density in White Imperial Spanish onions is 80 plants per m\(^2\). Further suppose we would like to test if the average Density is actually less than 80 plants per m\(^2\).

Repeat Question 1, but this time with respect to the Density variable.
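Because the alternative hypothesis here is one-sided ("less than"), the \(p\)-value is the lower-tail probability \(P(T \le t)\) rather than a two-tailed probability. The following is a minimal Python sketch that approximates this probability by numerically integrating the \(t\) density. The function names and the example values are hypothetical, and jamovi uses its own routines rather than this approach.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

# Hypothetical example: t = -1.812 with 10 degrees of freedom
# (1.812 is the tabulated 5% one-sided critical value for 10 df).
lower_tail_p = t_cdf(-1.812, 10)             # p-value for H1: mu < mu0
two_sided_p = 2 * min(lower_tail_p, 1 - lower_tail_p)
print(f"lower-tailed p: {lower_tail_p:.3f}, two-sided p: {two_sided_p:.3f}")
```

When the test statistic is negative, the lower-tailed \(p\)-value is half of the two-sided one. When it is positive, the lower-tailed \(p\)-value is greater than \(0.5\).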

3 Assessing Normal Q-Q plots

🏡 We will encounter examples of both “good” and “bad” Normal Q-Q plots. While some plots are relatively easy to assess, sometimes it can be difficult to distinguish between an acceptable Normal Q-Q plot, and one that shows a violation of the assumption of normality.
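To see what a Normal Q-Q plot actually displays, here is a minimal Python sketch that computes the plotted points: the sorted observations paired with the matching quantiles of a standard normal distribution. The plotting positions \((i - 0.5)/n\) are one common convention and are an assumption here, so jamovi's exact choice (and its standardisation of the observations) may differ. The function name `normal_qq_points` and the sample values are hypothetical.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a Normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: one common convention among several.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical sample (NOT the wonions data).
for q, x in normal_qq_points([4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.5]):
    print(f"{q:6.2f}  {x:4.1f}")
```

If the data come from a normal distribution, these points fall roughly along a straight line. Systematic curvature, or points peeling away at the ends, suggests skewness or heavy tails.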

Assess the Normal Q-Q plots below, and try to identify which (if any) correspond to data sampled from a normal distribution. Give reasons for your decision for each plot.

🎧 Online students 💬 Enter your answer on the shared google doc.


Well done, that’s everything for today! If you still have time, you may like to have a go at Quiz 6, which is based on the Topic 6 readings.

Before you finish up, remember to save your work (e.g. your jamovi and Word files) somewhere safe (e.g. OneDrive) so that you can access it at a later time.


References

Bowman, A. W., and A. Azzalini. 2021. R Package sm: Nonparametric Smoothing Methods (Version 2.2-5.7). University of Glasgow, UK; Università di Padova, Italia. http://www.stats.gla.ac.uk/~adrian/sm/.


These notes have been prepared by Amanda Shaker and Rupert Kuveke. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematical and Physical Sciences and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.