Topic 11: Statistical Power and Sample Size Calculation


Congratulations, you have reached the final week! Topic 11, our final topic, focuses on determining the appropriate sample size for a statistical study. In this computer lab, we will cover how to use the statistical software G*Power to carry out sample size calculations.


1 G*Power

In this computer lab, we won’t be using R or jamovi for our analyses. Instead, we will use G*Power.

G*Power is a freely available software program (Faul et al. 2007; see also Faul et al. 2009) that can be used for statistical power analyses for many different types of statistical tests.

If you are completing this computer lab on campus, G*Power should already be installed on the computer lab computers. If you are completing this computer lab on a personal computer, the STM1001 LMS will contain details on how to access and download G*Power. As a quick guide, G*Power can be installed on a Windows-based computer via the following steps:

  • Step 1: Go to the G*Power website.
  • Step 2: Scroll down to the “Download” heading, and read the terms of use.
  • Step 3: Click on the appropriate Download link.

1.1 Overview

Let’s take a look at how to use G*Power for our sample size calculations.

If we open up G*Power, we will see the interface shown below in Figure 1.1. This looks a bit different to the RStudio interface we have been using throughout the semester, but don’t worry, we will cover all the necessary steps you will need to take in order to complete your G*Power sample size calculations.

Figure 1.1: G*Power Start Screen.

G*Power allows for five different approaches for carrying out power analyses. For this computer lab, we will focus solely on a priori analyses. An a priori analysis is normally done at the beginning of a study, as part of the study design. In other words, these are analyses where we have a specified \(\alpha\) value, power value, and effect size, and would like to determine an appropriate sample size.

If you navigate to the Type of power analysis section in G*Power (highlighted in green in Figure 1.2 below), you will see that the first option on the list is A priori: Compute required sample size - given \(\alpha\), power, and effect size - this is the option we want!

Figure 1.2: G*Power Types of Power Analysis

1.2 One Sample \(t\)-test

In section 2.0.1 of Topic 11, we considered the design of a study to determine whether the average cholesterol level of patients from a particular population was different from 5.0 mmol/L. As part of the design, a sample size calculation for a one-sample \(t\)-test was conducted, with the required sample size of the study found to be \(n=12\).

1.2.1

Let’s now use G*Power to carry out this calculation - just follow the guide below.

For this test, we have the following details:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.05\).
  • The null hypothesis value is \(5\).
  • A mean difference of at least \(0.5\) is considered meaningful.
  • The estimated standard deviation for the population is \(0.55\).
  • We would like a sample size that ensures a power of at least \(0.8\).

First, we need to select the appropriate Statistical test - namely, Means: Difference from constant (one sample case) as highlighted below, in Figure 1.3.

Figure 1.3: One sample \(t\)-test G*Power setup.

Next, we need to determine the effect size. To do so, we first need to input our values.

Take a look at Figure 1.4.

  • First, we click the Determine => button (highlighted in pink). This opens up the calculations box on the right-hand side.
  • Then we fill in the appropriate numbers in the sections highlighted in green (note that we choose Tails = Two, since we are carrying out a two-sided test).
  • Once these are completed, click the Calculate and transfer to main window option (highlighted in blue), to calculate the appropriate effect size, and transfer this value to the main panel.

Figure 1.4: One sample \(t\)-test G*Power calculations.

Notice in Figure 1.5 that the effect size values (highlighted in green) have now been automatically filled in.

Figure 1.5: One sample \(t\)-test G*Power calculations.

Finally, click the Calculate button (highlighted in orange in Figure 1.5) to perform the sample size calculation. This should lead you to Figure 1.6 below.

Figure 1.6: One sample \(t\)-test G*Power output.

Notice in the Output Parameters section that we now have our required sample size value, namely \(n = 12\).
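As a rough cross-check of this result, the calculation can be sketched in Python using only the standard library. The effect size is the meaningful difference divided by the standard deviation, \(d = 0.5/0.55 \approx 0.91\). The formula \(n \approx \left((z_{1-\alpha/2} + z_{\text{power}})/d\right)^2\) used below is a normal approximation, and is an assumption of this sketch rather than the method G*Power uses (G*Power works with the \(t\)-distribution). The function name `approx_n_one_sample` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_n_one_sample(diff, sd, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-tailed one-sample test.

    This is an approximation only; it is not G*Power's exact calculation.
    """
    d = diff / sd  # standardised effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = NormalDist().inv_cdf(power)
    n_raw = ((z_alpha + z_power) / d) ** 2
    return d, ceil(n_raw)  # round up to a whole number of participants


# Values from 1.2.1: difference 0.5, standard deviation 0.55
d, n = approx_n_one_sample(diff=0.5, sd=0.55)
print(f"effect size d = {d:.3f}")
print(f"approximate n = {n}")
```

The approximation gives \(n = 10\), a little below the \(n = 12\) reported by G*Power. This is expected, since the normal approximation ignores the extra uncertainty that comes from estimating the standard deviation.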

1.2.2

Let’s now repeat our calculations from 1.2.1, but this time using the new values presented below:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.1\).
  • The null hypothesis value is \(5\).
  • A mean difference of at least \(0.25\) is considered meaningful.
  • The estimated standard deviation for the population is \(0.45\).
  • We would like a sample size that ensures a power of at least \(0.8\).

Using these values, what is the required sample size now?

1.2.3

Suppose that a sample size of at least 30 is required for your design. What parameters could be changed to achieve this? See if you can meet this requirement using G*Power.

To start, you could compare your sample size results for 1.2.1 and 1.2.2, and consider how the different input parameters changed your results.

Note that there are several ways in which this sample size could be attained.

Hint: You could try changing just one parameter at a time, from the value specified in 1.2.1 to the value specified in 1.2.2, and note the effect this has on the sample size result. For example, what was the effect of increasing your \(\alpha\) value?
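The one-parameter-at-a-time idea in the hint can also be sketched in Python. The code below starts from the 1.2.1 values and swaps in one 1.2.2 value at a time. It uses a normal approximation to the sample size (an assumption of this sketch, not G*Power's \(t\)-based calculation), and the names `approx_n`, `baseline` and `changes` are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_n(diff, sd, alpha, power=0.8):
    """Normal-approximation sample size for a two-tailed one-sample test."""
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil((z_sum * sd / diff) ** 2)


baseline = {"diff": 0.5, "sd": 0.55, "alpha": 0.05}  # values from 1.2.1
changes = {"diff": 0.25, "sd": 0.45, "alpha": 0.1}   # values from 1.2.2

results = {"baseline": approx_n(**baseline)}
for name, new_value in changes.items():
    settings = {**baseline, name: new_value}  # change one parameter only
    results[name] = approx_n(**settings)

for label, n in results.items():
    print(f"{label:>8}: approximate n = {n}")
```

Running this alongside G*Power lets us check whether each change moves the sample size in the direction we expect. The exact values will differ slightly from the G*Power output, since G*Power uses the \(t\)-distribution.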

1.3 Independent samples \(t\)-test

In section 2.0.2 of Topic 11, we considered the design of a study to determine whether the average cholesterol level was significantly different between two different groups of patients (Group A and Group B). As part of the design, a sample size calculation for an independent samples \(t\)-test was conducted, with the required total sample size of the study found to be \(n=108\).

1.3.1

Let’s now use G*Power to carry out this calculation - just follow the guide below.

For this test, we have the following details:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.05\).
  • The Group A mean is \(5\).
  • A difference in means between groups of at least \(0.3\) is considered meaningful.
  • The estimated standard deviation for both groups is \(0.55\).
  • We would like a sample size that ensures a power of at least \(0.8\).

First, we need to select the appropriate Statistical test - namely, Means: Difference between two independent means (two groups) as highlighted below, in Figure 1.7.

Figure 1.7: Independent samples \(t\)-test G*Power setup.

Then, we need to input our values.

Take a look at Figure 1.8.

  • First, we click the Determine => button (highlighted in pink). This opens up the calculations box on the right-hand side (just as we did in 1.2).
  • Then we fill in the appropriate numbers in the sections highlighted in green (note that we choose Tails = Two, since we are carrying out a two-sided test).
  • Once these are completed, click the Calculate and transfer to main window option (highlighted in blue), to calculate the appropriate effect size, and transfer this value to the main panel. Note that this is why the Effect size d box in the main panel is filled in.

Note: We have set the Mean group 2 value to be 5.3, as this is 0.3 greater than the Mean group 1 value.

Figure 1.8: Independent samples \(t\)-test G*Power calculations.

Finally, click the Calculate button (highlighted in orange in Figure 1.8) to perform the sample size calculation. This should lead you to Figure 1.9 below.

Figure 1.9: Independent samples \(t\)-test G*Power output.

Notice in the Output Parameters section that we now have our required sample size values, namely \(n_1 = 54\), and \(n_2 = 54\), for a total sample size of \(108\).
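As a rough cross-check, the two-group calculation can be sketched in Python. With equal standard deviations, the effect size is \(d = (5.3 - 5)/0.55 \approx 0.55\). The per-group formula \(n \approx 2\left((z_{1-\alpha/2} + z_{\text{power}})/d\right)^2\) is a normal approximation assuming equal group sizes; it is an assumption of this sketch, not G*Power's \(t\)-based method, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_n_per_group(mean_1, mean_2, sd, alpha=0.05, power=0.8):
    """Normal-approximation per-group sample size, two-tailed, equal groups."""
    d = abs(mean_2 - mean_1) / sd  # effect size with a common standard deviation
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return d, ceil(2 * (z_sum / d) ** 2)


# Values from 1.3.1: group means 5 and 5.3, common standard deviation 0.55
d, n_per_group = approx_n_per_group(mean_1=5.0, mean_2=5.3, sd=0.55)
print(f"effect size d = {d:.3f}")
print(f"approximate n per group = {n_per_group}, total = {2 * n_per_group}")
```

The approximation gives 53 per group (106 in total), slightly below the \(n_1 = n_2 = 54\) from G*Power, as we would expect when the \(t\)-distribution is replaced with the normal distribution.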

1.3.2

Let’s now repeat our calculations from 1.3.1, but this time using the new values presented below:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.1\).
  • The Group A mean is \(5\).
  • A difference in means between groups of at least \(0.4\) is considered meaningful.
  • The estimated standard deviation for both groups is \(0.35\).
  • We would like a sample size that ensures a power of at least \(0.8\).

Using these values, what is the required sample size now?

1.3.3

Suppose that, due to resource constraints, the total sample size for the study can be at most \(80\). What parameters could be changed to achieve this? See if you can meet this requirement using G*Power.

Hint: Try following the details provided in 1.2.3.

1.4 Paired \(t\)-test

In section 2.0.3 of Topic 11, we considered the design of a study to determine whether the average difference in before and after weights of a cohort of anorexia patients was statistically significant. As part of the design, a sample size calculation for a paired \(t\)-test was conducted, with the required sample size of the study found to be \(n=50\).

1.4.1

Let’s now use G*Power to carry out this calculation - just follow the guide below. By now, you should be getting familiar with the process.

For this test, we have the following details:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.05\).
  • The estimated mean starting weight is \(36.6\) kg.
  • An average change of at least \(1.1\) kg between the before and after weights is considered meaningful.
  • The estimated standard deviations for the before and after weights are \(2.1\) kg and \(3.5\) kg, respectively.
  • The estimated correlation between the before and after weights is \(0.63\).
  • We would like a sample size that ensures a power of at least \(0.8\).

First, we need to select the appropriate Statistical test - namely, Means: Difference between two dependent means (matched pairs).

Next, we need to fill in the appropriate values, as shown in Figure 1.10 below.

Figure 1.10: Paired \(t\)-test G*Power calculations.

Once we are happy with our inputs, and click the Calculate button, we obtain our required sample size value of \(n = 50\), as shown below in Figure 1.11.

Figure 1.11: Paired \(t\)-test G*Power output.
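For the paired case, the effect size depends on the standard deviation of the differences, which can be obtained from the two standard deviations and the correlation: \(\sigma_{\text{diff}} = \sqrt{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}\). The Python sketch below computes this effect size and then a normal-approximation sample size. The approximation is an assumption of this sketch and not G*Power's \(t\)-based method, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def paired_effect_size(mean_change, sd_before, sd_after, corr):
    """Effect size for paired data: mean change over the SD of the differences."""
    sd_diff = sqrt(sd_before**2 + sd_after**2 - 2 * corr * sd_before * sd_after)
    return mean_change / sd_diff


# Values from 1.4.1
dz = paired_effect_size(mean_change=1.1, sd_before=2.1, sd_after=3.5, corr=0.63)

# Normal approximation for a two-tailed test with alpha = 0.05 and power = 0.8
z_sum = NormalDist().inv_cdf(1 - 0.05 / 2) + NormalDist().inv_cdf(0.8)
n = ceil((z_sum / dz) ** 2)

print(f"effect size dz = {dz:.3f}")
print(f"approximate n = {n}")
```

The effect size is roughly \(0.40\), and the approximation gives \(n = 48\) pairs, close to (but slightly below) the \(n = 50\) obtained from G*Power.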

1.4.2

Let’s now repeat our calculations from 1.4.1, but this time using the new values presented below:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.1\).
  • The estimated mean starting weight is \(38.6\) kg.
  • An average change of at least \(1.5\) kg between the before and after weights is considered meaningful.
  • The estimated standard deviations for the before and after weights are \(1.8\) kg and \(3.2\) kg, respectively.
  • The estimated correlation between the before and after weights is \(0.60\).
  • We would like a sample size that ensures a power of at least \(0.8\).

Using these values, what is the required sample size now?

1.4.3

Suppose that, due to resource constraints, the total sample size for the study can be at most \(40\). What parameters could be changed to achieve this? See if you can meet this requirement using G*Power.

1.5 Two-sample test of proportions

In section 2.0.4 of Topic 11, we considered the design of a study to determine whether there was a significant difference in the proportion of US adults who say they use Facebook, when grouping US adults into one of two groups - those aged 18-29, and those aged 30-49. As part of the design, a sample size calculation for a two-sample test of proportions was conducted, with the required total sample size of the study found to be \(n=588\).

1.5.1

Let’s now use G*Power to carry out this calculation - just follow the guide below. By now this process should hopefully feel much easier than it did at the start of the computer lab.

For this test, we have the following details:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.05\).
  • The estimated percentage of interest under \(H_0\) is 70%.
  • An observed difference of at least 10% is considered meaningful.
  • We would like a sample size that ensures a power of at least \(0.8\).

The process for this final test is a little different to the preceding tests.

Firstly, we are no longer conducting a \(t\)-test, but rather a \(z\)-test. Therefore, we need to select the z tests option in the Test family section, and then select the Proportions: Difference between two independent proportions option in the Statistical test section.

Check Figure 1.12 below if you are not sure how to proceed (the z tests option is highlighted in green).

Figure 1.12: Two-sample test of proportions G*Power setup.

Then, we need to fill out the appropriate values (you can keep the Allocation ratio N2/N1 box as is). Once you are happy with your inputs, click the Calculate button.

Double-check that you obtain the sample size values \(n_1 = n_2 = 294\), for a total sample size of \(n=588\).
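This value can be cross-checked with a short Python sketch. Two assumptions are made here that are not stated above: the two proportions are taken to be \(0.7\) and \(0.8\) (that is, 70% plus the 10% difference), and the sample size comes from a standard textbook normal-approximation formula that uses the pooled proportion under \(H_0\). Whether this matches G*Power's internal calculation exactly is also an assumption, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.8):
    """Per-group sample size for a two-tailed z-test of two proportions.

    Uses the pooled proportion under H0 and assumes equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion when group sizes are equal
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(((null_term + alt_term) / (p2 - p1)) ** 2)


# Assumed proportions: 0.7 in one group and 0.8 in the other
n_per_group = n_per_group_two_proportions(p1=0.7, p2=0.8)
print(f"n per group = {n_per_group}, total = {2 * n_per_group}")
```

Under these assumptions, the sketch gives 294 per group and 588 in total, in agreement with the values above.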

1.5.2

Let’s now repeat our calculations from 1.5.1, but this time using the new values presented below:

  • The hypothesis test is two-tailed.
  • The significance level is \(\alpha =0.1\).
  • The estimated percentage of interest under \(H_0\) is 65%.
  • An observed difference of at least 14% is considered meaningful.
  • We would like a sample size that ensures a power of at least \(0.8\).

Using these values, what is the required sample size now?

1.5.3

Suppose that the total sample size for the study can be at most \(500\). What parameters could be changed to achieve this? See if you can meet this requirement using G*Power.

2 Practice

Now that you are familiar with how to conduct sample size calculations in G*Power for different tests, redo the sample size calculations covered in questions 1.2.1, 1.3.1, 1.4.1 and 1.5.1, but this time for a desired power of at least \(0.9\).


References

Faul, F., E. Erdfelder, A. Buchner, and A. Lang. 2009. “Statistical Power Analyses Using G*Power 3.1: Tests for Correlation and Regression Analyses.” Behavior Research Methods 41 (4): 1149–60.
Faul, F., E. Erdfelder, A. Lang, and A. Buchner. 2007. “G*Power 3: A Flexible Statistical Power Analysis Program for the Social, Behavioral, and Biomedical Sciences.” Behavior Research Methods 39 (2): 175–91.


These notes have been prepared by Rupert Kuveke. The copyright for the material in these notes resides with the author named above, with the Department of Mathematics and Statistics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.