Introduction

Overview: In this lab exercise, you will explore the concept of statistical power and perform sample size calculations.

Objectives: At the end of this lab you will be able to:

Part 0: Download, import dataset, and organize files

# Here, ensure to change the number "0" to a number of  
# your choosing before you proceed with this Lab.
set.seed(101)

Note here the importance of the above set.seed() function. For example, if we run the sample() function in R again and again (without the set.seed() function), it gives different set of samples each time. What if we require the same sample each time? For that we use the set.seed() function. The set.seed() function always returns the same pseudo-random sequence.

Now, we will load a dataset from the BMDpopulation.csv file into R, using the read.csv() function. Recall that the command

# Not evaluated
mydata <- read.csv("datafile.csv")

loads data from the file “datafile.csv” and stores it in an object called “mydata”. The code above is not executed by R because the option eval = FALSE is used (“eval” for “evaluate”).

In the code chunk below, load the dataset from the BMDpopulation.csv file and store it in an object called BMDpopulation.

# Enter code below
BMDpopulation <- read.csv("BMDpopulation.csv")

Notice that the dataset is now held in the object named BMDpopulation. We will convert the BMDpopulation object into the format tibble (like “table”) that we will work with, and print that object, using code similar to the following:

# Not evaluated
mydata <- as_tibble(mydata)
print(mydata)

In the code chunk below, convert the BMDpopulation object to a tibble and print it.

# Enter code below
BMDpopulation <- as_tibble(BMDpopulation)
print(BMDpopulation)
## # A tibble: 14,646 × 3
##       id sex    bone.mineral.density
##    <int> <chr>                 <dbl>
##  1     1 male                  1.04 
##  2     2 female                0.664
##  3     3 female                1.02 
##  4     4 male                  1.10 
##  5     5 male                  1.35 
##  6     6 female                0.869
##  7     7 female                1.08 
##  8     8 female                0.566
##  9     9 male                  1.01 
## 10    10 male                  0.891
## # ℹ 14,636 more rows

Recall that the inspect() command prints a summary of a data object:

# Not evaluated
inspect(mydata)

In the code chunk below, use the inspect() command to summarize the BMDpopulation dataset.

# Enter code below
inspect(BMDpopulation)
## 
## categorical variables:  
##   name     class levels     n missing
## 1  sex character      2 14646       0
##                                    distribution
## 1 female (51.4%), male (48.6%)                 
## 
## quantitative variables:  
##                   name   class   min       Q1   median        Q3       max
## 1                   id integer 1.000 3662.250 7323.500 10984.750 14646.000
## 2 bone.mineral.density numeric 0.274    0.835    0.952     1.064     1.983
##           mean           sd     n missing
## 1 7323.5000000 4228.0803564 14646       0
## 2    0.9505688    0.1808229 14646       0

You will note that the BMDpopulation dataset contains information about sex and bone.mineral.density for an entire population of 14,646 people.

Part 1: Find the true parameters of the population

In this lab, we are interested in the difference in bone mineral density between men and women. Since you have information on the entire population, you can find out the true difference in mean bone mineral density by simply finding the average for men and the average for women (and subtracting). Note that this will give us the values of the population parameters:

\(\mu_M\) = true mean bone mineral density for males

\(\mu_F\) = true mean bone mineral density for females

Now we will look at summary statistics such as mean and standard deviation for each sex group. The favstats() command provides several statistical summaries, and can be used to compare groups.

For example,

# Not evaluated
favstats(height ~ race, data = mydata)

produces the summaries for height but does so separately for Whites and non-Whites.

In the code chunck below, use the favstats() command to summarize bone.mineral.density for each level of sex.

# Enter code below
favstats(bone.mineral.density ~ sex, data = BMDpopulation)
##      sex   min      Q1 median    Q3   max      mean        sd    n missing
## 1 female 0.274 0.77975 0.8975 1.005 1.707 0.8931356 0.1720729 7532       0
## 2   male 0.379 0.90000 1.0060 1.115 1.983 1.0113767 0.1696461 7114       0

STOP! Answer Questions 1–2 now.

Part 2: Calculate required sample size

We now know that there really is a difference in the population – men have higher average bone mineral density than women. So, if we take a sample from the population and test the hypotheses:

\(H_0\): \(\mu_M = \mu_F\) (means are equal)

\(H_a\): \(\mu_M \ne \mu_F\) (means are unequal)

we should reject the null. However, we might not – and that would be a Type II error.

We can use R to calculate how many people we should sample to have a certain level of power to detect a significant difference in means (e.g., 80% power to reject \(H_0\)). We will have to specify the four factors that affect sample size:

  1. Desired power

  2. Variability (standard deviation) of the data

  3. True difference in means

  4. Chosen Type I error rate

Using the power.t.test() function, we will be able to determine the sample size needed to compare the means of the two independent populations of men and women.

For example,

# Not evaluated
power.t.test(power=0.70, sd=2, delta = 1.5, sig.level=0.05 )

produces the sample size per group (to be rounded up) for a desired power (power) of 70%, with standard deviation (sd) = 2, difference in the true population means (delta) = 1.5, and a type I error rate (sig.level) = 0.05 (or 5%), usually the default.

In the code chunk below, with a type I error rate of 0.05, variability (standard deviation) of the data equal 0.17 (from question 2), and difference in true populations means equal 0.118 (from question 1), determine the sample size per sex group necessary for 50%, 80%, and 95% power.

# Enter code below
power.t.test(power = 0.50, sd = 0.17, delta = 0.118, sig.level = 0.05)
## 
##      Two-sample t test power calculation 
## 
##               n = 16.95393
##           delta = 0.118
##              sd = 0.17
##       sig.level = 0.05
##           power = 0.5
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
power.t.test(power = 0.80, sd = 0.17, delta = 0.118, sig.level = 0.05)
## 
##      Two-sample t test power calculation 
## 
##               n = 33.56947
##           delta = 0.118
##              sd = 0.17
##       sig.level = 0.05
##           power = 0.8
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
power.t.test(power = 0.95, sd = 0.17, delta = 0.118, sig.level = 0.05)
## 
##      Two-sample t test power calculation 
## 
##               n = 54.92185
##           delta = 0.118
##              sd = 0.17
##       sig.level = 0.05
##           power = 0.95
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

STOP! Answer Questions 3–5 now.

Part 3: Draw a sample from the population and perform the test

Now that we know what sample size is needed for each level of power, let’s draw samples and see what happens.

We will start with the sample size for 50% power, which you should have found was 17 people per sex group (34 total).

We can create a sample and find the summary statistics in the following lines of code

For example,

# Not evaluated
mydata.white.70 <- mydata %>% filter(race == "white") %>% sample(size=20)
mydata.nonwhite.70 <- mydata %>% filter(race == "non-white") %>% sample(size=20)

mydata.70.power <- rbind(mydata.white.70, mydata.nonwhite.50)

favstats(height ~ race, data = mydata.70.power)

produces statistical summaries of height by race in a sample of size 40 (20 people per race group, mydata.70.power) for 70% power from mydata.

Now, in a few steps, for the 50% power from BMDpopulation, create a sample of size 34 ( 17 people per sex group ) and store it under the name BMDpopulation.50.power. Then, using the favstats() commamnd, find the mean of bone.mineral.density by sex in that sample (BMDpopulation.50.power).

# Enter code below
BMDpopulation.female.50 <- BMDpopulation %>% filter(sex == "female") %>% sample(size=17)
BMDpopulation.male.50 <- BMDpopulation %>% filter(sex == "male") %>% sample(size=17)

BMDpopulation.50.power <- rbind(BMDpopulation.female.50, BMDpopulation.male.50)

favstats(bone.mineral.density ~ sex, data = BMDpopulation.50.power)
##      sex   min    Q1 median    Q3   max      mean        sd  n missing
## 1 female 0.549 0.734  0.899 1.002 1.153 0.8601176 0.1772216 17       0
## 2   male 0.724 0.950  1.020 1.123 1.355 1.0188235 0.1585699 17       0

To test if there is a difference in bone mineral density between men and women in the sample for the 50% power, we will use a two-sample t-test.

To perform the test, we will use the t.test() function with the var.equal = TRUE argument specified. This will perform the version of the two-sample t-test that assumes equal variances in the two groups.

For example,

# Not evaluated
t.test(height ~ race, data = mydata.70.power, var.equal = TRUE)

produces t-test (with equal variance) results for comparing mean height between Whites and non-Whites in the dataset for the 70% power (i.e., mydata.70.power) .

Using the obtained dataset for the 50% power (BMDpopulation.50.power), perform an equal-variance t-test to compare the means of bone.mineral.density in the groups defined by sex.

# Enter code below
t.test(bone.mineral.density ~ sex, data = BMDpopulation.50.power, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  bone.mineral.density by sex
## t = -2.7517, df = 32, p-value = 0.009682
## alternative hypothesis: true difference in means between group female and group male is not equal to 0
## 95 percent confidence interval:
##  -0.27618912 -0.04122264
## sample estimates:
## mean in group female   mean in group male 
##            0.8601176            1.0188235

STOP! Answer Question 6 now.

Now repeat the above experiment – but with a sample size of 55 people per sex group, which you should have found was the sample size needed for 95% power. Follow the steps above, but use 55 in place of 17 for the number in each group. In other words, create a dataset for 95% power (namely BMDpopulation.95.power) and perform an equal-variance t-test to compare the means of bone.mineral.density in the groups defined by sex.

# Enter code below
BMDpopulation.female.95 <- BMDpopulation %>% filter(sex == "female") %>% sample(size=55)
BMDpopulation.male.95 <- BMDpopulation %>% filter(sex == "male") %>% sample(size=55)

BMDpopulation.95.power <- rbind(BMDpopulation.female.95, BMDpopulation.male.95)

t.test(bone.mineral.density ~ sex, data = BMDpopulation.95.power, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  bone.mineral.density by sex
## t = -2.6562, df = 108, p-value = 0.009099
## alternative hypothesis: true difference in means between group female and group male is not equal to 0
## 95 percent confidence interval:
##  -0.15109702 -0.02195753
## sample estimates:
## mean in group female   mean in group male 
##            0.9203091            1.0068364
favstats(bone.mineral.density ~ sex, data = BMDpopulation.95.power)
##      sex   min    Q1 median     Q3   max      mean        sd  n missing
## 1 female 0.590 0.817  0.924 1.0230 1.438 0.9203091 0.1707312 55       0
## 2   male 0.555 0.920  1.018 1.1035 1.414 1.0068364 0.1709206 55       0

STOP! Answer Question 7 now.

For each single t-test, it’s not possible to “see” the power of the test – but if we combine the results from repeated tests, we can get an idea of how often we rejected the null.

We will use the do() function like we did in Lab 3 (Sampling) to repeat the test many times.

For example,

# Not evaluated
test.results.70.power <- do(5) *
    t.test(height ~ race,
           data = mydata %>% group_by(race) %>% slice_sample(n = 20),
           var.equal = TRUE)
test.results.70.power <- as_tibble(test.results.70.power)
print(test.results.70.power)

performs 5 equal-variance t-tests of height versus race in mydata with a randomly selected sample of 20 people per race category, stores the results in the object/dataset test.results.70.power. Then, converts the object test.results.70.power to a tibble and prints it.

There are two new commands here: group_by() and slice_sample(). You will not need to use these outside of this lab.

  • group_by(variable) indicates that whatever we do to the dataset afterward we do to observations in each group defined by the variable separately.

  • slice_sample(n = ...) is the same as sample(size = ...) but it works with group_by().

The combination of group_by(race) and slice_sample(n = 10) above means that we sample 10 people from the "white" category and 10 from the "non-white" category.

In the code chunk below, perform 1000 equal-variance t-tests of bone.mineral.density versus sex in samples from the BMDpopulation dataset with the per-group sample size you computed to yield 50% power (i.e., n = 17), and store the test results in an object called test.results.50.power. Then, convert the object/dataset (test.results.50.power) to a tibble and print it.

Note: if running 1000 tests takes a long time, start with 100 tests until you are sure your code is working, then change to 1000 tests to complete the worksheet.

# Enter code below
test.results.50.power <- do(1000) * 
  t.test(bone.mineral.density ~ sex,
         data = BMDpopulation %>% group_by(sex) %>% slice_sample(n = 17),
         var.equal = TRUE)

test.results.50.power <- as_tibble(test.results.50.power)
print(test.results.50.power)
## # A tibble: 1,000 × 11
##         t    df p.value conf.level  lower    upper method alternative data 
##     <dbl> <dbl>   <dbl>      <dbl>  <dbl>    <dbl> <chr>  <chr>       <chr>
##  1 -3.79     32 6.37e-4       0.95 -0.305 -0.0917  " Two… two.sided   bone…
##  2 -0.294    32 7.71e-1       0.95 -0.120  0.0899  " Two… two.sided   bone…
##  3 -1.68     32 1.02e-1       0.95 -0.239  0.0227  " Two… two.sided   bone…
##  4 -2.48     32 1.87e-2       0.95 -0.258 -0.0251  " Two… two.sided   bone…
##  5 -1.29     32 2.06e-1       0.95 -0.199  0.0448  " Two… two.sided   bone…
##  6 -2.14     32 3.98e-2       0.95 -0.263 -0.00665 " Two… two.sided   bone…
##  7 -3.07     32 4.35e-3       0.95 -0.294 -0.0595  " Two… two.sided   bone…
##  8 -3.75     32 6.94e-4       0.95 -0.339 -0.101   " Two… two.sided   bone…
##  9 -3.58     32 1.12e-3       0.95 -0.422 -0.116   " Two… two.sided   bone…
## 10 -2.89     32 6.92e-3       0.95 -0.262 -0.0453  " Two… two.sided   bone…
## # ℹ 990 more rows
## # ℹ 2 more variables: .row <int>, .index <dbl>

Note that the test.results.50.power dataset contains many columns. The column you will be interested in is p.value, which gives the p-value from each test.

The power of the test at the 5% significance level is the proportion of tests that have p.value < 0.05, which can be computed using the mean() function.

For example,

# Not evaluated
mean( ~ p.value < 0.05, data = test.results.70.power)

computes the power of the height versus race tests in the previous example.

In the code chunk below, compute the proportion of your tests with 50% power (i.e., test.results.50.power) that would reject the null hypothesis.

# Enter code below
mean( ~ p.value < 0.05, data = test.results.50.power)
## [1] 0.498

In the code chunk below, repeat the above process to perform 1000 equal-variance t-tests of bone.mineral.density versus sex with the sample size you computed for 95% power (i.e., n = 55). Here, store the test results in an object called test.results.95.power, convert the object/dataset (test.results.95.power) to a tibble and print it. Then, compute the proportion of your tests with 95% power that would reject the null hypothesis.

# Enter code below
test.results.95.power <- do(1000) * 
  t.test(bone.mineral.density ~ sex,
         data = BMDpopulation %>% group_by(sex) %>% slice_sample(n = 55),
         var.equal = TRUE)

test.results.95.power <- as_tibble(test.results.95.power)
print(test.results.95.power)
## # A tibble: 1,000 × 11
##        t    df   p.value conf.level  lower   upper method alternative data 
##    <dbl> <dbl>     <dbl>      <dbl>  <dbl>   <dbl> <chr>  <chr>       <chr>
##  1 -2.93   108 0.00411         0.95 -0.149 -0.0289 " Two… two.sided   bone…
##  2 -3.80   108 0.000242        0.95 -0.196 -0.0616 " Two… two.sided   bone…
##  3 -2.30   108 0.0232          0.95 -0.135 -0.0101 " Two… two.sided   bone…
##  4 -3.14   108 0.00217         0.95 -0.162 -0.0367 " Two… two.sided   bone…
##  5 -1.64   108 0.103           0.95 -0.109  0.0101 " Two… two.sided   bone…
##  6 -3.36   108 0.00107         0.95 -0.162 -0.0418 " Two… two.sided   bone…
##  7 -2.36   108 0.0199          0.95 -0.169 -0.0148 " Two… two.sided   bone…
##  8 -4.36   108 0.0000294       0.95 -0.199 -0.0748 " Two… two.sided   bone…
##  9 -2.47   108 0.0152          0.95 -0.149 -0.0163 " Two… two.sided   bone…
## 10 -4.02   108 0.000107        0.95 -0.200 -0.0679 " Two… two.sided   bone…
## # ℹ 990 more rows
## # ℹ 2 more variables: .row <int>, .index <dbl>
mean(~ p.value < 0.05, data = test.results.95.power)
## [1] 0.946

STOP! Answer Questions 8–9 now.

Please turn in your completed worksheet (DOCX, i.e., word document), and your RMD file and updated HTML file to Carmen by the due date. Here, ensure to upload all the three (3) files before you click on the “Submit Assignment” tab to complete your submission.