All analyses were conducted in R (v4.2.2; R Core Team, 2022) using the “psych” (vX.X; Author, year) and “effectsize” (version, author, year) packages.
Today we’re going to talk about effect sizes and power. Let’s first set up our environment. Don’t forget to set your working directory. You will need to install two new packages today: pwr and effectsize.
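If you haven’t installed them yet, a one-time install (run once; after that, library() is all you need) looks like this:
install.packages(c("pwr", "effectsize")) # one-time install from CRAN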
library(pwr)
library(effectsize)
Let’s read in our data. We will use the same data set we used last week.
MyData <- read.csv("Week 5 Favorability Data.csv")
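It never hurts to confirm the file loaded as expected before moving on. For example:
head(MyData) # preview the first few rows
str(MyData)  # check variable names and types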
You may recall that we’ve talked about t-tests and how the t-value by itself is hard to interpret as an ‘effect size’, because its magnitude depends on the sample size. To make the result more meaningful, we can convert that t-score into an effect size, Cohen’s d. This allows us to use ‘rules of thumb’ to get a sense of the magnitude of the effect.
We will use our t-tests from last class as examples. Last week, we simply used the function t.test() and got our results. That is totally okay, but remember that it’s always best to save the results as an object in case we want to use them again later and don’t want to retype the whole function. Let’s save the first one as ‘model1’. You’ll see why soon.
Let’s start with the one-sample t-test.
model1 <- t.test(MyData$demFav,mu=2.5)
print(model1)
##
## One Sample t-test
##
## data: MyData$demFav
## t = -10.473, df = 2522, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 2.5
## 95 percent confidence interval:
## 2.301568 2.364155
## sample estimates:
## mean of x
## 2.332861
The report package will describe the whole test in words, including an effect size. Let’s load it and run it on model1.
library(report)
report(model1)
## Warning: Missing values detected. NAs dropped.
## Effect sizes were labelled following Cohen's (1988) recommendations.
##
## The One Sample t-test testing the difference between MyData$demFav (mean =
## 2.33) and mu = 2.5 suggests that the effect is negative, statistically
## significant, and small (difference = -0.17, 95% CI [2.30, 2.36], t(2522) =
## -10.47, p < .001; Cohen's d = -0.21, 95% CI [-0.25, -0.17])
In a write-up, the key piece to pull out is: t(2522) = -10.47, p < .001; Cohen’s d = -0.21, 95% CI [-0.25, -0.17].
We have our t-score, but that doesn’t tell us anything about the magnitude of the effect we’re trying to observe. That’s where Cohen’s d comes in handy.
To extract Cohen’s d from our t-test, we will use the cohens_d() function from the effectsize package.
cohens_d(model1)
## Warning: Missing values detected. NAs dropped.
## Cohen's d | 95% CI
## --------------------------
## -0.21 | [-0.25, -0.17]
##
## - Deviation from a difference of 2.5.
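As a quick sanity check, for a one-sample t-test Cohen’s d is just the t-value divided by the square root of the sample size, so we can reproduce that value by hand (n = df + 1 = 2523):
-10.473 / sqrt(2523) # -0.2085, matching the -0.21 from cohens_d()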
Now let’s look at our independent-samples t-test. Remember that we first have to isolate the observations we want. In this case, we only want to look at Democrats and Republicans and see how they vary in how they feel about the Republican party. And remember, we have to set the var.equal argument to FALSE, because when we tested the variances last week, we found that they are significantly different from one another. We also have to get the SD for each group ourselves, since the t-test output only gives the means.
df1 <- subset(MyData, PARTY == "Republican" |
                PARTY == "Democrat") # Subsetting the two groups of interest
model2 <- t.test(repFav ~ PARTY, data = df1, var.equal = FALSE) # Running Welch t-test
print(model2)
##
## Welch Two Sample t-test
##
## data: repFav by PARTY
## t = -41.05, df = 1206.8, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Democrat and group Republican is not equal to 0
## 95 percent confidence interval:
## -1.302398 -1.183584
## sample estimates:
## mean in group Democrat mean in group Republican
## 1.495334 2.738325
aggregate(repFav~PARTY, df1, sd) # Getting SD for groups
## PARTY repFav
## 1 Democrat 0.5296485
## 2 Republican 0.6117943
report(model2)
## Warning: Unable to retrieve data from htest object. Returning an approximate
## effect size using t_to_d().
## Effect sizes were labelled following Cohen's (1988) recommendations.
##
## The Welch Two Sample t-test testing the difference of repFav by PARTY (mean in
## group Democrat = 1.50, mean in group Republican = 2.74) suggests that the
## effect is negative, statistically significant, and large (difference = -1.24,
## 95% CI [-1.30, -1.18], t(1206.84) = -41.05, p < .001; Cohen's d = -2.36, 95% CI
## [-2.51, -2.22])
And now we extract Cohen’s d:
cohens_d(model2)
## Warning: Unable to retrieve data from htest object. Returning an approximate
## effect size using t_to_d().
## d | 95% CI
## ----------------------
## -2.36 | [-2.51, -2.22]
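The warning tells us that effectsize fell back on the t_to_d() approximation, which for a two-sample test is d = 2t / sqrt(df). We can reproduce that value by hand:
2 * -41.05 / sqrt(1206.8) # -2.363, matching the -2.36 above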
Now we’ll look back at our paired-samples t-test. In this case, the var.equal argument is set to TRUE because our variance test last week showed that the two variances were not significantly different from each other. (Strictly speaking, R ignores var.equal when paired = TRUE, since a paired test is run on the difference scores.) We also have to remember to set the paired argument to TRUE; otherwise, R will automatically run it as an independent-samples t-test. And don’t forget to extract the Cohen’s d.
model3 <- t.test(MyData$demFav, MyData$repFav, var.equal = TRUE, paired = TRUE)
print(model3)
##
## Paired t-test
##
## data: MyData$demFav and MyData$repFav
## t = 11.935, df = 2510, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.2835196 0.3950007
## sample estimates:
## mean difference
## 0.3392602
cohens_d(model3)
## Warning: Missing values detected. NAs dropped.
## Cohen's d | 95% CI
## ------------------------
## 0.24 | [0.20, 0.28]
report(model3)
## Warning: Missing values detected. NAs dropped.
## Effect sizes were labelled following Cohen's (1988) recommendations.
##
## The Paired t-test testing the difference between MyData$demFav and
## MyData$repFav (mean difference = 0.34) suggests that the effect is positive,
## statistically significant, and small (difference = 0.34, 95% CI [0.28, 0.40],
## t(2510) = 11.93, p < .001; Cohen's d = 0.24, 95% CI [0.20, 0.28])
When reporting independent- and paired-samples t-tests, you still need to describe each variable included in the test with its mean and SD. For example: A paired-samples t-test showed a significant difference between Democratic politicians’ favorability (M = 2.33, SD = 0.80) and Republican politicians’ favorability (M = 1.99, SD = 0.80), t(2510) = 11.93, p < .001; Cohen’s d = 0.24, 95% CI [0.20, 0.28].
Now, to get the mean and SD for each variable, you can just use the describe() function from the psych package.
library(psych)
##
## Attaching package: 'psych'
## The following object is masked from 'package:effectsize':
##
## phi
describe(MyData$demFav)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 2523 2.33 0.8 2.44 2.35 0.93 1 4 3 -0.21 -1.09 0.02
describe(MyData$repFav)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 2514 1.99 0.8 2 1.94 0.99 1 4 3 0.42 -0.75 0.02
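For the independent-samples example, where we need descriptives per group rather than per variable, describeBy() from the same psych package will split by PARTY for us:
describeBy(df1$repFav, group = df1$PARTY) # means and SDs for Democrats and Republicans separately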
Annie shared the link to G*Power. However, you can also estimate power in R. To do this, we will use the pwr package.
An a priori power analysis tells us how many people we need. As a rule of thumb, we want power of at least .80, and we always use a two-sided test.
pwr.t.test(d = .3, sig.level = .05, power = .8, type = "two.sample",
alternative = "two.sided")
##
## Two-sample t test power calculation
##
## n = 175.3847
## d = 0.3
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
Rounding up, we would need 176 people in each group, for a total of 352 participants.
Leaving the code as is lets us know that if we want to detect an effect size (that is, a Cohen’s d) of at least .3 (a small-to-medium effect by Cohen’s guidelines) with adequate power in a two-sample t-test (two-tailed), we would need about 176 participants per group, or N = 352 in total. I’m only giving you the argument for two samples because, as Annie said, we wouldn’t normally want to do a one-sample test since we rarely know the population mean.
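If you want to double-check that number, you can flip the calculation around: give pwr.t.test() the rounded n and leave power out, and it will solve for power instead.
pwr.t.test(n = 176, d = .3, sig.level = .05, type = "two.sample",
           alternative = "two.sided") # solves for power, which comes out just above .80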
You can also calculate power AFTER data collection. In this case, you already have your sample sizes in each group. It’s possible that you’ll have unequal sample sizes, but you want to know, given those sample sizes, how much power you had to detect a given effect size.
pwr.t2n.test(n1 = 80, n2 = 64, d = .3, sig.level = .05)
##
## t test power calculation
##
## n1 = 80
## n2 = 64
## d = 0.3
## sig.level = 0.05
## power = 0.4274097
## alternative = two.sided
You can see that that’s very low power; ideally, you’ll have a larger sample size. Also, FYI, you won’t need this particular function for Problem Set 3. It’s for situations where you want to know your power after you’ve collected data. You can run the same kind of a priori power analysis for a correlation using pwr.r.test():
pwr.r.test(r = .3, sig.level = .05, power = .8)
##
## approximate correlation power calculation (arctangh transformation)
##
## n = 84.07364
## r = 0.3
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
So, to detect a correlation of at least .3 with .8 power, we need at least 85 participants (rounding 84.07 up).
A power analysis revealed that our target sample size needs to be 85 participants to detect a medium-size effect (r = .3) with .8 power.
If you want more ways to estimate power in R, you can look through the documentation of the pwr package. Just type ?pwr in the console to access the documentation.