library(lme4)
library(lmerTest)
## ReadME
# this is the function for
sim_data <- function(n, eff_size, alpha) {
s <- eff_size # how much each factor is adding to the current effect of thing
f1 <- factor(sample(c("CA", "RA", "MA"), n, TRUE)) # Activity
f2 <- factor(sample(c("OER", "DDR"), n, TRUE)) # Barrier
f3 <- factor(sample(c("Text", "Voice"), n, TRUE)) # Examples
f4 <- factor(sample(c("Open", "Guided"), n, TRUE)) # BiggerGoal
f5 <- factor(sample(c("NEX", "EX"), n, TRUE)) # Expectations
# prior generations at the level of eff_size, or in other words, standardized cohen's D
# distance,
mu_1 <-
ifelse(f1 == 'MA', s, 0) # if f1 is MA then have effect of s, else 0
mu_2 <- ifelse(f2 == 'OER', 0, s) # similar below
mu_3 <- ifelse(f3 == 'Text', 0, s)
mu_4 <- ifelse(f4 == 'Open', 0, s)
mu_5 <- ifelse(f5 == 'NEX', 0, s)
# simulating commitment scores pre/post study
commit_0 <-
pmax(1, pmin(5, round(rnorm(n, 3, 1 ^ 2)))) # simulating initial distribution of ratings, with mean 3 and standard deviations 1, if the values exceeds 5 then assign it to be 5, if the values is less than 1 assign it to be 1.
# above distribution takes into consideration that the major population centers around 3 rating, can change if needed.
random_effect <-
rnorm(n, 0, 0.5 ^ 2) # random effect for each participant
commit_1 <-
commit_0 + mu_1 + mu_2 + mu_3 + mu_4 + mu_5 + random_effect # vector-wise addition
commit_1 <- pmax(1, pmin(5, round(commit_1)))
baseline_data <- data.frame(
id = factor(1:n),
Time = "Baseline",
f1,
f2,
f3,
f4,
f5,
Commitment = commit_0
)
postintervention_data <- data.frame(
id = factor(1:n),
Time = "PostIntervention",
f1,
f2,
f3,
f4,
f5,
Commitment = commit_1
)
# Combine them into one data frame
dat <- rbind(baseline_data, postintervention_data)
model <-
lmer(Commitment ~ Time * (f1 + f2 + f3 + f4 + f5) + (1 |id), data = dat) # fitting a longitudinal study
anova_table <- anova(model)
result <- anova_table[grep("Time:f", rownames(anova_table)), ]
all_p_less_than_0_01 <- all(result[, "Pr(>F)"] < alpha/5) # adjust for 5 hypothesis testing we have, activity, barrier, examples, biggergoal, and expectations
return(all_p_less_than_0_01)
}Power_Analysis_micro_goal
Power Analysis and Assumptions:
This is a 3*2*2*2*2 between group factorial experiment, some references I have so far been reading:
References:
I highly recommend for people who haven’t done a lot of power analysis to go through some of the references I have below, they were really helpful for me to understand the whole picture and come up with the idea of doing it manually instead of a black-box on G*Power
- How to unify a general sample size instead of per-cell sample size:
- http://jakewestfall.org/blog/index.php/2015/05/26/think-about-total-n-not-n-per-cell/
- Sample size and p-hacking issues, what might arise and how to solve them:
- http://datacolada.org/17
- How to keep account of interaction effect, in a factorial design:
- https://approachingblog.wordpress.com/2018/01/24/powering-your-interaction-2/
- Test case on a 2*2 factorial experiment power analysis run by R: A lot of coding was adapted from here and changed based on our choice
- https://www.markhw.com/blog/power-twoway
- understanding factorial design and statistical analysis:
- https://www.youtube.com/watch?v=uaWEQj18zqI
- How to choose an ambiguous cohen-d value:
https://www.spss-tutorials.com/cohens-d/
https://www.youtube.com/watch?v=GDe4M0xEghs
- There’s many more stackoverflow pages for me to look into to solve bugs but I don’t keep account of those anymore
- Simulating a factorial design experiment:
https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-019-0883-9
https://debruine.github.io/faux/
https://stats.stackexchange.com/questions/57642/simulating-responses-from-a-factorial-experiment-for-power-analysis
https://rdrr.io/cran/faux/man/sim_design.html
- Thanks to Norm’s coding I realized where this code could be further refurbished:
- that is to look at directly the change of scores,
- as well as to model the with a personalized intercept
- relax the assumption on score change further more
- adjust the alpha and power constraints… I am still not very sure if this is the way to do, as Norm suggested an adjustment on FWER (family wise error rate), but in my opinion we are only looking at whether or not one factor is significant? or do we want to look at all of them?
- taken into account that change is suppose to be a discrete number :D Thank you Norm for pointing it out :D
let’s give this a closer look:
Coding Step by Step:
Step 1: Simulation Function:
Let’s figure out in this step, how to simulate such data using three settings:
- low effect size, with a Cohen’s D = 0.3,
- let me give a concrete example here, I will model this as of percentage change in the total score possible: that is
- suppose we have a potential of 25 points as in pre and post survey question (1 to 5 for all questions), that is
- the absolute change is within range 0 to 20 (-20 to 20)
- then I will apply a 0.2 to this change that is 20*0.2 = 4 as the mean score for the main change
- to cover the whole duration of -20 to 20 points, I will set a standard deviation of 5% of total score (5% * 20 = 1) as in the normal scale.
- people can revisit this assumption part further more
- I would like to take something as Norm suggested as a main assumption for the score calculation, but I think that is making things more complicated, but let me put the main assumption function here that Norm suggested, if I get further understanding of how you will assume each factor will perform, I can also incorporate this into the model for a further clarification
- main assumption Norm suggested: mainassumptions = “Commitment0 + rfx1 + 1*(Activity == 3) + 1*(Barrier == 2) + 1*(Examples == 2) + 1*(BiggerGoal == 2) + 1*(Expectations == 2)”
- let me give a concrete example here, I will model this as of percentage change in the total score possible: that is
- medium effect size, with a Cohen’s D = 0.5
- Large effect size, with a Cohen’s D = 0.7
Let me reflect this newly added simulation scale below:
P.S. I have tested multiple cases (around ~15), and I think the model converges under all the test cases I run, but I am not CS majored and I hate writing test cases, please let me know if there are things that it doesn’t work out :D
Step 2: Get-Power Function:
Now we write the function to calculate the power for such simulated model:
# it takes the mean value for the number of times the model is signifcant in the safest way when a rep number of data we simulated
get_power <- function(n, eff_size, reps, alpha) {
mean(
sapply(1:reps, function(placeholder) {
sim_data(n, eff_size, alpha)
})
)
}Step 3: Step-wise Sample Size Power analysis function:
Now we want to extend this to a range of sample size settings, for example we start from sample size 100, end at 400, and each time update by 25, something like this, so that it is easier for people to run the code…instead of hitting their head and solving everything :D
power_analysis <- function(eff_size, reps, start, end, by, alpha) {
set.seed(20552034) # oh I have good ways finding seed, see if you can crack the mistery, again a new interesting one :D
out <- lapply(
seq(start, end, by),
get_power,
eff_size, reps, alpha
)
out <- as.data.frame(do.call(rbind, out))
names(out) <- "Interaction Term Power"
out$`Sample Size` <- seq(start, end, by)
return(
out[, c(2, 1)]
)
}Result and Conclusion
I will formulate this section into 3 sections, as described before, I used a rep number of 200, and I think my computer is already dying running the simulation…for whoever have a better computer, try increase the rep number larger, although 200 rep is quite good enough
remember under a Bonferroni correction, we are looking for a power of:
power = 0.8
Results: Small Effect Size: 0.3
power_analysis(eff_size = 0.3,
reps = 200,
start = 200,
end = 300,
by = 25,
alpha = 0.05) Sample Size Interaction Term Power
1 200 0.815
2 225 0.900
3 250 0.905
4 275 0.955
5 300 0.990
So for a small effect size of 0.3, we will need around 200-225 participants
Results: Medium Effect Size: 0.5
power_analysis(eff_size = 0.5,
reps = 200,
start = 100,
end = 200,
by = 25,
alpha = 0.05) Sample Size Interaction Term Power
1 100 0.475
2 125 0.650
3 150 0.850
4 175 0.890
5 200 0.940
so for a medium effect size 0.5, we will need around 150-175 participants
Results: Large Effect Size: 0.7
power_analysis(eff_size = 0.7,
reps = 200,
start = 100,
end = 200,
by = 25,
alpha = 0.05) Sample Size Interaction Term Power
1 100 0.255
2 125 0.415
3 150 0.575
4 175 0.745
5 200 0.825
So for a large effect size 0.7, we will need around 200 participants (similarly)
Summary
Now to summary all of these, let’s create a chart
Below are results for the model lmer ~Time*( factor1 + factor2 + factor3 + factor4 + factor5):
| Effect Size (or standardized Cohen’s Distance) | Optimal Sample Size | Power expected |
|---|---|---|
| Small Effect size = 0.3 | 200-225 | 81.5 - 90.0% |
| Medium Effect size = 0.5 | 150-175 | 85.0% - 89.0% |
| Large Effect Size = 0.7 | 200 | 82.5% |