Power_Analysis_micro_goal

Author

Fred

Power Analysis and Assumptions:

This is a 3*2*2*2*2 between group factorial experiment, some references I have so far been reading:

References:

I highly recommend for people who haven’t done a lot of power analysis to go through some of the references I have below, they were really helpful for me to understand the whole picture and come up with the idea of doing it manually instead of a black-box on G*Power

How to unify a general sample size instead of per-cell sample size:
1. http://jakewestfall.org/blog/index.php/2015/05/26/think-about-total-n-not-n-per-cell/
Sample size and p-hacking issues, what might arise and how to solve them:
1. http://datacolada.org/17
How to keep account of interaction effect, in a factorial design:
1. https://approachingblog.wordpress.com/2018/01/24/powering-your-interaction-2/
Test case on a 2*2 factorial experiment power analysis run by R: A lot of coding was adapted from here and changed based on our choice
1. https://www.markhw.com/blog/power-twoway
understanding factorial design and statistical analysis:
1. https://www.youtube.com/watch?v=uaWEQj18zqI
How to choose an ambiguous cohen-d value:
1. https://www.spss-tutorials.com/cohens-d/
2. https://www.youtube.com/watch?v=GDe4M0xEghs
There’s many more stackoverflow pages for me to look into to solve bugs but I don’t keep account of those anymore
Simulating a factorial design experiment:
1. https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-019-0883-9
2. https://debruine.github.io/faux/
3. https://stats.stackexchange.com/questions/57642/simulating-responses-from-a-factorial-experiment-for-power-analysis
4. https://rdrr.io/cran/faux/man/sim_design.html
Thanks to Norm’s coding I realized where this code could be further refurbished:
1. that is to look at directly the change of scores,
2. as well as to model the with a personalized intercept
3. relax the assumption on score change further more
4. adjust the alpha and power constraints… I am still not very sure if this is the way to do, as Norm suggested an adjustment on FWER (family wise error rate), but in my opinion we are only looking at whether or not one factor is significant? or do we want to look at all of them?
5. taken into account that change is suppose to be a discrete number :D Thank you Norm for pointing it out :D

let’s give this a closer look:

Coding Step by Step:

Step 1: Simulation Function:

Let’s figure out in this step, how to simulate such data using three settings:

low effect size, with a Cohen’s D = 0.3,
1. let me give a concrete example here, I will model this as of percentage change in the total score possible: that is
  1. suppose we have a potential of 25 points as in pre and post survey question (1 to 5 for all questions), that is
  2. the absolute change is within range 0 to 20 (-20 to 20)
  3. then I will apply a 0.2 to this change that is 20*0.2 = 4 as the mean score for the main change
  4. to cover the whole duration of -20 to 20 points, I will set a standard deviation of 5% of total score (5% * 20 = 1) as in the normal scale.
  5. people can revisit this assumption part further more
  6. I would like to take something as Norm suggested as a main assumption for the score calculation, but I think that is making things more complicated, but let me put the main assumption function here that Norm suggested, if I get further understanding of how you will assume each factor will perform, I can also incorporate this into the model for a further clarification
    1. main assumption Norm suggested: mainassumptions = “Commitment0 + rfx1 + 1*(Activity == 3) + 1*(Barrier == 2) + 1*(Examples == 2) + 1*(BiggerGoal == 2) + 1*(Expectations == 2)”
medium effect size, with a Cohen’s D = 0.5
Large effect size, with a Cohen’s D = 0.7

Let me reflect this newly added simulation scale below:

library(lme4)
library(lmerTest)
## ReadME
# this is the function for 
sim_data <- function(n, eff_size, alpha) {
  s <- eff_size # how much each factor is adding to the current effect of thing 
  
  f1 <- factor(sample(c("CA", "RA", "MA"), n, TRUE)) # Activity
  f2 <- factor(sample(c("OER", "DDR"), n, TRUE)) # Barrier
  f3 <- factor(sample(c("Text", "Voice"), n, TRUE)) # Examples
  f4 <- factor(sample(c("Open", "Guided"), n, TRUE)) # BiggerGoal
  f5 <- factor(sample(c("NEX", "EX"), n, TRUE)) # Expectations
  
  # prior generations at the level of eff_size, or in other words, standardized cohen's D
  # distance,
  mu_1 <-
    ifelse(f1 == 'MA', s, 0) # if f1 is MA then have effect of s, else 0
  mu_2 <- ifelse(f2 == 'OER', 0, s) # similar below
  mu_3 <- ifelse(f3 == 'Text', 0, s)
  mu_4 <- ifelse(f4 == 'Open', 0, s)
  mu_5 <- ifelse(f5 == 'NEX', 0, s)
  
  # simulating commitment scores pre/post study
  
  commit_0 <-
    pmax(1, pmin(5, round(rnorm(n, 3, 1 ^ 2)))) # simulating initial distribution of ratings, with mean 3 and standard deviations 1, if the values exceeds 5 then assign it to be 5, if the values is less than 1 assign it to be 1.
  # above distribution takes into consideration that the major population centers around 3 rating, can change if needed.
  random_effect <-
    rnorm(n, 0, 0.5 ^ 2) # random effect for each participant
  commit_1 <-
    commit_0 + mu_1 + mu_2 + mu_3 + mu_4 + mu_5 + random_effect # vector-wise addition
  commit_1 <- pmax(1, pmin(5, round(commit_1)))
  
  
  baseline_data <- data.frame(
    id = factor(1:n),
    Time = "Baseline",
    f1,
    f2,
    f3,
    f4,
    f5,
    Commitment = commit_0
  )
  
  postintervention_data <- data.frame(
    id = factor(1:n),
    Time = "PostIntervention",
    f1,
    f2,
    f3,
    f4,
    f5,
    Commitment = commit_1
  )
  
  # Combine them into one data frame
  dat <- rbind(baseline_data, postintervention_data)
  
  
  model <-
    lmer(Commitment ~ Time * (f1 + f2 + f3 + f4 + f5) + (1 |id), data = dat) # fitting a longitudinal study
  
  anova_table <- anova(model)
  
  result <- anova_table[grep("Time:f", rownames(anova_table)), ]
  
  
  all_p_less_than_0_01 <- all(result[, "Pr(>F)"] < alpha/5) # adjust for 5 hypothesis testing we have, activity, barrier, examples, biggergoal, and expectations
  
  return(all_p_less_than_0_01)
}

P.S. I have tested multiple cases (around ~15), and I think the model converges under all the test cases I run, but I am not CS majored and I hate writing test cases, please let me know if there are things that it doesn’t work out :D

Step 2: Get-Power Function:

Now we write the function to calculate the power for such simulated model:

# it takes the mean value for the number of times the model is signifcant in the safest way when a rep number of data we simulated
get_power <- function(n, eff_size, reps, alpha) {
  mean(
    sapply(1:reps, function(placeholder) {
      sim_data(n, eff_size, alpha)
    })
  )
}

Step 3: Step-wise Sample Size Power analysis function:

Now we want to extend this to a range of sample size settings, for example we start from sample size 100, end at 400, and each time update by 25, something like this, so that it is easier for people to run the code…instead of hitting their head and solving everything :D

power_analysis <- function(eff_size, reps, start, end, by, alpha) {
  set.seed(20552034) # oh I have good ways finding seed, see if you can crack the mistery, again a new interesting one :D
  out <- lapply(
    seq(start, end, by), 
    get_power, 
    eff_size, reps, alpha
  )
  out <- as.data.frame(do.call(rbind, out))
  names(out) <- "Interaction Term Power"
  out$`Sample Size` <- seq(start, end, by)
  return(
    out[, c(2, 1)]
  )
}

Result and Conclusion

I will formulate this section into 3 sections, as described before, I used a rep number of 200, and I think my computer is already dying running the simulation…for whoever have a better computer, try increase the rep number larger, although 200 rep is quite good enough

remember under a Bonferroni correction, we are looking for a power of:

power = 0.8

Results: Small Effect Size: 0.3

power_analysis(eff_size = 0.3, 
               reps = 200,
               start = 200,
               end = 300,
               by = 25,
               alpha = 0.05)

  Sample Size Interaction Term Power
1         200                  0.815
2         225                  0.900
3         250                  0.905
4         275                  0.955
5         300                  0.990

So for a small effect size of 0.3, we will need around 200-225 participants

Results: Medium Effect Size: 0.5

power_analysis(eff_size = 0.5, 
               reps = 200,
               start = 100,
               end = 200,
               by = 25,
               alpha = 0.05)

  Sample Size Interaction Term Power
1         100                  0.475
2         125                  0.650
3         150                  0.850
4         175                  0.890
5         200                  0.940

so for a medium effect size 0.5, we will need around 150-175 participants

Results: Large Effect Size: 0.7

power_analysis(eff_size = 0.7, 
               reps = 200,
               start = 100,
               end = 200,
               by = 25,
               alpha = 0.05)

  Sample Size Interaction Term Power
1         100                  0.255
2         125                  0.415
3         150                  0.575
4         175                  0.745
5         200                  0.825

So for a large effect size 0.7, we will need around 200 participants (similarly)

Summary

Now to summary all of these, let’s create a chart

Below are results for the model lmer ~Time*( factor1 + factor2 + factor3 + factor4 + factor5):

Effect Size (or standardized Cohen’s Distance)	Optimal Sample Size	Power expected
Small Effect size = 0.3	200-225	81.5 - 90.0%
Medium Effect size = 0.5	150-175	85.0% - 89.0%
Large Effect Size = 0.7	200	82.5%