Motivated

In the data meeting, I was asked to conduct the analysis for CHECK data to assess the multiple scales, such as Anxiety, Post-Partum Depression, PSC-17 (Pediatric Symptom Checklist), CHAOS, Support, … and the number of health check visits. Its purpose served the next grant proposal in the power analysis.

Of course, in the randomized, there were 2 groups: standard and enrolled (having the interventions). So, one of the PIs asked “Can we do the power analyses for interaction instead of simply comparing 2 groups effect?”

library(tidyverse) #%>%
library(psych) # dummy.code
library(pwr) # pwr
library(sjPlot) # plot_model
library('snow') # makeCluster
library('paramtest') # simulate power
library(ggpubr) # theme_pubr
library('MASS') # data & mvrnorm

##winsorizing function
winsor2<-function (x, multiple=3)
{
  if(length(multiple) != 1 || multiple <= 0) {
    stop("bad value for 'multiple'")
  }
  mn <- mean(x,na.rm = T)
  uppersd<- mn + (sd(x,na.rm = T)*multiple )
  lowersd<- mn - (sd(x,na.rm = T)*multiple )
  
  x[ x > uppersd ] <- uppersd
  x[ x < lowersd ] <- lowersd
  return(x)
}

## trimming the outlier
quartile_outliers<-function(x,level=2)
{
  if(level < 1 || level > 2) {
    stop("bad value for 'level'")
  }
  
  lowerq = quantile(x)[2]
  upperq = quantile(x)[4]
  iqr = upperq - lowerq #Or use IQR(data)
  extreme.threshold.upper = (iqr * 3) + upperq
  extreme.threshold.lower = lowerq - (iqr * 3)
  mild.threshold.upper = (iqr * 1.5) + upperq
  mild.threshold.lower = lowerq - (iqr * 1.5)
  
  if(level == 2){J<-unname(c(extreme.threshold.lower,extreme.threshold.upper))}
  if(level == 1){J<-unname(c(mild.threshold.lower,mild.threshold.upper))}
  return(J)
}

Applying on real data low birth weight

Using publicly available data Low Infant Birth Weight from MASS package, I tested an interaction.

data("birthwt", package = "MASS")
names(birthwt)
##  [1] "low"   "age"   "lwt"   "race"  "smoke" "ptl"   "ht"    "ui"    "ftv"  
## [10] "bwt"

Low Infant Birth Weight

Description: The birthwt data frame has 189 rows and 10 columns. The data were collected at Baystate Medical Center, Springfield, Mass during 1986.

Format:

low: indicator of birth weight less than 2,500 gram.

age: mother’s age in years.

lwt: mother’s weight in pounds at last menstrual period.

race: mother’s race (1 = white, 2 = black, 3 = other).

smoke: smoking status during pregnancy.

ptl: number of previous premature labours.

ht: history of hypertension.

ui: presence of uterine irritability.

ftv: number of physician visits during the first trimester.

bwt: birth weight in grams.

Cleaning data for the analysis

## rename and set up the nature of variables (birthwt1)
birthwt1 <- birthwt
colnames(birthwt1) <- c("birthwt.below.2500", "mother.age", "mother.weight", "race", "mother.smokes", 
  "previous.prem.labor", "hypertension", "uterine.irr", "physician.visits", "birthwt.grams")
birthwt1$race <- factor(c("white", "black", "other")[birthwt1$race])
birthwt1$race <- relevel(birthwt1$race, ref = "white")
birthwt1$mother.smokes <- factor(c("No", "Yes")[birthwt1$mother.smokes + 1])
birthwt1$uterine.irr <- factor(c("No", "Yes")[birthwt1$uterine.irr + 1])
birthwt1$hypertension <- factor(c("No", "Yes")[birthwt1$hypertension + 1])

## rename and standardizing all variables in dataset
birthwt2 <- birthwt
colnames(birthwt2) <- c("birthwt.below.2500", "mother.age", "mother.weight", "race", "mother.smokes", 
  "previous.prem.labor", "hypertension", "uterine.irr", "physician.visits", "birthwt.grams")
mother.weight_cut <- quartile_outliers(birthwt2$mother.weight, level = 2)[2]
mother.age_cut <- quartile_outliers(birthwt2$mother.age, level = 2)[2]

birthwt2 <- filter(birthwt2, mother.weight < mother.weight_cut, mother.age < mother.age_cut)

### dummy-code categorical variables (birthwt3)
l<-dummy.code(birthwt2$race)
colnames(l)<-c("white","black","other") 
birthwt2<-cbind(birthwt2,l)

birthwt3 <- birthwt2[,-4] #### get rid of race

for(i in 1:dim(birthwt3)[2] ){ 
  birthwt3[,i]<-winsor2(birthwt3[,i])
  birthwt3[,i]<-scale(birthwt3[,i], center = T, scale = T)
}
summary(birthwt3)
##  birthwt.below.2500.V1    mother.age.V1       mother.weight.V1  
##  Min.   :-0.6797568    Min.   :-1.7659800   Min.   :-1.7860965  
##  1st Qu.:-0.6797568    1st Qu.:-0.8056379   1st Qu.:-0.6668374  
##  Median :-0.6797568    Median :-0.1333984   Median :-0.2564423  
##  Mean   : 0.0000000    Mean   : 0.0000000   Mean   : 0.0000000  
##  3rd Qu.: 1.4632052    3rd Qu.: 0.5388411   3rd Qu.: 0.4337675  
##  Max.   : 1.4632052    Max.   : 3.0710003   Max.   : 3.0542886  
##    mother.smokes.V1   previous.prem.labor.V1   hypertension.V1  
##  Min.   :-0.7925802   Min.   :-0.422810      Min.   :-0.237724  
##  1st Qu.:-0.7925802   1st Qu.:-0.422810      1st Qu.:-0.237724  
##  Median :-0.7925802   Median :-0.422810      Median :-0.237724  
##  Mean   : 0.0000000   Mean   : 0.000000      Mean   : 0.000000  
##  3rd Qu.: 1.2549186   3rd Qu.:-0.422810      3rd Qu.:-0.237724  
##  Max.   : 1.2549186   Max.   : 3.468537      Max.   : 4.183943  
##     uterine.irr.V1    physician.visits.V1    birthwt.grams.V1  
##  Min.   :-0.4198362   Min.   :-0.7874135   Min.   :-3.0026002  
##  1st Qu.:-0.4198362   1st Qu.:-0.7874135   1st Qu.:-0.7176613  
##  Median :-0.4198362   Median :-0.7874135   Median : 0.0382075  
##  Mean   : 0.0000000   Mean   : 0.0000000   Mean   : 0.0000000  
##  3rd Qu.:-0.4198362   3rd Qu.: 0.2366605   3rd Qu.: 0.7334287  
##  Max.   : 2.3690756   Max.   : 3.0569458   Max.   : 2.8170362  
##        white.V1             black.V1             other.V1      
##  Min.   :-1.0189914   Min.   :-0.7396236   Min.   :-0.3929945  
##  1st Qu.:-1.0189914   1st Qu.:-0.7396236   1st Qu.:-0.3929945  
##  Median : 0.9760865   Median :-0.7396236   Median :-0.3929945  
##  Mean   : 0.0000000   Mean   : 0.0000000   Mean   : 0.0000000  
##  3rd Qu.: 0.9760865   3rd Qu.: 1.3447701   3rd Qu.:-0.3929945  
##  Max.   : 0.9760865   Max.   : 1.3447701   Max.   : 2.5308845

Understanding Interactions (Between Categorical and Continuous Variables)

Let’s examine the impact of mother’s age (a continuous variable) and mother’s smoking status (a categorical, dummy coded variable) as our two predictors.

Here, smoking status is coded 1 for yes and 0 for none.

lm1 <- lm(birthwt.grams ~ mother.smokes, data = birthwt1)
summary(lm1)
## 
## Call:
## lm(formula = birthwt.grams ~ mother.smokes, data = birthwt1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2062.9  -475.9    34.3   545.1  1934.3 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3055.70      66.93  45.653  < 2e-16 ***
## mother.smokesYes  -283.78     106.97  -2.653  0.00867 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 717.8 on 187 degrees of freedom
## Multiple R-squared:  0.03627,    Adjusted R-squared:  0.03112 
## F-statistic: 7.038 on 1 and 187 DF,  p-value: 0.008667
lm1b <- lm(birthwt.grams ~ mother.smokes +  mother.age, data = birthwt1)
summary(lm1b)
## 
## Call:
## lm(formula = birthwt.grams ~ mother.smokes + mother.age, data = birthwt1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2119.98  -442.66    52.92   532.38  1690.74 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      2791.224    240.950  11.584   <2e-16 ***
## mother.smokesYes -278.356    106.987  -2.602    0.010 *  
## mother.age         11.290      9.881   1.143    0.255    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 717.2 on 186 degrees of freedom
## Multiple R-squared:  0.04299,    Adjusted R-squared:  0.0327 
## F-statistic: 4.177 on 2 and 186 DF,  p-value: 0.0168

We can see that, on average, smoking makes about 284 gram less in the child’s birth weight compared to non-smoking mother.

What we want to know is, does the smoking difference in birth weight differ based on the mother’s age? An interaction term will tell us.

lm2 <- lm(birthwt.grams ~ mother.smokes +  mother.age + mother.smokes:mother.age, data = birthwt1)
summary(lm2)
## 
## Call:
## lm(formula = birthwt.grams ~ mother.smokes + mother.age + mother.smokes:mother.age, 
##     data = birthwt1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2189.27  -458.46    51.46   527.26  1521.39 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  2406.06     292.19   8.235 3.18e-14 ***
## mother.smokesYes              798.17     484.34   1.648   0.1011    
## mother.age                     27.73      12.15   2.283   0.0236 *  
## mother.smokesYes:mother.age   -46.57      20.45  -2.278   0.0239 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 709.3 on 185 degrees of freedom
## Multiple R-squared:  0.06909,    Adjusted R-squared:  0.054 
## F-statistic: 4.577 on 3 and 185 DF,  p-value: 0.004068

How do we interpret these coefficients?

The interaction is statistically significant at a level of 0.02. This means that the slope (with age) among mothers that smoke is much smaller than the slope among mother’s that don’t. Indeed, among mothers that don’t smoke, for every additional year of age the average birth weight increases by 28 gram on average. Among mothers that do smoke, for every additional year of age the average birth weight decreases by 19 gram on average.

For every 1-year increase in age, a non-smoking mother should expect to have a child weighted an additional 28 gram, while a smoking mother should expect to have a child weight an diminish of quite amount of gram when the age are increasing, especially at the threshold age of 38 (i.e. 728 - 19*age).

In other words, the low birth weight for having a higher age differs for smoking and non-smoking mother.

Another way to say the same thing is that the difference in birth weight between smoking status widens as age of mother increases.

plot_model(lm2, type = "pred", terms = c("mother.age", "mother.smokes"))

How large of a sample is required about interaction analyses?

To run the analysis for the interaction analysis, I already filtered and trimmed the outlier. All the variables were standardized (mean is 0 and standard deviation is 1).

The sample size is N = 186, so DF = 162.

linear.model <- lm(birthwt.grams ~ mother.age + mother.weight + white + black + mother.smokes + previous.prem.labor + hypertension + uterine.irr + physician.visits + mother.age:mother.smokes + mother.weight*mother.smokes + (mother.age + mother.weight):(white + black + previous.prem.labor + hypertension + uterine.irr + physician.visits), data=birthwt3)
summary(linear.model)
## 
## Call:
## lm(formula = birthwt.grams ~ mother.age + mother.weight + white + 
##     black + mother.smokes + previous.prem.labor + hypertension + 
##     uterine.irr + physician.visits + mother.age:mother.smokes + 
##     mother.weight * mother.smokes + (mother.age + mother.weight):(white + 
##     black + previous.prem.labor + hypertension + uterine.irr + 
##     physician.visits), data = birthwt3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.36271 -0.53984  0.05405  0.59837  2.04707 
## 
## Coefficients:
##                                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                       -0.03265    0.07437  -0.439  0.66126    
## mother.age                        -0.07131    0.07282  -0.979  0.32890    
## mother.weight                      0.13145    0.07877   1.669  0.09711 .  
## white                              0.27081    0.12082   2.241  0.02635 *  
## black                              0.07077    0.12124   0.584  0.56019    
## mother.smokes                     -0.25977    0.07795  -3.333  0.00107 ** 
## previous.prem.labor               -0.01938    0.07688  -0.252  0.80128    
## hypertension                      -0.24371    0.07343  -3.319  0.00112 ** 
## uterine.irr                       -0.32681    0.07633  -4.282 3.17e-05 ***
## physician.visits                  -0.02843    0.06990  -0.407  0.68469    
## mother.age:mother.smokes          -0.16490    0.07818  -2.109  0.03645 *  
## mother.weight:mother.smokes        0.02652    0.07548   0.351  0.72576    
## mother.age:white                   0.03406    0.12078   0.282  0.77829    
## mother.age:black                  -0.05892    0.12305  -0.479  0.63272    
## mother.age:previous.prem.labor     0.08071    0.08534   0.946  0.34569    
## mother.age:hypertension           -0.15232    0.08940  -1.704  0.09034 .  
## mother.age:uterine.irr            -0.05627    0.07952  -0.708  0.48018    
## mother.age:physician.visits        0.07865    0.06487   1.212  0.22709    
## mother.weight:white               -0.11437    0.09524  -1.201  0.23158    
## mother.weight:black                0.02717    0.11009   0.247  0.80540    
## mother.weight:previous.prem.labor  0.08187    0.08110   1.009  0.31425    
## mother.weight:hypertension         0.03343    0.06468   0.517  0.60603    
## mother.weight:uterine.irr         -0.07409    0.07114  -1.041  0.29924    
## mother.weight:physician.visits     0.01557    0.07069   0.220  0.82592    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8793 on 162 degrees of freedom
## Multiple R-squared:  0.3229, Adjusted R-squared:  0.2268 
## F-statistic: 3.359 on 23 and 162 DF,  p-value: 3.397e-06

So we see a significant interaction of mother’s age and mother’s smoking status predicting birth weight.

Notice that I’ve included covariate-by-interaction-term interactions in the model to control for possible sources of confounding.

A power analysis

Now that we have this result, we might ask,

  • “How large a sample do I need to replicate this?” Or alternatively,

  • “Am I powered to detect this effect in the current analysis?”.

Both can be answered with a power analysis. If we run a standard power analysis as if this is a simple regression with an independent variable B = -0.16490 (the effect size of the above interaction), we would get:

pwr.r.test(r = -0.16490,power = 0.8,sig.level = 0.05)
## 
##      approximate correlation power calculation (arctangh transformation) 
## 
##               n = 285.5112
##               r = 0.1649
##       sig.level = 0.05
##           power = 0.8
##     alternative = two.sided

Thus, we need n = 285 to detect the interaction effect.

Power analysis with simulation

To do a power analysis for interaction, I used a version of the sample code from the ‘paramtest’ R package vignette. The basic idea here is to simulate data with the given effect-size of interest, and to test how frequently you get a significant result with a given sample size.

As opposed to a power analysis for a regression, where only one effect-size needs to be specified, here we need four:

    1. the interaction term bXM;
    1. & 3) main effects of the two interacting variables bX & bM;
    1. the correlation (r) between X&M (rXM).

All are standardized effect sizes and adjusted for all covariates.

In the present case, bXM = -0.16490, bX = -0.07131, bM = -0.25977, and rXM = 0.5.

## main power simulation function
lm_test_interaction <- function(N, b1, b2, b3, b0=0, r=0,
                                x1m=0, x1sd=1,x2m=0, x2sd=1,
                                model_formula = c("y ~ x1:x2+x1+x2")) { 

  data = mvrnorm(n=N, mu=c(x1m, x2m), Sigma=matrix(c(x1sd, r ,r , x2sd ), nrow=2), empirical=TRUE)
  x1    = data[, 1]  # standard normal (mu=0, sd=1)
  x2    = data[, 2]  # standard normal (mu=0, sd=1)
  
  yvar <- sqrt(1 - b1^2 - b2^2 - b3^2 )  # residual variance
  ym <- (b0 + b1*x1 + b2*x2 + b3*x1*x2)
  y <- rnorm(N, ym , yvar)
  
  model <- lm(as.formula(model_formula))
  
  # pull output from model (two main effects and interaction)
  est_x1 <- coef(summary(model))['x1', 'Estimate']
  p_x1 <- coef(summary(model))['x1', 'Pr(>|t|)']
  sig_x1 <- p_x1 < .05
  est_x2 <- coef(summary(model))['x2', 'Estimate']
  p_x2 <- coef(summary(model))['x2', 'Pr(>|t|)']
  sig_x2 <- p_x2 < .05
  est_int <- coef(summary(model))['x1:x2', 'Estimate']
  p_int <- coef(summary(model))['x1:x2', 'Pr(>|t|)']
  sig_int <- p_int < .05
  
  return(c(est_x1=est_x1, p_x1=p_x1, sig_x1=sig_x1, est_x2=est_x2, p_x2=p_x2,
           sig_x2=sig_x2, est_int=est_int, p_int=p_int, sig_int=sig_int))
}

# run in parallel to speed things up
cl <- makeCluster(4)
clusterExport(cl, list("mvrnorm")) ## need to send mvnorm function to parallel clusters for this to work

power_lm_int <- grid_search(lm_test_interaction, params=list(N=seq(50,500,by=10)),
                            n.iter=10000, output='data.frame', parallel='snow',cl = cl,
                            b1=-0.07131, b2=-0.25977 ,b3=-0.16490, r=-0.5,
                            model_formula = c("y ~ x1:x2+x1+x2") )
## Running 460,000 tests...
stopCluster(cl)

Out<-results(power_lm_int) %>%
  group_by(N.test) %>%
  summarise(
    power_x1=mean(sig_x1),
    power_x2=mean(sig_x2),
    power_int=mean(sig_int), .groups = 'drop')

#the 1st ggplot
  ggplot(data = Out, aes(x = N.test, y = power_int)) +
  geom_hline(yintercept = 0.8, color="red", size=2) + 
  geom_hline(yintercept = 0.05, color="green", size=2) +    
  geom_hline(yintercept = 1, color="green", size=2) + 
  geom_vline(xintercept = 225,color="blue",size=2,linetype="dashed" ) +    
  geom_smooth(colour="black",formula = y ~ x, method = 'loess', se=F, size=1.5) +
  geom_line() +
  geom_point(size=3) +
  xlab(  c("Sample Size")  )  + ylab(c("Power") ) +
  scale_x_continuous(breaks=seq(0,500,by = 50)) +
  scale_y_continuous(breaks=seq(0,1,by = 0.2)) +
  theme_pubr() +
  ggtitle(label = "Simulated Data") +
  font("xylab",size=15) + font("xy",size=15)+ font("xy.text", size = 15) +  font("legend.text",size = 15) +
  theme(
    axis.ticks = element_line(size=2,color="black"),
    axis.ticks.length=unit(0.2,"cm"),
    axis.line.x = element_line(color = "black",size=1.5),
    axis.line.y = element_line(color = "black",size=1.5),
    legend.position=c(.86,.87),
    legend.key=element_rect(size=2),
    legend.key.size = unit(1.5, "lines"))

Boostrapping on actual data to detect the significant effect

Is this power analysis correct? One way of testing this is to ask how likely we would be to detect a significant effect if we had only analyzed a subset of our actual data.

What I did was bootstrap new samples from the original data set. Then I ran the interaction analysis, and tested how frequently p<0.05 was observed.

# Bootstrapped power analysis using the actual data

## bootstrapping function
boot_dass<-function(N,dat = birthwt3){
  sub<-sample(c(1:dim(dat)[1]),size = N,replace = T)
  
  birthwt_t<-dat[sub,]
  
  fit<-lm(birthwt.grams ~ mother.age + mother.weight + white + black + mother.smokes + previous.prem.labor + hypertension + uterine.irr + physician.visits + mother.age:mother.smokes + mother.weight*mother.smokes + (mother.age + mother.weight):(white + black + previous.prem.labor + hypertension + uterine.irr + physician.visits), data=birthwt_t)
  
  est_int <- coef(summary(fit))['mother.age:mother.smokes', 'Estimate']
  p_int<-coef(summary(fit))['mother.age:mother.smokes', 'Pr(>|t|)']
  sig_int <- p_int < .05
  return(c(est_int=est_int, p_int=p_int, sig_int=sig_int))
}

iter = 10000
Ns = seq(50,500,by=10)
count=1
empirical_boot_out<-as.data.frame(matrix(0,(iter*length(Ns)),4))


for(n in 1:length(Ns)){
  for(i in 1:iter){
    out<-c(Ns[n],boot_dass(Ns[n]))
    empirical_boot_out[count,]<-out
    count<-count+1
  }
}

colnames(empirical_boot_out)<-c("N","B","p","sig")

Out<-empirical_boot_out %>%
  group_by(N) %>%
  summarise(power_int=mean(sig), .groups = 'drop')

# 2nd ggplot
  ggplot(data = Out,aes(x = N,y = power_int))+
  geom_hline(yintercept = 0.8,color="red",size=2)+ 
  geom_hline(yintercept = 0.05,color="green",size=2)+    
  geom_hline(yintercept = 1,color="green",size=2)+
  geom_vline(xintercept = 300,color="blue",size=2,linetype="dashed" )+    
  geom_smooth(colour="black",formula = y ~ x,method = 'loess',se=F,size=1.5)+
  geom_line()+
  geom_point(size=3)+
  xlab(  c("Sample Size")  )  + ylab(c("Power") ) +
  scale_x_continuous(breaks=seq(0,500,by = 50)) +
  scale_y_continuous(breaks=seq(0,1,by = 0.2)) +
  theme_pubr()+
  ggtitle(label = "Real Data - Bootstrapped")+
  font("xylab",size=15)+  font("xy",size=15)+ font("xy.text", size = 15) +  font("legend.text",size = 15) +
  theme( 
    axis.ticks = element_line(size=2,color="black"),
    axis.ticks.length=unit(0.2,"cm"),
    axis.line.x = element_line(color = "black",size=1.5),
    axis.line.y = element_line(color = "black",size=1.5),
    legend.position=c(.86,.87),
    legend.key=element_rect(size=2),
    legend.key.size = unit(1.5, "lines"))

The results (below) are non-similar to our simulations (above). This tells us that the parameter (correlation between age and smoking may not be correct). But, the estimate sample size from ‘pwr’ is quite match with the bootstrapping in real data.

Effects of correlation

Our power analysis says N=275-300 for 80% power, to gain the same effect with the real data. So, in the final step, I decided to run simulations on the correlation between 2 main effects (rXM).

Is the correlation the real reason affect our simulation above?

#simulations looking at effects if varying the correlation between X and M
cl <- makeCluster(4)
clusterExport(cl, list("mvrnorm")) ## need to send mvnorm function to parallel clusters 

power_lm_int <- grid_search(lm_test_interaction, params=list(N=seq(50,500,by=50),r=seq(0,-0.8,by=-0.2)),
                            n.iter=1000, output='data.frame', parallel='snow',cl = cl,
                            b1=-0.07131, b2=-0.25977 ,b3=-0.16490,
                            model_formula = c("y ~ x1:x2+x1+x2") )
## Running 50,000 tests...
stopCluster(cl)

Out<-results(power_lm_int) %>%
  group_by(N.test,r.test) %>%
  summarise(
    power_x1=mean(sig_x1),
    power_x2=mean(sig_x2),
    power_int=mean(sig_int), .groups = 'drop')


# 3rd ggplot
  ggplot(data = Out,aes(x = N.test,y = power_int,color=as.factor(r.test)))+
  scale_color_viridis_d(option = c("C"))+
  geom_hline(yintercept = 0.8,color="red",size=2)+ 
  geom_hline(yintercept = 0.05,color="green",size=2)+    
  geom_hline(yintercept = 1,color="green",size=2)+    
  geom_vline(xintercept = pwr.r.test(r = -0.16490,power = 0.8,sig.level = 0.05)$n,color="blue",size=2)+
  geom_smooth(formula = y ~ x,method = 'loess',se=F,size=1.5)+
  geom_line()+
  geom_point(size=3)+
  xlab(  c("Sample Size")  )  + ylab(c("Power") ) +
  scale_x_continuous(breaks=seq(0,500,by = 50)) +
  scale_y_continuous(breaks=seq(0,1,by = 0.2)) +
  theme_pubr()+
  ggtitle(label = "Simulated Data - effect of XM correlation")+
  font("xylab",size=15)+  font("xy",size=15)+ font("xy.text", size = 15) +  font("legend.text",size = 15)+font("legend.title",size = 15)+
  theme( 
    axis.ticks = element_line(size=2,color="black"),
    axis.ticks.length=unit(0.2,"cm"),
    axis.line.x = element_line(color = "black",size=1.5),
    axis.line.y = element_line(color = "black",size=1.5),
    legend.position='top',
    legend.key=element_rect(size=2),
    legend.key.size = unit(1.5, "lines"))

Each dot represents the percent of analyses (100 simulations per dot) that found p<0.05 results for the interaction term, at a given sample size. Sample size ranged from N=50–500, at increments of N=50. Dots are connected by straight lines, and a smoothed curve was added. Green horizontal lines are at Power = 0.05 and Power = 1.00 (5% and 100% power, respectively). The red horizontal line is at 80% power. The blue vertical line is at N=285, the sample size required to detect a interaction effect of -0.16490 with 80% power in a regression.

Obviously, we see that if the correlation between 2 interactions increasing so the sample size decreasing as proportional. In other words, the correlation between main effects, become medium-to-large, a smaller sample is required to detect the same interaction effect. This is why we got a bigger sample size needed to detect the interaction effect in this case.

Final thoughts

  • Sample size determination depends on several factors. Here we just looked at the the essential factors.

  • We should be better powered to detect interactions, relative to similarly-sized main effects.

  • Next steps, I should dig deeper what was beyond the calculation in practical perspective.

References