A researcher consults you about a plan to investigate the effect of Ritalin on study success. He has already recruited two groups of \(30\) participants each.
The researcher expects that the control group will score a \(7.3\) on average, and that the Ritalin group will score a \(7.7\) on average, both with a standard deviation of \(2.5\). Simulate data based on his expectation, and the number of recruited participants (assume the data are normally distributed).
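set.seed(123)  # fix the random seed so the simulated scores are reproducible
participants <- 1:30
control <- numeric(length(participants))
ritalin <- numeric(length(participants))
for (i in seq_along(participants)) {
  # Draw one score per participant in each group (mean, sd as expected by the researcher)
  control[i] <- rnorm(1, 7.3, 2.5)
  ritalin[i] <- rnorm(1, 7.7, 2.5)
}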
control
## [1] 5.898811 11.196771 7.623219 8.452291 5.582868 10.360204 8.301929
## [8] 5.910397 8.544626 9.053390 4.630441 4.734989 5.737402 9.394468
## [15] 4.454658 8.366161 9.537814 9.353953 8.684794 6.535093 5.563233
## [22] 4.136509 10.319905 6.292788 9.249913 7.933296 7.192824 6.735573
## [29] 3.428118 7.609636
ritalin
## [1] 7.124556 7.876271 11.987662 4.537347 6.585845 8.599535 7.976707
## [8] 12.167283 2.783457 6.518021 7.155063 5.877772 3.483267 8.083433
## [15] 10.834537 6.962321 9.895334 9.421601 7.545221 6.748822 7.180207
## [22] 13.122390 4.892229 6.533362 7.491577 7.628633 11.121506 11.491177
## [29] 9.161534 8.239854
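The loop draws one value at a time, but `rnorm` can also draw a whole group in a single call. The following is a minimal sketch of that alternative; the names `n`, `control_vec` and `ritalin_vec` are illustrative and not part of the original answer. Because the loop alternates between the two groups while this version draws each group in one go, the random numbers are consumed in a different order, so the values will not match the output above even with the same seed.

```r
set.seed(123)  # same seed as above, but the draws are assigned in a different order
n <- 30        # participants per group, as stated in the exercise

# Draw all scores of a group in one call instead of looping
control_vec <- rnorm(n, mean = 7.3, sd = 2.5)
ritalin_vec <- rnorm(n, mean = 7.7, sd = 2.5)

# Quick sanity check on the simulated group means
c(control = mean(control_vec), ritalin = mean(ritalin_vec))
```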
Report the p-value for the difference between both groups, given an \(\alpha\) of \(.05\) and a one-sided test (Ritalin > control group). What would you conclude based on this result?
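# One-sided Welch t-test (alternative: mean of ritalin > mean of control).
# Note: conf.level only sets the reported interval and does not affect the p-value.
# For alpha = .05 the matching value would be conf.level = 0.95, so the
# "5 percent confidence interval" in the output should not be interpreted.
t_test <- t.test(ritalin, control, conf.level = .05, alternative = "greater")
t_test  # the p-value is 0.1548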
##
## Welch Two Sample t-test
##
## data: ritalin and control
## t = 1.0252, df = 55.857, p-value = 0.1548
## alternative hypothesis: true difference in means is greater than 0
## 5 percent confidence interval:
## 1.597301 Inf
## sample estimates:
## mean of x mean of y
## 7.967551 7.360536
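Because \(p = 0.1548\) exceeds \(\alpha = .05\), the null hypothesis is not rejected: this sample does not provide significant evidence that the Ritalin group scores higher than the control group.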
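The decision rule can also be written out in code. The sketch below regenerates the same data and compares the p-value with \(\alpha\) directly; `alpha`, `res` and `reject` are illustrative names. It sets `conf.level = 1 - alpha`, which is the interval level that corresponds to an \(\alpha\) of \(.05\); this changes only the reported interval, not the p-value.

```r
set.seed(123)
control <- numeric(30)
ritalin <- numeric(30)
for (i in 1:30) {
  control[i] <- rnorm(1, 7.3, 2.5)
  ritalin[i] <- rnorm(1, 7.7, 2.5)
}

alpha <- 0.05
# One-sided Welch test; the interval level is tied to alpha
res <- t.test(ritalin, control, alternative = "greater", conf.level = 1 - alpha)

res$p.value                    # p-value of the one-sided test
reject <- res$p.value < alpha  # TRUE would mean: reject the null hypothesis
reject
```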
Now do a more formal analysis: repeat the data generation 1000 times and calculate the power of the experiment from your simulation (do not use a built-in power function; base the answer on your simulation).
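control_sim <- numeric(1000)
ritalin_sim <- numeric(1000)
for (i in 1:1000) {
  # Store the mean of one simulated group of 30 per replication
  control_sim[i] <- mean(rnorm(30, 7.3, 2.5))
  ritalin_sim[i] <- mean(rnorm(30, 7.7, 2.5))
}
# Caution: this is a single t-test on the 1000 simulated group means.
# It is not a power estimate; power is the proportion of replications
# whose own t-test on the raw data rejects the null hypothesis.
t_test <- t.test(ritalin_sim, control_sim, conf.level = .05, alternative = "greater")
t_test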
##
## Welch Two Sample t-test
##
## data: ritalin_sim and control_sim
## t = 20.568, df = 1994.5, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 5 percent confidence interval:
## 0.4366496 Inf
## sample estimates:
## mean of x mean of y
## 7.703118 7.298817
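A simulation-based power estimate is usually obtained differently: each simulated experiment is tested on its own, and power is the share of experiments that come out significant. The following is a minimal sketch of that approach under the assumptions stated in the exercise (30 per group, means 7.3 and 7.7, standard deviation 2.5, one-sided \(\alpha\) of \(.05\)). The names `n_sims`, `p_values` and `power_est` are illustrative, and the seed is arbitrary.

```r
set.seed(123)   # arbitrary seed, only for reproducibility
n_sims <- 1000  # number of simulated experiments
n <- 30         # participants per group
alpha <- 0.05

# Run one t-test per simulated experiment and keep its p-value
p_values <- replicate(n_sims, {
  control_i <- rnorm(n, mean = 7.3, sd = 2.5)
  ritalin_i <- rnorm(n, mean = 7.7, sd = 2.5)
  t.test(ritalin_i, control_i, alternative = "greater")$p.value
})

# Estimated power: proportion of experiments that reject the null hypothesis
power_est <- mean(p_values < alpha)
power_est
```

The estimate varies somewhat from run to run; with 1000 replications its Monte Carlo standard error is about one percentage point.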
Based on this result, what should the researcher conclude? (If you did not succeed in the previous question, use the code below to calculate the power; the result will be slightly different.)
power.t.test(30, sd = 1.5, delta = 0.25, alternative = "one.sided")
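The t-test above does not estimate power. It compares 1000 simulated group means per condition, and because each mean has a standard error of only \(2.5/\sqrt{30} \approx 0.46\), the expected difference of \(0.4\) is easily detected. The very small p-value therefore only confirms that the simulated populations differ, which was built into the simulation. Power is the probability that a single experiment with \(30\) participants per group rejects the null hypothesis when the effect is real, so it has to be estimated as the proportion of the 1000 simulated experiments whose own t-test gives \(p < .05\). With an expected difference of \(0.4\) and a standard deviation of \(2.5\) (a standardized effect of \(0.16\)), that proportion is low: a normal approximation puts it at roughly \(15\%\). The researcher should conclude that the experiment is underpowered, that the non-significant result of the first simulated experiment is what one would usually expect even if Ritalin works as assumed, and that considerably more participants are needed to detect an effect of this size reliably.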
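To put the sample size in perspective, `power.t.test` can be called with the values stated in the exercise rather than the slightly different ones in the fallback code. The sketch below computes the analytic power for 30 participants per group and the group size needed for a target power. The target of \(0.80\) is an assumption (a common convention, not something the exercise specifies), and `current` and `plan` are illustrative names.

```r
# Analytic power of the planned design (30 per group, one-sided test)
current <- power.t.test(n = 30, delta = 0.4, sd = 2.5, sig.level = 0.05,
                        type = "two.sample", alternative = "one.sided")
current$power

# Group size needed for an assumed target power of 0.80
plan <- power.t.test(delta = 0.4, sd = 2.5, sig.level = 0.05, power = 0.80,
                     type = "two.sample", alternative = "one.sided")
ceiling(plan$n)  # participants per group, rounded up
```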