1.

A researcher consults you about a plan to investigate the effect of Ritalin on study success. He has already recruited two groups of \(30\) participants each.

a (1 pt)

The researcher expects that the control group will score a \(7.3\) on average, and that the Ritalin group will score a \(7.7\) on average, both with a standard deviation of \(2.5\). Simulate data based on his expectation, and the number of recruited participants (assume the data are normally distributed).

set.seed(123)
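participants <- 1:30
control <- numeric(length(participants))  # preallocate one slot per participant
ritalin <- numeric(length(participants))

# Draw one score per participant from each group's expected distribution
for (i in seq_along(participants)) {
  control[i] <- rnorm(1, 7.3, 2.5)  # control group: mean 7.3, SD 2.5
  ritalin[i] <- rnorm(1, 7.7, 2.5)  # Ritalin group: mean 7.7, SD 2.5
}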
control
##  [1]  5.898811 11.196771  7.623219  8.452291  5.582868 10.360204  8.301929
##  [8]  5.910397  8.544626  9.053390  4.630441  4.734989  5.737402  9.394468
## [15]  4.454658  8.366161  9.537814  9.353953  8.684794  6.535093  5.563233
## [22]  4.136509 10.319905  6.292788  9.249913  7.933296  7.192824  6.735573
## [29]  3.428118  7.609636
ritalin
##  [1]  7.124556  7.876271 11.987662  4.537347  6.585845  8.599535  7.976707
##  [8] 12.167283  2.783457  6.518021  7.155063  5.877772  3.483267  8.083433
## [15] 10.834537  6.962321  9.895334  9.421601  7.545221  6.748822  7.180207
## [22] 13.122390  4.892229  6.533362  7.491577  7.628633 11.121506 11.491177
## [29]  9.161534  8.239854
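
The same data-generating assumptions can also be written without a loop, since rnorm can draw all 30 values in one call. This is a minimal sketch, not the answer's own code; the names control_vec and ritalin_vec are hypothetical. Even with the same seed the values will differ from those printed above, because the loop alternates draws between the two groups while the vectorized calls take 30 consecutive draws per group.

```r
set.seed(123)
n_per_group <- 30  # number of recruited participants per group

# Draw all scores for each group in a single call (mean, SD as expected by the researcher)
control_vec <- rnorm(n_per_group, mean = 7.3, sd = 2.5)
ritalin_vec <- rnorm(n_per_group, mean = 7.7, sd = 2.5)

# Quick check of the simulated group means and standard deviations
c(mean(control_vec), sd(control_vec))
c(mean(ritalin_vec), sd(ritalin_vec))
```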

b (1 pt)

Report the p-value for the difference between both groups, given an \(\alpha\) of \(.05\) and a one-sided test (Ritalin > control group). What would you conclude based on this result?

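# One-sided Welch t-test (H1: Ritalin > control).
# Note: conf.level only sets the coverage of the confidence interval and has no
# effect on the p-value; for alpha = .05 it should be 0.95. The call is shown as
# it was run, which is why the output reports a "5 percent confidence interval".
t_test <- t.test(ritalin, control, conf.level = .05, alternative = "greater")
t_test  # p-value is 0.1548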
## 
##  Welch Two Sample t-test
## 
## data:  ritalin and control
## t = 1.0252, df = 55.857, p-value = 0.1548
## alternative hypothesis: true difference in means is greater than 0
## 5 percent confidence interval:
##  1.597301      Inf
## sample estimates:
## mean of x mean of y 
##  7.967551  7.360536

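Because the p-value (0.1548) is larger than \(\alpha = .05\), the null hypothesis is not rejected: this sample gives no significant evidence that the Ritalin group scores higher than the control group.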
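
To make clear where the reported t, df and p-value come from, the one-sided Welch test can be reproduced by hand. The sketch below is illustrative only: welch_greater is a hypothetical helper, and x and y are freshly simulated example data rather than the samples from a.

```r
# Hypothetical helper: one-sided Welch t-test (H1: mean(x) > mean(y)) computed by hand
welch_greater <- function(x, y) {
  vx <- var(x) / length(x)  # squared standard error of mean(x)
  vy <- var(y) / length(y)  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation for the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  # Upper-tail probability, because the alternative is "greater"
  p <- pt(t_stat, df, lower.tail = FALSE)
  c(t = t_stat, df = df, p.value = p)
}

# Illustrative data under the researcher's assumptions
set.seed(1)
x <- rnorm(30, 7.7, 2.5)
y <- rnorm(30, 7.3, 2.5)

by_hand <- welch_greater(x, y)
by_hand
```

The result should agree with t.test(x, y, alternative = "greater"), whatever value is passed as conf.level.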

c (2 pts)

Now do a more formal analysis, repeating the data generation 1000 times, and calculate the power of the experiment using your simulation (don't use the - function, but base the answer on your simulation).

n_sim <- 1000
p_values <- numeric(n_sim)

for (i in 1:n_sim) {
  # Simulate one complete experiment with 30 participants per group
  control_sim <- rnorm(30, 7.3, 2.5)
  ritalin_sim <- rnorm(30, 7.7, 2.5)
  # Same one-sided Welch t-test as in b; keep only the p-value
  p_values[i] <- t.test(ritalin_sim, control_sim, alternative = "greater")$p.value
}

# Power = proportion of simulated experiments that reject H0 at alpha = .05
power <- mean(p_values < .05)
power  # varies with the seed; expected to be roughly 0.15 for these settings

d (1 pt)

Based on this result, what should the researcher conclude? (If you did not succeed in the previous question, use the code below to calculate the power; the result will be slightly different.)

power.t.test(30, sd = 1.5, delta = 0.25, alternative = "one.sided")

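The researcher should conclude that the planned experiment is underpowered. Power is the probability that the test rejects the null hypothesis when the expected effect is real; in a simulation it is estimated as the proportion of simulated experiments with p < .05. With 30 participants per group and an expected difference of only 0.4 points (SD 2.5, Cohen's d = 0.16), that proportion is low, roughly 0.15, so a non-significant result like the one in b is the most likely outcome even if Ritalin works as expected. A significant t-test on the 1000 simulated group means would not change this: it only shows that averages over many simulated samples differ, not that a single study of this size will detect the effect. To have a reasonable chance of detecting a difference this small, the researcher needs considerably more participants per group.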
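
As a cross-check, the power can also be computed analytically with the expectations from a (difference 0.4, SD 2.5, 30 per group). This is a sketch under two assumptions: power.t.test uses the equal-variance two-sample t-test rather than the Welch test, and the 80% power target in the second call is a common convention, not something stated in the exercise.

```r
# Analytic power for the design in a: n = 30 per group, one-sided, alpha = .05
analytic <- power.t.test(n = 30, delta = 0.4, sd = 2.5, sig.level = 0.05,
                         type = "two.sample", alternative = "one.sided")
analytic$power

# Participants per group needed for 80% power (assumed target, see lead-in)
needed <- power.t.test(power = 0.80, delta = 0.4, sd = 2.5, sig.level = 0.05,
                       type = "two.sample", alternative = "one.sided")
ceiling(needed$n)
```

The first value should be close to the simulated power, and the second shows that several hundred participants per group would be needed for this effect size.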