Topics: Confidence intervals, p-values, power, and errors in inference
In the course syllabus you can find the deadline for the assignment and the suggested literature.
What does Vasishth mean when he states that the population mean is not a random variable? (p. 61)
He means that the population mean is an unknown but fixed parameter: it does not vary. The sample mean, by contrast, is a random variable; we use it to gain insight into the population mean, but it varies from sample to sample and with different sample sizes.
Read chapters 3.6 and 3.7. How do you change the \(\alpha\)-level in the t.test() function?
Through the conf.level argument, which specifies the confidence level \(1 - \alpha\): t.test(x, y, conf.level = 0.95) corresponds to an \(\alpha\) of .05 (the default).
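For example, a minimal sketch with made-up data (x and y are hypothetical samples):
x <- rnorm(20, 5, 1); y <- rnorm(20, 5, 1)
t.test(x, y, conf.level = 0.95)  # alpha = .05 (the default)
t.test(x, y, conf.level = 0.90)  # alpha = .10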
During the lecture you measured my height. Describe two ways to ensure that your collective guess is closer to my true height.
Given that \(\bar{x} = 5\), \(SD = 2\) and the sample size is 50, what is the 90% CI?
# Critical z-value for a 90% CI: alpha = .10, so .05 in each tail
qnorm(.05, 0, 1)
## [1] -1.644854
# The standard error of the mean: SD/sqrt(n)
2/sqrt(50)
## [1] 0.2828427
# Lower bound of the interval
5 - 1.644854 * 0.2828427
## [1] 4.534765
# Upper bound of the interval
5 + 1.644854 * 0.2828427
## [1] 5.465235
# So the 90% confidence interval runs from 4.534765 to 5.465235
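The same interval can be computed in one step; a small sketch with \(\bar{x}\), \(SD\) and \(n\) taken from the exercise:
xbar <- 5; s <- 2; n <- 50
alpha <- .10  # 90% CI
xbar + qnorm(c(alpha/2, 1 - alpha/2)) * s/sqrt(n)
## [1] 4.534765 5.465235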
What is the probability that 0 (a hypothesis about the true population mean) is located within this CI?
The probability is zero: 0 does not lie between 4.53 and 5.47. Once computed, the interval is fixed, so a hypothesized value is either inside it or not.
What happens with the CI if the sample size increases?
The CI narrows as the sample size increases, because the standard error \(SD/\sqrt{n}\) shrinks.
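A small sketch of this, reusing \(\bar{x} = 5\) and \(SD = 2\) from the previous exercise and letting only the sample size grow:
for (n in c(50, 200, 1000)) {
  se <- 2/sqrt(n)
  cat("n =", n, ": from", 5 - 1.644854 * se, "to", 5 + 1.644854 * se, "\n")
}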
How big is the CI if we have data from the full population?
When you have data from the full population you can calculate the true mean directly, and once you know the true mean you don't need a confidence interval. If you computed one anyway, there would be no sampling uncertainty left: the interval collapses onto the true mean, and you are 100% certain that the mean lies within it.
Assume \(\alpha = .5\). What is the probability that \(\mu\) is located within the estimated CI, when we keep repeating the experiment? Check your answer with some simulation based on: \(\mu = 10\), \(\sigma = 3\) and \(N=50\).
# Critical z-value for alpha = .5: .25 in each tail
qnorm(.25, 0, 1)
## [1] -0.6744898
# Simulation: 100 replications
bounds <- matrix(NA, 100, 3)
for (i in 1:100) {
  mysample <- rnorm(50, 10, 3)  # one sample of N = 50
  se_mysample <- sd(mysample)/sqrt(length(mysample))
  left_border <- mean(mysample) - 0.6744898 * se_mysample
  right_border <- mean(mysample) + 0.6744898 * se_mysample
  in_interval <- 10 > left_border & 10 < right_border  # 1 = yes; 0 = no
  bounds[i, ] <- c(left_border, right_border, in_interval)
}
bounds[, 3]
## [1] 1 1 1 0 1 1 0 0 0 0 0 1 0 1 0 1 1 0 1 1 1 0 0 0 0 1 1 1 0 1 1 0 1 1 0 1 1
## [38] 0 0 0 1 1 0 0 1 0 1 1 1 1 0 0 1 1 0 0 0 0 1 1 0 1 0 1 1 1 1 1 0 1 1 1 0 0
## [75] 1 1 0 1 0 1 1 1 0 1 1 1 0 1 0 0 1 1 0 0 1 1 0 1 0 1
hist(bounds[, 3])
As you can see, in a simulation of 100 replications with an \(\alpha\) of .5, the population mean falls inside the confidence interval about half of the time. Now let's increase the number of simulations to get a more accurate picture.
bounds <- matrix(NA, 10000, 3)
for (i in 1:10000) {
  mysample <- rnorm(50, 10, 3)  # one sample of N = 50
  se_mysample <- sd(mysample)/sqrt(length(mysample))
  left_border <- mean(mysample) - 0.6744898 * se_mysample
  right_border <- mean(mysample) + 0.6744898 * se_mysample
  in_interval <- 10 > left_border & 10 < right_border  # 1 = yes; 0 = no
  bounds[i, ] <- c(left_border, right_border, in_interval)
}
hist(bounds[, 3])
As you can see, the proportion is now very close to 50/50. This makes sense: with \(\alpha = .5\) the confidence level is \(1 - \alpha = .5\), so in repeated sampling the interval captures the population mean with probability .5.
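The coverage can also be read off directly as a proportion, using the bounds matrix filled in the loop above:
mean(bounds[, 3])  # proportion of intervals containing mu = 10; close to .5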
Vasishth & Broe’s questions about the literature.
A 95% confidence interval has a ?% chance of describing the sample mean:
A) 95%
B) 100% <- correct (the interval is centered on the sample mean, so it always contains it)
For the same data, a 90% CI will be wider than a 95% CI.
A) True
B) False <- correct (a 90% CI uses a smaller critical value, so it is narrower than a 95% CI)
In the lecture we showed that p-values are uniformly distributed given that \(H_{0}\) is true. Now describe the distribution if \(H_{0}\) is false. Give an explanation.
When the null hypothesis is false, the distribution of p-values is no longer uniform but heavily right-skewed: the density piles up near 0 and decays roughly exponentially. This is because, with a real effect, the test statistic tends to be far from what \(H_{0}\) predicts, so small p-values become much more likely than large ones.
Use the R code from the lecture to investigate different scenarios. Explain in every scenario why the distribution of the p-values does or doesn’t change, and show how you’ve altered the code.
p_values <- numeric()
N <- 5  # first sample of 5 students
for (i in 1:10000) {
  s1 <- rnorm(N, 7, 1); s2 <- rnorm(N, 7, 1)  # equal means: H0 is true
  p_values[i] <- t.test(s1, s2)$p.value
}
h <- hist(p_values, br = 50)
How is the distribution of p-values changed if:
there is an effect of Ritalin on study success?
By changing the mean of s1 from 7 to 6 we introduce a real effect. The histogram no longer shows equal probabilities; the p-values are now skewed towards small values.
p_values_a <- numeric()
N <- 5  # first sample of 5 students
for (i in 1:10000) {
  s1 <- rnorm(N, 6, 1); s2 <- rnorm(N, 7, 1)  # true difference of 1
  p_values_a[i] <- t.test(s1, s2)$p.value
}
h3a <- hist(p_values_a, br = 50, main = "h3a")
if this effect is bigger than the effect in question a?
p_values_b <- numeric()
N <- 5  # first sample of 5 students
for (i in 1:10000) {
  s1 <- rnorm(N, 5, 1); s2 <- rnorm(N, 7, 1)  # true difference of 2
  p_values_b[i] <- t.test(s1, s2)$p.value
}
par(mfcol = c(1, 2))
hist(p_values_a, br = 50, main = "h3a")
h3b <- hist(p_values_b, br = 50, main = "h3b")
By changing the mean of s1 from 6 to 5 the effect gets bigger, which shows in the histogram as an even stronger skew towards low p-values: the larger the true difference, the smaller the probability of observing such data if the means were actually equal.
if the power is bigger than the power in question a?
Power can be increased by increasing the sample size:
p_values_c <- numeric()
N <- 10  # sample of 10 students per group
for (i in 1:10000) {
  s1 <- rnorm(N, 6, 1); s2 <- rnorm(N, 7, 1)
  p_values_c[i] <- t.test(s1, s2)$p.value
}
par(mfcol = c(1, 2))
hist(p_values_a, br = 50, main = "h3a")
h3c <- hist(p_values_c, br = 50, main = "h3c")
Here, too, there is a stronger skew towards small p-values: the larger sample size shrinks the standard error, so chance differences between the means play a smaller role and the true effect is detected more often.
if there is an effect and you use a one-sided rather than two-sided test?
p_values_d <- numeric()
N <- 5  # first sample of 5 students
for (i in 1:10000) {
  s1 <- rnorm(N, 6, 1); s2 <- rnorm(N, 7, 1)
  p_values_d[i] <- t.test(s1, s2, alternative = "less")$p.value
}
par(mfcol = c(1, 2))
hist(p_values_a, br = 50, main = "h3a")
h3d <- hist(p_values_d, br = 50, main = "h3d")
A stronger skew towards small p-values than in the two-sided test: the one-sided test puts all of the rejection probability in the tail where the effect actually lies, so whenever the observed difference is in the predicted direction the one-sided p-value is half the two-sided one.
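A quick sketch to verify this on a single pair of samples (drawn as in the simulations above):
s1 <- rnorm(5, 6, 1); s2 <- rnorm(5, 7, 1)
t.test(s1, s2)$p.value                        # two-sided
t.test(s1, s2, alternative = "less")$p.value  # exactly half the two-sided value
# whenever the observed difference is in the predicted direction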
if \(\alpha\) is .5 rather than .05?
p_values_e <- numeric()
N <- 5  # first sample of 5 students
for (i in 1:10000) {
  s1 <- rnorm(N, 6, 1); s2 <- rnorm(N, 7, 1)
  p_values_e[i] <- t.test(s1, s2, conf.level = 0.5)$p.value
}
par(mfcol = c(1, 2))
hist(p_values_a, br = 50, main = "h3a")
h3e <- hist(p_values_e, br = 50, main = "h3e")
Explanation: there is no visible difference. The conf.level argument only changes the reported confidence interval, not the test statistic, so the p-values are identical; \(\alpha\) only determines the cutoff below which we call a p-value significant.
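A quick sketch to check this on one pair of samples (drawn as above):
s1 <- rnorm(5, 6, 1); s2 <- rnorm(5, 7, 1)
t.test(s1, s2)$p.value                    # default conf.level = .95
t.test(s1, s2, conf.level = 0.5)$p.value  # identical p-value; only the reported CI differs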
Does the confidence interval of a sample mean change if you choose a different \(H_{0}\)? Explain.
No. The confidence interval depends on the sample mean, the standard deviation, and the sample size; \(H_{0}\) plays no role in its computation.
Does the confidence interval of a sample mean change if you choose a different \(\alpha\)? Explain.
Yes. To calculate the confidence interval you need a critical z-value, and that value depends on the chosen \(\alpha\): the higher the \(\alpha\), the smaller the critical value and the narrower the confidence interval.
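Comparing the critical values (qnorm defaults to the standard normal):
qnorm(.25)   # alpha = .50 -> z = -0.67, narrowest interval
qnorm(.05)   # alpha = .10 -> z = -1.64
qnorm(.025)  # alpha = .05 -> z = -1.96, widest of the three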
True or false? The p-value is the probability of the null hypothesis being true.
False. The p-value is the probability of observing data at least this extreme given that the null hypothesis is true, not the probability that the null hypothesis is true.
True or false? The p-value is the probability that the result occurred by chance.
True only in a loose sense: the p-value is computed under the assumption that chance alone (\(H_{0}\)) is at work, and gives the probability of a result at least this extreme under that assumption. Strictly speaking, it is not the probability that chance was the cause of the result.
Imagine that 100 researchers study whether vegetarians have a higher IQ, when in reality there is no effect. How often do you expect a significant result nonetheless? Assume that the researchers perform proper studies with \(\alpha = .05\).
In about 5 cases: with \(\alpha = .05\), 5% of the 100 studies are expected to give a false positive.
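A small sketch of those 100 studies; the group size of 30 and the IQ scale (mean 100, SD 15) are made-up values for illustration:
p <- replicate(100, t.test(rnorm(30, 100, 15), rnorm(30, 100, 15))$p.value)
sum(p < .05)  # around 5 in expectation, varying from run to run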
What statistical error in the conclusion of the researchers is made when a significant result is found?
A type I error (a false alarm): the null hypothesis is rejected even though it is true, so the significant result is due to chance rather than an actual effect.
Assume journals only publish significant effects. Is the observed effect size in those journals different from the true effect size? Explain.
Yes, the published effect sizes will tend to overestimate the true effect size. If only significant results get published, then, especially in small or p-hacked studies, it is exactly the samples that happened to show an unusually large difference that pass the significance threshold, while the samples showing small or no differences stay unpublished.
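A sketch of this publication filter under the scenario above (no true effect; the group size of 5 is a made-up value), comparing the average absolute difference across all studies with the average among the 'published' significant ones:
res <- replicate(10000, {
  s1 <- rnorm(5, 7, 1); s2 <- rnorm(5, 7, 1)  # true difference = 0
  c(diff = abs(mean(s2) - mean(s1)), p = t.test(s1, s2)$p.value)
})
mean(res["diff", ])                  # all studies
mean(res["diff", res["p", ] < .05])  # significant studies only: much larger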
If you want to increase the probability of finding an effect when in reality there is no effect, should you use many or few participants? Or doesn’t it matter? Why?
Few. With few participants the sample means fluctuate strongly, so a large chance difference between the two groups is more likely to arise and be declared significant. The bigger the sample, the closer the group means get to the population means, and in the population there is no effect.
And if there is an effect?
Many: increase the sample size. There is a real effect in the population, so a larger sample tracks the population more closely and gives more power to detect the actual effect.
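This can be made concrete with power.t.test; the effect size of one standard deviation is a made-up value matching the simulations above:
power.t.test(n = 5, delta = 1, sd = 1)$power   # low power with 5 per group
power.t.test(n = 20, delta = 1, sd = 1)$power  # much higher with 20 per group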
You read an article about the influence of Ritalin on study success. The researchers don’t find a difference between students that use Ritalin and students that don’t use Ritalin. You notice that the research was done with very few participants and thus low power. The researchers conclude that Ritalin has no effect.
What do you think of this conclusion? Do you share their opinion or would you conclude something else? If so, what would you conclude?
For this study I would retain the null hypothesis, but that is not the same as concluding that Ritalin has no effect on study success: a non-significant result in a low-powered study is weak evidence of absence. Replicating the study with more power would give more certainty about the effect of Ritalin on study success.
You read an article about the influence of Ritalin on study success. The researchers do find a difference between students that use Ritalin and students that don’t use Ritalin. You notice that the research was done with very few participants and thus low power. The researchers conclude that Ritalin has an effect. Assume it’s a robust experiment and we trust the researchers. How could they have found an effect, even in such a small sample?
Because in such a small sample the observed difference between the groups can, purely by chance, be much larger than the true difference: a low-powered study only reaches significance when the sample happens to show a large effect, which can occur even when the actual effect is small or absent.
Vasishth & Broe’s questions about the literature.
Suppose you carry out a between-participants experiment where you have a control and treatment group, say 20 participants in each sample. You carry out a two-sample t-test and find that the result is \({t}(18)=2.7, p<.01\). Which of the statements below are true?
You have absolutely disproved the null hypothesis.
No, you haven't disproved it absolutely. You reject the null hypothesis because data this extreme would occur with probability smaller than .01 if it were true, but that rejection can still be a type I error.
The probability of the null hypothesis being true is 0.01.
No. The p-value is not the probability that the null hypothesis is true, whether written as \(p = .01\) or \(p < .01\); it is the probability of obtaining data at least this extreme given that the null hypothesis is true.
You have absolutely proved that there is a difference between the two means.
No. A significant result does not prove a difference absolutely; it only shows that the observed difference would be very unlikely (\(p < .01\)) if the two means were actually equal.
You have a reliable experimental finding in the sense that if you were to repeat the experiment 100 times, in 99% of the cases you would get a significant result.
No. The p-value says nothing about the probability of replication: how often a repetition of the experiment would come out significant is determined by the power of the design, not by \(1 - p\).