To run our initial independent-samples t-test, we load and inspect the depression data, then split the scores by condition:
data = read.table("http://matt.colorado.edu/teaching/stats/labs/lab7-1.txt", header=TRUE)
data #display data
## condition score
## 1 control 35
## 2 control 21
## 3 control 13
## 4 experimental 21
## 5 experimental 30
## 6 experimental 23
## 7 experimental 14
## 8 experimental 25
## 9 experimental 33
## 10 control 26
## 11 experimental 13
## 12 control 22
## 13 experimental 11
## 14 control 14
## 15 experimental 9
## 16 experimental 28
## 17 control 27
## 18 control 30
## 19 control 16
## 20 experimental 15
## 21 control 25
## 22 experimental 28
## 23 control 30
## 24 control 31
## 25 experimental 28
## 26 experimental 26
## 27 experimental 19
## 28 control 17
## 29 control 11
## 30 control 30
summary(data)
## condition score
## control :15 Min. : 9.00
## experimental:15 1st Qu.:15.25
## Median :24.00
## Mean :22.37
## 3rd Qu.:28.00
## Max. :35.00
cont.group = data$score[data$condition=="control"] #pick out scores from control group participants
exp.group = data$score[data$condition=="experimental"] #pick out scores from experimental group participants
Altogether the data consist of 30 observations, each with a condition (‘experimental’ or ‘control’) and a depression score, listed in no particular order. Now that we have split the depression scores by condition, we can compute the difference in group means and the standard error of each group:
meanDiff = mean(cont.group)-mean(exp.group) #compute and store sample mean difference in variable 'meanDiff'
meanDiff
## [1] 1.666667
SEM.cont = sd(cont.group)/sqrt(length(cont.group)) #compute and store standard error of control group scores in variable 'SEM.cont'
SEM.cont
## [1] 1.949847
SEM.exp = sd(exp.group)/sqrt(length(exp.group)) #compute and store standard error of experimental group scores in variable 'SEM.exp'
SEM.exp
## [1] 1.963638
The difference in sample means is smaller than the standard error of either group, which suggests that the difference could plausibly be explained by sampling variability alone. To run the full test and check this impression, we can apply R's t.test function to our data:
t.test(cont.group,exp.group,var.equal=TRUE) #var.equal=TRUE assumes both groups have equal variances (note that this isn't generally the case)
##
## Two Sample t-test
##
## data: cont.group and exp.group
## t = 0.60228, df = 28, p-value = 0.5518
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.001827 7.335160
## sample estimates:
## mean of x mean of y
## 23.20000 21.53333
With \(p = .5518\), we cannot reject the null hypothesis, so we conclude that our data provide no evidence that the SSRIs reliably reduce depression scores. If we were planning on presenting our results in a more formal way, we could write something like this:
An independent-samples t-test showed the average depression score for patients given the medication, \(M_{experimental} = 21.53\), and the average for patients given a placebo, \(M_{control} = 23.20\), were not reliably different, \(t(28) = 0.60, p = .55\).
If our results had gone the other way, we could write something like this:
An independent-samples t-test showed the average depression score for patients given the medication, \(M_{experimental} = 20.53\), and the average for patients given a placebo, \(M_{control} = 26.30\), reliably differed, \(t(28) = 2.08, p = .046\).
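The pooled-variance test reported above can also be reproduced by hand, which makes it clearer where \(t\) and the 28 degrees of freedom come from. The following is a minimal sketch, not part of the original lab code: the scores are typed in from the data listing rather than read from the URL, and the names cont.scores, exp.scores, var.pooled, SE.diff, t.pooled and p.pooled are illustrative.

```r
# Scores typed in from the data listing (an assumption made so the
# example runs without downloading the file)
cont.scores <- c(35, 21, 13, 26, 22, 14, 27, 30, 16, 25, 30, 31, 17, 11, 30)
exp.scores  <- c(21, 30, 23, 14, 25, 33, 13, 11, 9, 28, 15, 28, 28, 26, 19)

n.cont <- length(cont.scores)
n.exp  <- length(exp.scores)
df.pooled <- n.cont + n.exp - 2 #degrees of freedom for the pooled test

# Pooled variance: average of the two sample variances, weighted by their df
var.pooled <- ((n.cont - 1) * var(cont.scores) +
               (n.exp - 1) * var(exp.scores)) / df.pooled

# Standard error of the difference between the two group means
SE.diff <- sqrt(var.pooled * (1 / n.cont + 1 / n.exp))

# t statistic and two-tailed p-value "by hand"
t.pooled <- (mean(cont.scores) - mean(exp.scores)) / SE.diff
p.pooled <- 2 * pt(-abs(t.pooled), df.pooled)

# Built-in test, kept for comparison with the by-hand values
builtin <- t.test(cont.scores, exp.scores, var.equal = TRUE)
c(t.by.hand = t.pooled, p.by.hand = p.pooled)
```

The by-hand values should agree with the statistic and p-value that t.test reports when var.equal=TRUE is set.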
To run our paired-samples t-test, we again load and inspect the data:
data2 = read.table("http://matt.colorado.edu/teaching/stats/labs/lab7-2.txt", header=TRUE)
data2 #display data
## medication placebo
## 1 33 35
## 2 14 16
## 3 26 25
## 4 29 29
## 5 20 19
## 6 30 32
## 7 13 14
## 8 12 13
## 9 19 22
## 10 21 23
## 11 20 22
## 12 23 27
## 13 24 28
## 14 32 34
## 15 27 28
summary(data2)
## medication placebo
## Min. :12.00 Min. :13.00
## 1st Qu.:19.50 1st Qu.:20.50
## Median :23.00 Median :25.00
## Mean :22.87 Mean :24.47
## 3rd Qu.:28.00 3rd Qu.:28.50
## Max. :33.00 Max. :35.00
n = length(data2$medication) #store sample size (15)
Since we are running a paired-samples t-test, we now compute difference scores for each of the participants in the study and, in turn, the standard error of these scores:
diff = data2$placebo - data2$medication
diff #we see that depression scores are generally higher during the placebo treatment period
## [1] 2 2 -1 0 -1 2 1 1 3 2 2 4 4 2 1
SEMdiff = sd(diff)/sqrt(n)
SEMdiff
## [1] 0.3879126
With these values in hand, we can now run a paired-samples t-test both by hand and automatically:
t = mean(diff)/SEMdiff #compute t "by hand"
t
## [1] 4.12464
p = pt(-abs(t),n-1) * 2 #compute p "by hand"
p
## [1] 0.001031319
t.test(diff,mu=0) #run the test "automatically"
##
## One Sample t-test
##
## data: diff
## t = 4.1246, df = 14, p-value = 0.001031
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 0.7680102 2.4319898
## sample estimates:
## mean of x
## 1.6
Our results agree and tell us that we ought to reject the null hypothesis and conclude that the medication and placebo depression scores reliably differed. If we wanted to present our results more formally, we could write something like this:
A paired-samples t-test showed patients’ average depression score while on the medication, \(M_{medication} = 22.87\), and the average while on placebo, \(M_{placebo} = 24.47\), reliably differed, \(t(14) = 4.12, p = .001\).
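A one-sample test on the difference scores is the same test that t.test runs when it is given both columns with paired=TRUE, and the confidence interval it reports can be rebuilt from the standard error of the difference scores. The sketch below illustrates both points; the scores are typed in from the data listing rather than read from the URL, and the names diff.scores, n.pairs, paired.result, one.sample.result and ci.by.hand are illustrative.

```r
# Scores typed in from the data listing (an assumption made so the
# example runs without downloading the file)
placebo    <- c(35, 16, 25, 29, 19, 32, 14, 13, 22, 23, 22, 27, 28, 34, 28)
medication <- c(33, 14, 26, 29, 20, 30, 13, 12, 19, 21, 20, 23, 24, 32, 27)

diff.scores <- placebo - medication
n.pairs <- length(diff.scores)

# Two equivalent ways of asking R for the paired test
paired.result     <- t.test(placebo, medication, paired = TRUE)
one.sample.result <- t.test(diff.scores, mu = 0)

# 95% confidence interval "by hand": mean difference plus or minus
# the critical t value times the standard error of the differences
t.crit <- qt(0.975, df = n.pairs - 1)
ci.by.hand <- mean(diff.scores) +
  c(-1, 1) * t.crit * sd(diff.scores) / sqrt(n.pairs)
ci.by.hand
```

Both calls should return the same statistic, degrees of freedom and p-value, and ci.by.hand should match the interval printed by t.test.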
Ultimately, paired-samples t-tests are often preferable because they reduce variability: a participant’s overall level of depression “cancels out” when we compare that same person’s scores in the placebo and medication conditions. We can see this clearly if we display the raw scores and difference scores side-by-side:
data2$placebo
## [1] 35 16 25 29 19 32 14 13 22 23 22 27 28 34 28
data2$medication
## [1] 33 14 26 29 20 30 13 12 19 21 20 23 24 32 27
diff
## [1] 2 2 -1 0 -1 2 1 1 3 2 2 4 4 2 1
The raw scores vary substantially between subjects, but the difference scores vary relatively little. Indeed, if we were to treat the raw placebo and medication scores as independent, as if they came from two randomly selected groups of people, the standard error of each group would be much larger than that of the difference scores. Moreover, running an independent-samples t-test on the same data changes the conclusion of our paired-samples t-test: the difference is no longer significant!
SEMraw.placebo = sd(data2$placebo)/sqrt(n)
SEMraw.placebo
## [1] 1.783166
SEMraw.medication = sd(data2$medication)/sqrt(n)
SEMraw.medication
## [1] 1.734432
SEMdiff
## [1] 0.3879126
t.test(data2$medication,data2$placebo)
##
## Welch Two Sample t-test
##
## data: data2$medication and data2$placebo
## t = -0.6432, df = 27.979, p-value = 0.5253
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.695704 3.495704
## sample estimates:
## mean of x mean of y
## 22.86667 24.46667
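The gap between the two analyses comes from the correlation between each person's two scores: the variance of the difference scores equals the sum of the two variances minus twice their covariance, so a strongly positive covariance removes most of the variability. The sketch below checks this identity on the lab data; the scores are typed in from the data listing rather than read from the URL, and the names var.diff, var.decomposed, SE.paired and SE.independent are illustrative.

```r
# Scores typed in from the data listing (an assumption made so the
# example runs without downloading the file)
placebo    <- c(35, 16, 25, 29, 19, 32, 14, 13, 22, 23, 22, 27, 28, 34, 28)
medication <- c(33, 14, 26, 29, 20, 30, 13, 12, 19, 21, 20, 23, 24, 32, 27)
n.pairs <- length(placebo)

# Variance of the difference scores, computed two ways
var.diff <- var(placebo - medication)
var.decomposed <- var(placebo) + var(medication) - 2 * cov(placebo, medication)

# Correlation between the two measurements on the same participants
r.scores <- cor(placebo, medication)

# Standard error used by the paired test (covariance taken into account)
SE.paired <- sqrt(var.diff / n.pairs)

# Standard error used by the Welch test (scores treated as independent)
SE.independent <- sqrt(var(placebo) / n.pairs + var(medication) / n.pairs)

c(r = r.scores, SE.paired = SE.paired, SE.independent = SE.independent)
```

Because the mean difference is the same in both analyses, the much smaller standard error in the paired case is what drives the larger t statistic.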