Independent-samples t-test

In order to run our initial independent-samples t-test, we load, inspect, and split the depression data by condition:

data = read.table("http://matt.colorado.edu/teaching/stats/labs/lab7-1.txt", header=TRUE)
data #display data
##       condition score
## 1       control    35
## 2       control    21
## 3       control    13
## 4  experimental    21
## 5  experimental    30
## 6  experimental    23
## 7  experimental    14
## 8  experimental    25
## 9  experimental    33
## 10      control    26
## 11 experimental    13
## 12      control    22
## 13 experimental    11
## 14      control    14
## 15 experimental     9
## 16 experimental    28
## 17      control    27
## 18      control    30
## 19      control    16
## 20 experimental    15
## 21      control    25
## 22 experimental    28
## 23      control    30
## 24      control    31
## 25 experimental    28
## 26 experimental    26
## 27 experimental    19
## 28      control    17
## 29      control    11
## 30      control    30
summary(data)
##         condition      score      
##  control     :15   Min.   : 9.00  
##  experimental:15   1st Qu.:15.25  
##                    Median :24.00  
##                    Mean   :22.37  
##                    3rd Qu.:28.00  
##                    Max.   :35.00
cont.group = data$score[data$condition=="control"] #pick out scores from control group participants
exp.group = data$score[data$condition=="experimental"] #pick out scores from experimental group participants

Altogether the data consist of 30 observations, listed in no particular order, each recording a participant's condition (‘experimental’ or ‘control’) and depression score. Now that we have separated the scores by condition, we can compute the difference in group means along with each group's standard error of the mean:

meanDiff = mean(cont.group)-mean(exp.group) #compute and store sample mean difference in variable 'meanDiff'
meanDiff
## [1] 1.666667
SEM.cont = sd(cont.group)/sqrt(length(cont.group)) #compute and store standard error of control group scores in variable 'SEM.cont'
SEM.cont
## [1] 1.949847
SEM.exp = sd(exp.group)/sqrt(length(exp.group)) #compute and store standard error of experimental group scores in variable 'SEM.exp'
SEM.exp
## [1] 1.963638

It appears that the difference in sample means is smaller than the standard error of either group's mean, suggesting that the observed difference could easily be due to sampling variability alone. To run the full test and check this impression, we can apply the \(t.test\) function to our data:

t.test(cont.group,exp.group,var.equal=TRUE) #var.equal=TRUE constrains the variances of both samples to be equal (note that this assumption doesn't always hold)
## 
##  Two Sample t-test
## 
## data:  cont.group and exp.group
## t = 0.60228, df = 28, p-value = 0.5518
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4.001827  7.335160
## sample estimates:
## mean of x mean of y 
##  23.20000  21.53333
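
For comparison with the paired-samples test below, we could also compute this pooled-variance t statistic "by hand" from the quantities we have already stored. This is just a sketch to show where the numbers come from; the variable names (pooledVar, SEdiff, tByHand, pByHand) are ours, not part of the original lab:

nc = length(cont.group) #control group size (15)
ne = length(exp.group) #experimental group size (15)
pooledVar = ((nc-1)*var(cont.group) + (ne-1)*var(exp.group))/(nc+ne-2) #pooled variance estimate
SEdiff = sqrt(pooledVar*(1/nc + 1/ne)) #standard error of the difference in means
tByHand = meanDiff/SEdiff #should match the t of 0.60228 reported above
pByHand = pt(-abs(tByHand), nc+ne-2)*2 #two-tailed p-value; should match p = .5518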

With \(p = .5518\), we cannot reject the null hypothesis; our data provide no evidence that the SSRIs reliably reduce depression. If we were planning on presenting our results in a more formal way, we could write something like this:

An independent-samples t-test showed the average depression score for patients given the medication, \(M_{experimental} = 21.53\), and the average for patients given a placebo, \(M_{control}\) = 23.20, were not reliably different, \(t(28) = 0.60, p = .55\).

If our results had gone the other way, we could write something like this:

An independent-samples t-test showed the average depression score for patients given the medication, \(M_{experimental} = 20.53\), and the average for patients given a placebo, \(M_{control} = 26.30\), reliably differed, \(t(28) = 2.08, p = .046\).

Paired-samples t-test

In order to run our paired-samples t-test, we again load and inspect our data:

data2 = read.table("http://matt.colorado.edu/teaching/stats/labs/lab7-2.txt", header=TRUE)
data2 #display data
##    medication placebo
## 1          33      35
## 2          14      16
## 3          26      25
## 4          29      29
## 5          20      19
## 6          30      32
## 7          13      14
## 8          12      13
## 9          19      22
## 10         21      23
## 11         20      22
## 12         23      27
## 13         24      28
## 14         32      34
## 15         27      28
summary(data2)
##    medication       placebo     
##  Min.   :12.00   Min.   :13.00  
##  1st Qu.:19.50   1st Qu.:20.50  
##  Median :23.00   Median :25.00  
##  Mean   :22.87   Mean   :24.47  
##  3rd Qu.:28.00   3rd Qu.:28.50  
##  Max.   :33.00   Max.   :35.00
n = length(data2$medication) #store sample size (15)

Since we are running a paired-samples t-test, we now compute difference scores for each of the participants in the study and, in turn, the standard error of these scores:

diff = data2$placebo - data2$medication
diff #we see that depression scores are generally higher during the placebo treatment period
##  [1]  2  2 -1  0 -1  2  1  1  3  2  2  4  4  2  1
SEMdiff = sd(diff)/sqrt(n)
SEMdiff
## [1] 0.3879126

With these values in hand, we can now run a paired-samples t-test both by hand and automatically:

t = mean(diff)/SEMdiff #compute t "by hand"
t
## [1] 4.12464
p = pt(-abs(t),n-1) * 2 #compute two-tailed p "by hand": twice the area below -|t| in a t distribution with n-1 degrees of freedom
p
## [1] 0.001031319
t.test(diff,mu=0) #run the test "automatically"
## 
##  One Sample t-test
## 
## data:  diff
## t = 4.1246, df = 14, p-value = 0.001031
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.7680102 2.4319898
## sample estimates:
## mean of x 
##       1.6
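
Equivalently, we could hand the two columns directly to \(t.test\) with paired=TRUE, which forms the same difference scores internally (placebo is listed first here so the sign matches diff above):

t.test(data2$placebo, data2$medication, paired=TRUE) #same t, df, and p-value as the test on 'diff'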

The hand-computed and automatic results agree and tell us that we ought to reject the null hypothesis and conclude that the medication and placebo depression scores reliably differed. If we wanted to present our results more formally, we could write something like this:

A paired-samples t-test showed patients’ average depression score while on the medication, \(M_{medication} = 22.87\), and the average while on placebo, \(M_{placebo} = 24.47\), reliably differed, \(t(14) = 4.12, p = .001\).

The advantage of paired-samples t-tests

Ultimately, paired-samples t-tests are often preferable because they reduce variability: a participant's overall level of depression “cancels out” when we consider that same person's scores in both the placebo and medication conditions. We can see this clearly if we display the raw scores and difference scores side by side:

data2$placebo
##  [1] 35 16 25 29 19 32 14 13 22 23 22 27 28 34 28
data2$medication
##  [1] 33 14 26 29 20 30 13 12 19 21 20 23 24 32 27
diff
##  [1]  2  2 -1  0 -1  2  1  1  3  2  2  4  4  2  1

The raw scores vary substantially from subject to subject, but the difference scores vary much less. Indeed, if we were to treat the raw placebo and medication scores as independent - as if they came from two randomly selected groups of people - then the standard error of each group is far larger than that of the difference scores. Moreover, running an independent-samples t-test on the same data wipes out our paired-samples result - it’s no longer significant!

SEMraw.placebo = sd(data2$placebo)/sqrt(n)
SEMraw.placebo
## [1] 1.783166
SEMraw.medication = sd(data2$medication)/sqrt(n)
SEMraw.medication
## [1] 1.734432
SEMdiff
## [1] 0.3879126
t.test(data2$medication,data2$placebo) #treat the two columns as independent groups (R runs a Welch test by default)
## 
##  Welch Two Sample t-test
## 
## data:  data2$medication and data2$placebo
## t = -0.6432, df = 27.979, p-value = 0.5253
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.695704  3.495704
## sample estimates:
## mean of x mean of y 
##  22.86667  24.46667
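
The reason the difference scores are so much less variable is that each patient's two scores move together: patients who are more depressed overall tend to score high under both treatments. As a check on this intuition (not part of the original lab output, so no results are shown here), we could look at the correlation between the two columns and verify that it is what shrinks the variance of the difference scores:

cor(data2$placebo, data2$medication) #strong positive correlation between each patient's two scores
var(diff) #much smaller than var(data2$placebo) + var(data2$medication)
var(data2$placebo) + var(data2$medication) - 2*cov(data2$placebo, data2$medication) #equals var(diff) exactly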