If you find this document helpful for your research please do not forget to cite the references listed in this document. For this document itself you can cite Aydın et al. (2018):
Aydın, B., Algina, J., Leite, W. L., & Atilgan, H. (2018). An R Companion: A Compact Introduction for Social Scientists. Ankara: Anı
To examine whether surgical tape or suture is a better method for closing wounds, for each of 20 rats incisions were made on both sides of the spine. One of the wounds was closed by using tape; the other was sutured. The side closed by tape was determined at random. After 10 days the tensile strength of the wounds was measured. The following are the data.
wounds=data.frame(ratid=1:20,
tape=c(6.59,9.84 ,3.97,5.74,4.47,4.79,6.76,7.61,6.47,5.77,
7.36,10.45,4.98,5.85,5.65,5.88,7.77,8.84,7.68,6.89),
suture=c(4.52,5.87,4.60,7.87,3.51,2.77,2.34,5.16,5.77,5.13,
5.55,6.99,5.78,7.41,4.51,3.96,3.56,6.22,6.72,5.17))
# Create plot data
library(tidyr)
plotdata=gather(wounds, method, strength, tape:suture, factor_key=TRUE)
require(ggplot2)
ggplot(plotdata, aes(x = strength)) +
geom_histogram(aes(y = ..density..),col="black",alpha=0.7) +
geom_density(size=2) +
theme_bw()+labs(x = "strength")+ facet_wrap(~ method)+
theme(axis.text=element_text(size=15),
axis.title=element_text(size=14,face="bold"))
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
Wounds example
The following are the steps for conducting the dependent groups t-test and R code for implementing the steps
\[t=\frac{\bar{Y_1}-\bar{Y_2}}{\sqrt{\frac{S_1^2+S_2^2-2S_1 S_2 r_{12}}{n}}}\]
library(psych)
descDT=with(wounds,describe(cbind(tape,suture)))
descDT
## vars n mean sd median trimmed mad min max range skew kurtosis
## tape 1 20 6.67 1.71 6.53 6.54 1.45 3.97 10.45 6.48 0.55 -0.45
## suture 2 20 5.17 1.49 5.16 5.19 1.30 2.34 7.87 5.53 -0.08 -0.87
## se
## tape 0.38
## suture 0.33
corDT=with(wounds,cor(tape,suture,use="complete.obs"))
corDT
## [1] 0.3536491
# estimated standard error
ese=sqrt(((1.71^2+1.49^2)-(2*1.71*1.49*corDT))/(20))
# t-statistic
tstatistic=(6.67-5.17)/ese
# critical value for alpha=0.05
qt(.975,df=19)
## [1] 2.093024
Given 3.67 is grater than the critical value of \(t_{.975,19}=2.09\) , \(H_0\) is rejected
A more convenient R code would be;
library(psych)
with(wounds, t.test(tape,suture,paired=T,
alternative="two.sided",
conf.level=0.95))
##
## Paired t-test
##
## data: tape and suture
## t = 3.6678, df = 19, p-value = 0.001636
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.6429426 2.3520574
## sample estimates:
## mean difference
## 1.4975
A dependent groups t-test showed that the tensile strength after surgical tape (mean=6.67, SD=1.71, skew=0.55, kurtosis=-0.45) was statistically different than the tensile strength after the suture (mean=5.17, SD=1.49, skew=-0.08, kurtosis=-0.87), t(19)=3.67, p=0.002 ,r=0.35. The 95% confidence interval was [0.64,2.35].
The score difference (\(Y_{1i} - Y_{2i}\)) should be normally distributed and the difference scores should be independent.However,the dependent t test is expected to be robust to normality with large sample sizes.
When the departures from the normality is severe, a percentile bootstrap procedure can be employed (Wilcox, 2012, p. 201).
#Calculate 95% CI using bootstrap (normality is not assumed)
set.seed(04012017)
B=5000 # number of bootstraps
alpha=0.05 # alpha
wounds=data.frame(ratid=1:20,
tape=c(6.59,9.84 ,3.97,5.74,4.47,4.79,6.76,7.61,6.47,5.77,
7.36,10.45,4.98,5.85,5.65,5.88,7.77,8.84,7.68,6.89),
suture=c(4.52,5.87,4.60,7.87,3.51,2.77,2.34,5.16,5.77,5.13,
5.55,6.99,5.78,7.41,4.51,3.96,3.56,6.22,6.72,5.17))
output=c()
for (i in 1:B){
#sample rows
bs_rows=sample(wounds$ratid,replace=T,size=nrow(wounds))
bs_sample=wounds[bs_rows,]
mean1=mean(bs_sample$tape)
mean2=mean(bs_sample$suture)
output[i]=mean1-mean2
}
output=sort(output)
## Uni-directional
# d star lower
output[as.integer(B*alpha/2)+1]
## [1] 0.717
# d star upper
output[B-as.integer(B*alpha/2)]
## [1] 2.2755
##Directional x2>x1
# d star lower
output[as.integer(B*alpha)+1]
## [1] 0.8485
#wrong direction x2<x1
# d star upper
output[as.integer(B*(1-alpha))]
## [1] 2.1495
The tensile strength after surgical tape (mean=6.67, SD=1.71, skew=0.55, kurtosis=-0.45) was statistically different than the tensile strength after the suture (mean=5.17, SD=1.49, skew=-0.08, kurtosis=-0.87) given that the 95% confidence interval was [0.667,2.2555].1:
A simple effect size formulae for a dependent t test is (Equation 7 in Lakens (2013))2;
\[ES=\frac{t}{\sqrt{n}}\]
## the normality and the equal variances assumptions are made
## given the robust procedures provided roughly the same results
n=20
tval=3.6678
ES=tval/sqrt(n)
ES
## [1] 0.820145
library(effsize)
cohen.d(wounds$tape,wounds$suture,
paired=T, conf.level=0.95,noncentral=F)
##
## Cohen's d
##
## d estimate: 0.9324681 (large)
## 95 percent confidence interval:
## lower upper
## 0.3159907 1.5489454
The effsize package (Torchiano, 2016) reported an effect size of 0.820 and the 95% CI was [0.135, 1.505]
To be added
To be added
The basics of statistical power were provided in earlier slides.
#power.t.test
power.t.test(delta=.35, sd=.6,sig.level=0.05, power=0.95,
type="paired", alternative="two.sided")
##
## Paired t test power calculation
##
## n = 40.16447
## delta = 0.35
## sd = 0.6
## sig.level = 0.05
## power = 0.95
## alternative = two.sided
##
## NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs
This illustration shows that for the pre-determined knowns of a mean difference of 0.35, a standard deviation of 0.6, an alpha level of 0.05, a non-directional test and a desired power of 0.95, the sample size (number of pairs) should be 41. In other words, the probability of rejecting \(H_0:\mu_1-\mu_2=0\), the null is .95 with a sample size of 41, a mean difference of 0.35, SD=0.6, alpha=0.05 and a non-directional paired t-test.
We first present examples of designs commonly used in studies in the social and behavioral sciences to compare two means. The steps used in such studies are
An important distinction in selecting a statistical test is whether the scores in the two treatments are correlated or independent. We classify the designs by whether the scores in the two treatments are correlated or independent. Then we turn to a presentation of terminology for describing designs. This terminology facilitates discussion of designs and determining the correct data analysis procedure to use with a design.
| Device | |
|---|---|
| Magnetic | Inactive |
| . | |
| . | |
| . |
Note that there is no way to pair the scores and therefore the scores cannot be correlated.
The descriptive statistics were calculated with the psych package (Revelle, 2016) and the non-directional percentile bootstrap method with 5000 replications was conducted with the base package (R Core Team, 2016).↩︎
it goes to infinity as r goes to 1 even when the means are very similar. Equation 10 in Lakens (2013) is more appropriate which is \(\frac{mean difference}{(SD_1+SD_2)/2}\)↩︎