Exercise 21

Load dataset and packages

#install.packages("ggpubr")
library(ggpubr)

## Loading required package: ggplot2

neckshoulder <- read.table('neckshoulder.txt', header = TRUE)

before <- neckshoulder[,1]
after <- neckshoulder[,2]

Summary Statistics

mu<-rbind(mean(before),mean(after))
sds<-rbind(sd(before),sd(after))
diff <- after - before 
mu_d<-mean(diff)
sd_d<-sd(diff)
results1<-cbind(mu,sds)
result<-rbind(results1,c(mu_d,sd_d))
colnames(result)<-c('Mean','SD')
rownames(result)<-c('Before','After','Difference')
round(result,2)

##             Mean    SD
## Before     78.62  9.89
## After      71.88 10.86
## Difference -6.75  8.23

Checking assumptions

Before conducting the test, check the assumption that the paired differences are approximately normally distributed. This can be assessed visually with a histogram or a Q-Q plot, and formally with a normality test.

# To plot a histogram of the paired differences:
hist(diff, main='Histogram of Paired Differences (After - Before)', xlab='Differences')

# To draw a Q-Q plot with a reference line to help assess normality:
ggqqplot(diff)

Both graphs suggest that the paired differences are approximately normally distributed.

shapiro.test(diff)

## 
##  Shapiro-Wilk normality test
## 
## data:  diff
## W = 0.98131, p-value = 0.9731

The Shapiro-Wilk test tests the null hypothesis that the data are normally distributed against the alternative that they are not. If the p-value is above 0.05, we fail to reject the null hypothesis and treat the data as approximately normal. Here the test statistic is W = 0.98131 and the p-value is 0.9731, so normality of the paired differences can be assumed.
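One way to keep the reported statistic and p-value in step with the test output is to read them from the object that `shapiro.test` returns rather than copying them by hand. The sketch below illustrates this on synthetic differences; `d_demo`, `W_stat`, `p_val`, and `normal_ok` are hypothetical names, and the simulated values are not the neckshoulder data.

```r
# Synthetic stand-in for the paired differences (NOT the neckshoulder data)
set.seed(1)
d_demo <- rnorm(16, mean = -6.75, sd = 8.23)

# shapiro.test() returns an object of class "htest"
sw <- shapiro.test(d_demo)
W_stat <- unname(sw$statistic)  # test statistic W
p_val  <- sw$p.value            # p-value

# Decision at the 5% level: normality is plausible when p > 0.05
normal_ok <- p_val > 0.05
cat(sprintf("W = %.4f, p-value = %.4f, normality plausible: %s\n",
            W_stat, p_val, normal_ok))
```

Reporting `W_stat` and `p_val` directly avoids a mismatch between the printed output and the accompanying text.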

Conducting the paired t-test

Step-by-step

\[t = \frac{\bar D}{s / \sqrt{n}}\]

n<-length(diff)
t.stat <- mu_d / (sd_d / sqrt(n))
t.stat

## [1] -3.279057
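As a hand check, the same statistic can be approximated from the rounded summary values in the table above (mean difference -6.75, SD 8.23), taking n = 16 as implied by the 15 degrees of freedom reported by the test. The small discrepancy from -3.279057 comes only from rounding the SD:

\[t \approx \frac{-6.75}{8.23 / \sqrt{16}} = \frac{-6.75}{2.0575} \approx -3.28\]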

# Find the critical value for a two-sided test at alpha = 0.10
df <- n - 1
cv <- qt(p = 0.10/2, df = df, lower.tail = TRUE)
cv

## [1] -1.75305

# p value
p<-2*pt(-abs(t.stat),df)
p

## [1] 0.005072079

# 90% confidence interval for the mean difference
error <- qt(0.95,df)*sd_d/sqrt(n)
left <- mu_d-error
right <- mu_d+error
c(left,right)

## [1] -10.358687  -3.141313
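The interval computed above corresponds to the usual formula for a 90% confidence interval for the mean difference, written here with \(s\) for the standard deviation of the differences as in the test statistic. Plugging in the rounded summary values (an approximation, with n = 16 inferred from df = 15) reproduces the result to two decimals:

\[\bar D \pm t_{0.95,\,n-1}\,\frac{s}{\sqrt{n}} \approx -6.75 \pm 1.753 \times \frac{8.23}{\sqrt{16}} \approx -6.75 \pm 3.61 = (-10.36,\ -3.14)\]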

# Is H0 rejected? TRUE if |t| exceeds the critical value
(abs(t.stat) > abs(cv))

## [1] TRUE

Using t.test command

t.test(after, before, alternative = "two.sided", paired = TRUE, conf.level = 0.9)

## 
##  Paired t-test
## 
## data:  after and before
## t = -3.2791, df = 15, p-value = 0.005072
## alternative hypothesis: true difference in means is not equal to 0
## 90 percent confidence interval:
##  -10.358687  -3.141313
## sample estimates:
## mean of the differences 
##                   -6.75

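Since the p-value (0.005) is below the significance level of 0.10, and the 90% confidence interval does not contain 0, we reject the null hypothesis. The data suggest that the true average time during which elevation is below 30 degrees differs after the change from what it was before the change.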
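The agreement between the step-by-step calculation and `t.test` can also be checked programmatically. The following is a minimal sketch on synthetic paired data, not the neckshoulder measurements; `paired_t_manual`, `before_demo`, and `after_demo` are hypothetical names introduced only for illustration.

```r
# Synthetic paired data (hypothetical; NOT the neckshoulder measurements)
set.seed(42)
before_demo <- rnorm(16, mean = 78, sd = 10)
after_demo  <- before_demo + rnorm(16, mean = -6, sd = 8)

# Hypothetical helper: paired t-test computed by hand from the differences
paired_t_manual <- function(x, y, conf.level = 0.90) {
  d  <- x - y                      # paired differences
  n  <- length(d)
  se <- sd(d) / sqrt(n)            # standard error of the mean difference
  t_stat <- mean(d) / se
  df <- n - 1
  p  <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  half <- qt(1 - (1 - conf.level) / 2, df) * se
  list(t = t_stat, df = df, p = p,
       ci = c(mean(d) - half, mean(d) + half))
}

manual  <- paired_t_manual(after_demo, before_demo)
builtin <- t.test(after_demo, before_demo, paired = TRUE, conf.level = 0.90)

# Both routes should agree up to floating-point error
stopifnot(
  isTRUE(all.equal(manual$t, unname(builtin$statistic))),
  isTRUE(all.equal(manual$p, builtin$p.value)),
  isTRUE(all.equal(manual$ci, as.numeric(builtin$conf.int)))
)
```

If any of the three comparisons failed, `stopifnot` would raise an error, so a silent run indicates that the manual and built-in results match.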