Statistical Analysis

Load the required libraries

library(bitops)
library(RCurl)
library(tidyverse)
library(ggplot2)

Get the DryEye table from GitHub and read the CSV file

dryeye <- getURL("https://raw.githubusercontent.com/sookuan/S-SB-Workshop/master/00-R-In-Class/DryEye.csv")
DryEye <- read.csv(text = dryeye)
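
As a minimal sketch of what `read.csv(text = ...)` does with the downloaded string: the `text` argument lets `read.csv` parse CSV content held in a character string instead of opening a file. The string and its values below are made-up stand-ins, not the DryEye data.

```r
# Hypothetical CSV content held in a character string (toy values, not the real data)
csv_text <- "notreat,herb\n2.1,1.4\n1.9,1.2\n2.0,1.0"

# text = tells read.csv to parse the string itself rather than open a file
toy <- read.csv(text = csv_text)

str(toy)   # two numeric columns: notreat and herb
dim(toy)   # 3 rows, 2 columns
```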

Examine the DryEye table

head(DryEye)

##    notreat      herb
## 1 2.372074 1.4766453
## 2 1.957100 1.3228087
## 3 1.996542 0.9665124
## 4 2.087404 1.3491698
## 5 1.815421 1.2702050
## 6 1.402811 1.0759150

str(DryEye)

## 'data.frame':    50 obs. of  2 variables:
##  $ notreat: num  2.37 1.96 2 2.09 1.82 ...
##  $ herb   : num  1.477 1.323 0.967 1.349 1.27 ...

dim(DryEye)

## [1] 50  2

Draw a boxplot of conjunctival redness before herb treatment

DryEye %>%
  gather(treatment, redness, notreat:herb) %>%
  filter(treatment == "notreat") %>%
  ggplot(aes(x=treatment, y = redness)) + 
  geom_boxplot(outlier.colour='red', outlier.shape = 8, outlier.size=4) + 
  labs(title="Conjunctival Redness before Herb Treatment", x = "Before Herb Treatment", y = "Conjunctival Redness")

The “notreat” boxplot shows no outliers.
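
The reshaping step above uses `gather()`, which still works but which tidyr documents as superseded by `pivot_longer()`. A sketch of the equivalent reshape, run on a made-up toy data frame rather than the DryEye values, might look like this. The row order differs from `gather()`, which should not matter for the filtered plots.

```r
library(tidyr)

# Toy stand-in for DryEye (made-up values)
toy <- data.frame(notreat = c(2.1, 1.9, 2.0), herb = c(1.4, 1.2, 1.0))

# Reshape wide -> long: one row per (treatment, redness) pair
long <- pivot_longer(toy, cols = notreat:herb,
                     names_to = "treatment", values_to = "redness")

long
```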

Draw a boxplot of conjunctival redness after herb treatment

DryEye %>%
  gather(treatment, redness, notreat:herb) %>%
  filter(treatment == "herb") %>%
  ggplot(aes(x=treatment, y = redness)) + 
  geom_boxplot(outlier.colour='red', outlier.shape = 8, outlier.size=4) + 
  labs(title="Conjunctival Redness after Herb Treatment", x = "After Herb Treatment", y = "Conjunctival Redness")

The “herb” boxplot shows outliers beyond both the upper and lower whiskers.
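
For reference, a point is drawn as an outlier when it lies more than 1.5 times the interquartile range beyond the box. The sketch below illustrates that rule with base R's `boxplot.stats()` on a made-up vector (not the DryEye data). Note that `geom_boxplot()` computes its hinges from quantiles, so on small samples its fences can differ slightly from those of `boxplot.stats()`.

```r
# Toy vector with one obviously extreme value (made-up data)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

stats <- boxplot.stats(x, coef = 1.5)

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # points lying more than 1.5 * IQR beyond the hinges
```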

Check the normality of the notreat sample

Plot a histogram to understand the distribution

DryEye %>%
  gather(treatment, redness, notreat:herb) %>%
  filter(treatment == "notreat") %>%
  ggplot(aes(x = redness)) +
  geom_histogram()

The histogram appears to follow a normal distribution.

Check the data using qqnorm and qqline

qqnorm(DryEye$notreat, pch = 1)
qqline(DryEye$notreat, col = "red", lwd = 2)

### The points follow a straight line, so the data appear normally distributed
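
To make the "straight line" reading more concrete, the sketch below rebuilds the two axes of a normal Q-Q plot by hand and reports their correlation. It uses a simulated sample as a stand-in (not the DryEye data), and the correlation is only an informal summary of straightness, not a formal test.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rnorm(50)                   # simulated stand-in sample (not the DryEye data)

theoretical <- qnorm(ppoints(length(x)))  # expected normal quantiles (x-axis of qqnorm)
observed    <- sort(x)                    # ordered sample values (y-axis of qqnorm)

# A correlation close to 1 corresponds to points lying near a straight line
cor(theoretical, observed)
```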

Check the normality of the herb sample

Plot a histogram to understand the distribution

DryEye %>%
  gather(treatment, redness, notreat:herb) %>%
  filter(treatment == "herb") %>%
  ggplot(aes(x = redness)) +
  geom_histogram()

The histogram appears to follow a normal distribution if the 2 extreme outliers are excluded.

Check the data using qqnorm and qqline (shapiro.test is applied further below)

qqnorm(DryEye$herb, pch = 1)
qqline(DryEye$herb, col = "red", lwd = 2)

### The points fit the straight line if the outliers are excluded, so the data appear approximately normally distributed

Check the normality of the difference between the notreat and herb samples

Compare “notreat” and “herb” in a qqplot

qqplot(DryEye$notreat, DryEye$herb)

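Plotted against each other, the “notreat” and “herb” quantiles almost fit a straight line, which suggests the two samples have similarly shaped distributions.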

For “notreat”

shapiro.test(DryEye$notreat)

## 
##  Shapiro-Wilk normality test
## 
## data:  DryEye$notreat
## W = 0.98351, p-value = 0.7062

From the output (p-value = 0.7062), the p-value is greater than 0.05, implying that the distribution of the data is not significantly different from a normal distribution. In other words, we can assume normality.

For “herb”

shapiro.test(DryEye$herb)

## 
##  Shapiro-Wilk normality test
## 
## data:  DryEye$herb
## W = 0.97509, p-value = 0.3678

From the output (p-value = 0.3678), the p-value is greater than 0.05, implying that the distribution of the data is not significantly different from a normal distribution. In other words, we can assume normality.
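
The same decision rule can be applied programmatically by pulling the p-value out of the object that `shapiro.test()` returns. The sketch below uses a simulated sample as a stand-in (not the DryEye data), and the 0.05 threshold is the one used in the text.

```r
set.seed(42)                        # arbitrary seed, for reproducibility only
x <- rnorm(50, mean = 2, sd = 0.3)  # simulated stand-in sample (not the DryEye data)

res <- shapiro.test(x)

res$statistic  # the W statistic
res$p.value    # p-value for H0: the sample comes from a normal distribution

alpha <- 0.05
if (res$p.value > alpha) {
  message("No evidence against normality at alpha = ", alpha)
} else {
  message("Normality rejected at alpha = ", alpha)
}
```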

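The two p-values differ by 0.7062 - 0.3678 = 0.3384, but a Shapiro-Wilk p-value is not a measure of how normal a sample is, so this difference has no direct interpretation.

It does not show that the “notreat” data are more normally distributed than the “herb” data; it only confirms that neither test rejects normality.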
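
If the two columns hold measurements on the same subjects (an assumption; the source does not say so explicitly), the normality of the difference can be checked directly on the per-row differences. The sketch below uses a simulated stand-in data frame with the same column names, not the DryEye data.

```r
set.seed(7)                                   # arbitrary seed, for reproducibility only

# Simulated stand-in for DryEye: 50 paired rows (made-up values)
toy <- data.frame(notreat = rnorm(50, mean = 2.0, sd = 0.3))
toy$herb <- toy$notreat - rnorm(50, mean = 0.6, sd = 0.2)

diffs <- toy$notreat - toy$herb               # per-row differences

qqnorm(diffs, pch = 1)                        # Q-Q plot of the differences
qqline(diffs, col = "red", lwd = 2)
res_diff <- shapiro.test(diffs)               # Shapiro-Wilk test on the differences
res_diff
```

With the real data, `DryEye$notreat - DryEye$herb` would take the place of `diffs`.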

Perform the appropriate t-test to determine whether there was a significant change in conjunctival redness after herb treatment

t.test(DryEye$notreat, DryEye$herb)

## 
##  Welch Two Sample t-test
## 
## data:  DryEye$notreat and DryEye$herb
## t = 3.4944, df = 77.329, p-value = 0.0007897
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1235924 0.4509954
## sample estimates:
## mean of x mean of y 
##  1.303158  1.015864
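
The call above uses the default of `t.test()`, which is Welch's two-sample test and treats the two columns as independent groups. If each row is the same subject before and after treatment (an assumption not confirmed by the source), a paired test is the usual choice. The sketch below contrasts the two calls on a simulated stand-in data frame, not the DryEye data.

```r
set.seed(7)                                   # arbitrary seed, for reproducibility only

# Simulated stand-in for DryEye: 50 paired rows (made-up values)
toy <- data.frame(notreat = rnorm(50, mean = 2.0, sd = 0.3))
toy$herb <- toy$notreat - rnorm(50, mean = 0.6, sd = 0.2)

welch  <- t.test(toy$notreat, toy$herb)                 # default: Welch two-sample test
paired <- t.test(toy$notreat, toy$herb, paired = TRUE)  # paired test on row-wise differences

welch$parameter   # Welch-Satterthwaite degrees of freedom
paired$parameter  # n - 1 degrees of freedom
```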

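The t-statistic is 3.4944, which exceeds the two-sided critical value at the 0.05 significance level (about 1.99 for df = 77.329); equivalently, the p-value of 0.0007897 is below 0.05.

Therefore we reject the null hypothesis that the mean of “notreat” equals the mean of “herb”.

Because the t-statistic is positive, the “notreat” mean (1.303158) is greater than the “herb” mean (1.015864).

There is a significant change in conjunctival redness after herb treatment.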
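
The comparison against a critical value can be reproduced from the numbers in the t-test output. The sketch below plugs in the reported t statistic and degrees of freedom and recomputes the two-sided critical value and p-value with `qt()` and `pt()`; the variable names are illustrative only.

```r
t_stat <- 3.4944    # t statistic from the t.test output
df     <- 77.329    # Welch degrees of freedom from the t.test output
alpha  <- 0.05      # significance level used in the text

t_crit <- qt(1 - alpha / 2, df = df)      # two-sided critical value
p_val  <- 2 * pt(-abs(t_stat), df = df)   # two-sided p-value

t_crit
p_val
abs(t_stat) > t_crit   # TRUE means H0 is rejected at the chosen alpha
```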