library(readxl)
Data2 <- read_excel("Data2.xlsx")
View(Data2)

ICT training data

before <- c(12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2)
after <- c(13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6)

data <- data.frame(subject = rep(c(1:10), 2), 
                   time = rep(c("before", "after"), each = 10),
                   score = c(before, after))
print(data)

##    subject   time score
## 1        1 before  12.2
## 2        2 before  14.6
## 3        3 before  13.4
## 4        4 before  11.2
## 5        5 before  12.7
## 6        6 before  10.4
## 7        7 before  15.8
## 8        8 before  13.9
## 9        9 before   9.5
## 10      10 before  14.2
## 11       1  after  13.5
## 12       2  after  15.2
## 13       3  after  13.6
## 14       4  after  12.8
## 15       5  after  13.7
## 16       6  after  11.3
## 17       7  after  16.5
## 18       8  after  13.4
## 19       9  after   8.7
## 20      10  after  14.6

str(data)

## 'data.frame':    20 obs. of  3 variables:
##  $ subject: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ time   : chr  "before" "before" "before" "before" ...
##  $ score  : num  12.2 14.6 13.4 11.2 12.7 10.4 15.8 13.9 9.5 14.2 ...
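
Because `str()` reports `time` as a character column, R sorts its values alphabetically whenever it forms groups, which places "after" ahead of "before" in plots and in group differences. The following is a minimal sketch of making the order explicit with a factor. The name `data_ordered` is introduced here for illustration only and is not used elsewhere in this analysis.

```r
# Same scores as in the data set above
before <- c(12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2)
after <- c(13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6)

# Hypothetical copy of the data with `time` stored as an ordered-by-design factor
data_ordered <- data.frame(subject = rep(1:10, 2),
                           time = factor(rep(c("before", "after"), each = 10),
                                         levels = c("before", "after")),
                           score = c(before, after))

# Check the level order: "before" now comes first
levels(data_ordered$time)
```

With this ordering, grouped summaries and boxplots would list "before" first, which matches the chronological reading of the data.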

attach(data)

Descriptive analysis

summary(data)

##     subject         time               score      
##  Min.   : 1.0   Length:20          Min.   : 8.70  
##  1st Qu.: 3.0   Class :character   1st Qu.:11.97  
##  Median : 5.5   Mode  :character   Median :13.45  
##  Mean   : 5.5                      Mean   :13.06  
##  3rd Qu.: 8.0                      3rd Qu.:14.30  
##  Max.   :10.0                      Max.   :16.50

by(data = data, 
   INDICES = data[,"time"], 
   FUN = summary)

## data[, "time"]: after
##     subject          time               score      
##  Min.   : 1.00   Length:10          Min.   : 8.70  
##  1st Qu.: 3.25   Class :character   1st Qu.:12.95  
##  Median : 5.50   Mode  :character   Median :13.55  
##  Mean   : 5.50                      Mean   :13.33  
##  3rd Qu.: 7.75                      3rd Qu.:14.38  
##  Max.   :10.00                      Max.   :16.50  
## ------------------------------------------------------------ 
## data[, "time"]: before
##     subject          time               score      
##  Min.   : 1.00   Length:10          Min.   : 9.50  
##  1st Qu.: 3.25   Class :character   1st Qu.:11.45  
##  Median : 5.50   Mode  :character   Median :13.05  
##  Mean   : 5.50                      Mean   :12.79  
##  3rd Qu.: 7.75                      3rd Qu.:14.12  
##  Max.   :10.00                      Max.   :15.80

Summary statistics of the difference in scores before and after ICT training

# Named score_diff to avoid masking base R's diff() function
score_diff <- after - before
summary(score_diff)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -0.800   0.250   0.650   0.540   0.975   1.600

Association or correlation test

library(stats)
cor.test(x = before, y = after, 
         method = c("pearson"), 
         conf.level = 0.95)

## 
##  Pearson's product-moment correlation
## 
## data:  before and after
## t = 7.5468, df = 8, p-value = 6.628e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7474506 0.9851801
## sample estimates:
##       cor 
## 0.9363955
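
As a cross-check, the t statistic reported above can be recovered from the correlation coefficient through the relation t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below assumes the same `before` and `after` vectors; `r`, `n`, `t_stat`, and `p_value` are illustrative names.

```r
before <- c(12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2)
after <- c(13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6)

r <- cor(before, after)            # Pearson correlation (the default method)
n <- length(before)                # number of paired observations

# Test statistic for H0: true correlation equals 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

round(c(r = r, t = t_stat, df = n - 2), 4)
p_value
```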

Interpretation

The results show a strong, statistically significant positive correlation between the scores before and after training. The correlation coefficient is 0.94, which is very close to one, so participants who scored high before the ICT training also tended to score high afterwards.

Visualizing samples

par(pty = "s")
boxplot(score ~ time)

boxplot(score ~ time, 
        col = c("#003C67FF", "#EFC000FF"),
        main = "ICT training score improves knowledge",
        xlab = "Time", ylab = "Score")

Homogeneity in variances

bartlett.test(score ~ time)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  score by time
## Bartlett's K-squared = 0.050609, df = 1, p-value = 0.822

The p-value is 0.82, which is higher than 0.05. This indicates that there is no significant difference in variances between the two time points, so the variances can be treated as homogeneous.
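
To see what the test is comparing, the two sample variances can be inspected directly. This is only a sketch using the score vectors from above; `group_var` and `var_ratio` are illustrative names, and a ratio near 1 is consistent with the Bartlett result.

```r
before <- c(12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2)
after <- c(13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6)

# Sample variance of the scores at each time point
group_var <- c(before = var(before), after = var(after))
group_var

# Ratio of the two variances; values close to 1 suggest similar spread
var_ratio <- group_var[["after"]] / group_var[["before"]]
var_ratio
```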

Paired sample t-test

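# The two vectors are passed directly: newer R versions (4.4.0 onward) reject
# paired = TRUE in the formula interface, and var.equal has no effect on a
# paired test, so it is dropped.
t.test(x = after, y = before,      # differences are computed as after - before
       alternative = "greater",    # H1: mean difference is greater than 0
       mu = 0, 
       paired = TRUE,
       conf.level = 0.95)

## 
##  Paired t-test
## 
## data:  after and before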
## t = 2.272, df = 9, p-value = 0.0246
## alternative hypothesis: true mean difference is greater than 0
## 95 percent confidence interval:
##  0.1043169       Inf
## sample estimates:
## mean difference 
##            0.54

Interpretation

Statistical significance is determined by looking at the p-value. The p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. The lower the p-value, the less likely such a result would be if the null hypothesis were true. Thus, a low p-value indicates weaker support for the null hypothesis.

The results show a p-value of 0.0246, which is lower than 0.05. Based on this result, we reject the null hypothesis of no difference in favour of the one-sided alternative that the mean difference is greater than zero. Scores after the ICT training were higher by 0.54 points on average, which means the training significantly improved the participants' knowledge.
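
The reported statistic, p-value, and lower confidence bound can be reproduced by hand from the paired differences, which makes clear what the paired test computes. The following is a sketch under the same one-sided alternative; `d`, `se`, `t_stat`, `p_value`, and `ci_lower` are illustrative names.

```r
before <- c(12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2)
after <- c(13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6)

d <- after - before                # paired differences, one per subject
n <- length(d)
se <- sd(d) / sqrt(n)              # standard error of the mean difference

t_stat <- mean(d) / se             # test statistic for H0: mean difference = 0

# One-sided p-value for H1: mean difference > 0, with n - 1 degrees of freedom
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval
ci_lower <- mean(d) - qt(0.95, df = n - 1) * se

round(c(mean_diff = mean(d), t = t_stat, p = p_value, ci_lower = ci_lower), 4)
```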

Paired sample t-test using R

Marvanessa Dinorog

2022-09-28