library(readxl)
## Warning: package 'readxl' was built under R version 4.1.3
data2 <- read_excel("D:/COLLEGE 3RD YEAR/2nd SEMESTER/STAT 50 STATISTICAL SOFTWARE/MIDTERM/data2.xlsx")
## New names:
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * `` -> ...7
## * ...
data2
data2<-data2[, 1:2]
data2
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.3
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.8
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.1.3
## Warning: package 'tidyr' was built under R version 4.1.3
## Warning: package 'readr' was built under R version 4.1.3
## Warning: package 'forcats' was built under R version 4.1.3
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.1.3
library(rstatix)
## Warning: package 'rstatix' was built under R version 4.1.3
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
library(dplyr)
data2 <- data2 %>%
  reorder_levels(Treatment, order = c("T0+", "T0-", "T1", "T2", "T3"))
library(dplyr)
library(rstatix)
data2 %>%
  group_by(Treatment) %>%
  get_summary_stats(DPPH, type = "common")
The mean for treatment T0+ is 0.054.
The mean for treatment T0- is 0.045.
The mean for treatment T1 is 0.153.
The mean for treatment T2 is 0.158.
The mean for treatment T3 is 0.150.
library(ggplot2)
ggplot(data2) +
  aes(x = Treatment, y = DPPH, color = Treatment) +
  geom_jitter() +
  theme(legend.position = "none")
The plot shows the individual DPPH observations for each treatment. Treatment is the factor, with five levels: T0+, T0-, T1, T2, and T3.
library(tidyverse)
library(ggpubr)
library(rstatix)
ggboxplot(data2, x = "Treatment", y = "DPPH", fill="Treatment")
Based on the box plot, it is evident that differences exist among the five treatments (T0+, T0-, T1, T2, T3).
res_aov <- aov(DPPH ~ Treatment, data = data2)
par(mfrow = c(1, 2)) # combine plots
# histogram
hist(res_aov$residuals)
# QQ-plot
library(car)
## Warning: package 'car' was built under R version 4.1.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.1.3
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## The following object is masked from 'package:purrr':
##
## some
qqPlot(res_aov$residuals,
id = FALSE # id = FALSE to remove point identification
)
As seen above, the histogram roughly forms a bell curve, which suggests that the residuals follow a normal distribution. Moreover, the points in the QQ-plot roughly follow the straight line and most of them fall within the confidence bands, which also suggests that the residuals are approximately normally distributed.
shapiro.test(res_aov$residuals)
##
## Shapiro-Wilk normality test
##
## data: res_aov$residuals
## W = 0.93435, p-value = 0.003052
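The p-value of the Shapiro-Wilk test on the residuals is smaller than the usual significance level of 0.05. Thus, we reject the hypothesis that the residuals follow a normal distribution (p-value = 0.003052). Because the formal test rejects normality despite the visual impression from the plots, the Kruskal-Wallis test is used for the comparison of treatments.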
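The decision described here can be sketched as a small rule: test the ANOVA residuals for normality and fall back to a rank-based test when normality is rejected. The sketch below is illustrative only. It uses the built-in PlantGrowth dataset rather than data2, and the 0.05 threshold and the name chosen_test are assumptions made for the example.

```r
# Illustrative sketch on a built-in dataset (PlantGrowth), not on data2.
fit <- aov(weight ~ group, data = PlantGrowth)

# Shapiro-Wilk test on the ANOVA residuals
sw <- shapiro.test(residuals(fit))

alpha <- 0.05  # assumed significance level

# Hypothetical decision rule: use Kruskal-Wallis when normality is rejected
chosen_test <- if (sw$p.value < alpha) "Kruskal-Wallis" else "one-way ANOVA"
chosen_test
```

The rule is deliberately simple; in practice the plots and the sample size should also inform the choice.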
library(car)
leveneTest(DPPH ~ Treatment,
data = data2)
The p-value is larger than the significance level of 0.05, so we do not reject the hypothesis that the variances are equal across treatments.
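To show what Levene's test does, the following is a minimal base R sketch of the median-centered version (the default in car::leveneTest): a one-way ANOVA on the absolute deviations of each observation from its group median. It uses the built-in PlantGrowth dataset rather than data2, and the variable names are chosen for the example.

```r
# Illustrative sketch on a built-in dataset (PlantGrowth), not on data2.
y <- PlantGrowth$weight
g <- PlantGrowth$group

# Absolute deviation of each observation from its group median
abs_dev <- abs(y - ave(y, g, FUN = median))

# One-way ANOVA on the deviations; a small p-value suggests unequal variances
levene_tab <- anova(lm(abs_dev ~ g))
levene_p <- levene_tab[["Pr(>F)"]][1]
levene_p
```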
res.kruskal <- data2 %>% kruskal_test(DPPH ~ Treatment)
res.kruskal
Based on the p-value, a significant difference was observed among the five treatments.
Effect size values are commonly interpreted as 0.01 to < 0.06 (small effect), 0.06 to < 0.14 (moderate effect), and >= 0.14 (large effect).
data2 %>% kruskal_effsize(DPPH ~ Treatment)
The effect size is large. This means that the differences among the treatments are substantial and can be detected even with a small sample size.
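For reference, the eta-squared based on the Kruskal-Wallis H statistic is computed as (H - k + 1) / (n - k), where k is the number of groups and n the total number of observations. The sketch below reproduces that calculation in base R on the built-in PlantGrowth dataset (not on data2) and labels the result with the thresholds quoted above. The helper label_effect is hypothetical and written only for this example.

```r
# Illustrative sketch on a built-in dataset (PlantGrowth), not on data2.
kw <- kruskal.test(weight ~ group, data = PlantGrowth)

H <- unname(kw$statistic)        # Kruskal-Wallis H statistic
k <- nlevels(PlantGrowth$group)  # number of groups
n <- nrow(PlantGrowth)           # total number of observations

# Eta-squared based on H
eta2_h <- (H - k + 1) / (n - k)

# Hypothetical helper: label the magnitude using the quoted thresholds.
# Values below 0.06 (including those below 0.01) are labeled "small".
label_effect <- function(eta2) {
  cut(eta2,
      breaks = c(-Inf, 0.06, 0.14, Inf),
      labels = c("small", "moderate", "large"),
      right = FALSE)
}

label_effect(eta2_h)
```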
res1 <- data2 %>%
  dunn_test(DPPH ~ Treatment, p.adjust.method = "bonferroni")
res1
Based on the pairwise comparisons, significant differences were observed among the treatments T0+, T0-, T1, T2, and T3.
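The Bonferroni adjustment used in the pairwise comparisons multiplies each unadjusted p-value by the number of comparisons and caps the result at 1. With five treatments there are choose(5, 2) = 10 pairs. The sketch below shows the rule with base R; the three p-values are hypothetical and are not taken from the analysis.

```r
# Hypothetical unadjusted p-values for three pairwise comparisons
p_raw <- c(0.001, 0.020, 0.400)
m <- length(p_raw)

# Bonferroni: multiply each p-value by the number of comparisons, cap at 1
p_manual <- pmin(1, p_raw * m)

# Base R's p.adjust() applies the same rule
p_adj <- p.adjust(p_raw, method = "bonferroni")

all.equal(p_manual, p_adj)
```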
res1 <- res1 %>% add_xy_position(x = "Treatment")
ggboxplot(data2, x = "Treatment", y = "DPPH") +
  stat_pvalue_manual(res1, hide.ns = TRUE) +
  labs(
    subtitle = get_test_label(res.kruskal, detailed = TRUE),
    caption = get_pwc_label(res1)
  )