30 20 15 10 10 60 20 25 20 30 10 5 50 40 20 10 10 0 20 50
res = c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30, 10, 5, 50, 40, 20, 10, 10, 0, 20, 50)
mean(res)
## [1] 22.75
# One-sided (upper-tailed) hypothesis test
# H0: mu <= 20 vs. H1: mu > 20
t.test(res, alternative="greater", mu = 20, conf.level = 0.95)
##
## One Sample t-test
##
## data: res
## t = 0.75633, df = 19, p-value = 0.2294
## alternative hypothesis: true mean is greater than 20
## 95 percent confidence interval:
## 16.4629 Inf
## sample estimates:
## mean of x
## 22.75
Since the p-value (0.2294) is greater than the significance level 0.05, we fail to reject the null hypothesis. Hence, there is not enough evidence to conclude that Maramag residents would donate more than Php 20.00 on average.
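To make the link between the data and the reported test statistic explicit, the one-sample t-test can also be reproduced by hand. The sketch below uses only base R; the variable names (`donations`, `mu0`, `se`, and so on) are illustrative and not part of the original analysis.

```r
# Donation amounts (Php), copied from the data line at the top
donations <- c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30,
               10, 5, 50, 40, 20, 10, 10, 0, 20, 50)

mu0 <- 20                          # hypothesized mean under H0
n   <- length(donations)           # sample size
se  <- sd(donations) / sqrt(n)     # standard error of the mean

# t statistic and upper-tailed p-value (alternative = "greater")
t_stat  <- (mean(donations) - mu0) / se
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval
lower_bound <- mean(donations) - qt(0.95, df = n - 1) * se

round(c(t = t_stat, p = p_value, lower = lower_bound), 4)
```

The three values should agree with the t, p-value, and confidence-interval lower bound printed by `t.test()`.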
library(readxl)
pgviews<- read_excel("C:/Users/leocint/Desktop/Leocint/4th Year College(1st Sem.)/Experimental Design/Activities/4/1/pgviews.xlsx")
View(pgviews)
New<-head(pgviews,20)
New
## # A tibble: 20 x 3
## Subject Site Pages
## <dbl> <chr> <dbl>
## 1 1 B 2
## 2 2 B 6
## 3 3 A 5
## 4 4 B 7
## 5 5 A 3
## 6 6 B 2
## 7 7 B 6
## 8 8 A 1
## 9 9 A 3
## 10 10 A 4
## 11 11 B 6
## 12 12 B 6
## 13 13 B 4
## 14 14 A 5
## 15 15 A 3
## 16 16 A 6
## 17 17 B 6
## 18 18 B 3
## 19 19 A 4
## 20 20 B 7
library(ggpubr)
## Loading required package: ggplot2
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
group_by(New, Site) %>%
summarise(
count = n(),
mean = mean(Pages, na.rm = TRUE),
sd = sd(Pages, na.rm = TRUE)
)
## # A tibble: 2 x 4
## Site count mean sd
## <chr> <int> <dbl> <dbl>
## 1 A 9 3.78 1.48
## 2 B 11 5 1.90
# Plot Pages by Site and color by Site
ggboxplot(New, x = "Site", y = "Pages",
color = "Site", palette = c("#00AFBB", "#E7B800"),
ylab = "Pages", xlab = "Site")
Preliminary tests to check independent t-test assumptions
Assumption 1: Are the two samples independent? Yes, since the samples from A and B are not related.
# Shapiro-Wilk normality test for A's Pages
with(New, shapiro.test(Pages[Site == "A"])) # p = 0.6874
##
## Shapiro-Wilk normality test
##
## data: Pages[Site == "A"]
## W = 0.94976, p-value = 0.6874
# Shapiro-Wilk normality test for B's Pages
with(New, shapiro.test(Pages[Site == "B"])) # p = 0.0156
##
## Shapiro-Wilk normality test
##
## data: Pages[Site == "B"]
## W = 0.81668, p-value = 0.0156
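The Shapiro-Wilk output gives p = 0.6874 for Site A and p = 0.0156 for Site B. Since the Site B p-value is below 0.05, the normality assumption is questionable for that group, so a rank-based comparison may be worth running alongside the t-test as a robustness check. The sketch below is an assumption-laden illustration, not part of the original activity: it rebuilds the 20 rows from the printed tibble into a hypothetical data frame `pages_df` and applies the Wilcoxon rank-sum test, with `exact = FALSE` because the data contain ties.

```r
# Rebuild the 20 printed rows (Site and Pages columns) by hand
pages_df <- data.frame(
  Site  = c("B", "B", "A", "B", "A", "B", "B", "A", "A", "A",
            "B", "B", "B", "A", "A", "A", "B", "B", "A", "B"),
  Pages = c(2, 6, 5, 7, 3, 2, 6, 1, 3, 4,
            6, 6, 4, 5, 3, 6, 6, 3, 4, 7)
)

# Wilcoxon rank-sum test; the normal approximation is used because of ties
rank_test <- wilcox.test(Pages ~ Site, data = pages_df, exact = FALSE)
rank_test
```

If this test and the t-test lead to the same decision at alpha = 0.05, the conclusion does not hinge on the normality assumption.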
res.ftest <- var.test(Pages ~ Site, data = New)
res.ftest
##
## F test to compare two variances
##
## data: Pages by Site
## F = 0.60957, num df = 8, denom df = 10, p-value = 0.4948
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.1581284 2.6181715
## sample estimates:
## ratio of variances
## 0.6095679
The p-value of the F-test is p = 0.4948, which is greater than the significance level alpha = 0.05. In conclusion, there is no significant difference between the variances of the two sets of data. Therefore, we can use the classic t-test, which assumes equality of the two variances.
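The F statistic reported by `var.test()` is simply the ratio of the two sample variances. As a minimal sketch, the vectors `site_a` and `site_b` below are hypothetical names holding the Pages values copied from the printed 20-row tibble, and the two-sided p-value is obtained by doubling the smaller tail probability.

```r
# Pages values per site, copied from the printed tibble
site_a <- c(5, 3, 1, 3, 4, 5, 3, 6, 4)
site_b <- c(2, 6, 7, 2, 6, 6, 6, 4, 6, 3, 7)

# Ratio of sample variances and its degrees of freedom
f_stat <- var(site_a) / var(site_b)
df1 <- length(site_a) - 1
df2 <- length(site_b) - 1

# Two-sided p-value: twice the smaller of the two tail probabilities
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

round(c(F = f_stat, df1 = df1, df2 = df2, p = p_value), 4)
```

These values should match the F, num df, denom df, and p-value lines of the `var.test()` output.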
# Compute t-test
res <- t.test(Pages ~ Site, data = New, var.equal = TRUE, conf.level = 0.95)
res
##
## Two Sample t-test
##
## data: Pages by Site
## t = -1.5765, df = 18, p-value = 0.1323
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.8510432 0.4065987
## sample estimates:
## mean in group A mean in group B
## 3.777778 5.000000
Interpretation of the result
The p-value of the test is 0.1323, which is greater than the significance level alpha = 0.05. Hence, we do not have enough evidence to conclude that A's average Pages is significantly different from B's average Pages.
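For readers who want to see where the t statistic comes from, the equal-variance (pooled) two-sample t-test can be reproduced with base R arithmetic. This is a sketch under the assumption that the Pages values are those shown in the printed 20-row tibble; `site_a`, `site_b`, and the other names are illustrative.

```r
# Pages values per site, copied from the printed tibble
site_a <- c(5, 3, 1, 3, 4, 5, 3, 6, 4)
site_b <- c(2, 6, 7, 2, 6, 6, 6, 4, 6, 3, 7)
n_a <- length(site_a)
n_b <- length(site_b)

# Pooled variance: weighted average of the two sample variances
pooled_var <- ((n_a - 1) * var(site_a) + (n_b - 1) * var(site_b)) /
  (n_a + n_b - 2)

# Standard error of the difference in means, then t and two-sided p-value
se_diff <- sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat  <- (mean(site_a) - mean(site_b)) / se_diff
p_value <- 2 * pt(-abs(t_stat), df = n_a + n_b - 2)

round(c(t = t_stat, df = n_a + n_b - 2, p = p_value), 4)
```

The result should agree with the t, df, and p-value reported by `t.test(..., var.equal = TRUE)`.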
Group 1: 12 10 11 14 15 9 11 12 13 10 15 10
Group 2: 12 13 17 11 16 13 12 14 17 12 16 18
Group 3: 14 21 17 16 17 22 16 22 19 20 18
3.2. Referring to the question above, use R for the analysis in question 3.1 at the 0.05 level of significance.
3.3. Provide the ANOVA table.
library(readxl)
Data <- read_excel("C:/Users/leocint/Desktop/Leocint/4th Year College(1st Sem.)/Experimental Design/Activities/5/Data.xlsx")
View(Data)
group_by(Data, Group) %>%
summarise(
count = n(),
mean = mean(Score, na.rm = TRUE),
sd = sd(Score, na.rm = TRUE)
)
## # A tibble: 3 x 4
## Group count mean sd
## <chr> <int> <dbl> <dbl>
## 1 Group 1 12 11.8 2.04
## 2 Group 2 12 14.2 2.42
## 3 Group 3 12 18.2 2.62
ggboxplot(Data, x = "Group", y = "Score",
color = "Group", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
order = c("Group 1", "Group 2", "Group 3"),
ylab = "Score", xlab = "Groups")
# Compute the analysis of variance
res.aov <- aov(Score ~ Group, data = Data)
# Summary of the analysis
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## Group 2 245.2 122.58 21.8 9.25e-07 ***
## Residuals 33 185.6 5.62
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The output includes the columns F value and Pr(>F); the latter is the p-value of the test.
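The columns of the ANOVA table are related by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below recomputes F and its p-value from the rounded Sum Sq and Df values printed above, so small rounding differences from the `summary()` output are expected. The variable names are illustrative.

```r
# Values copied from the printed ANOVA table (rounded)
ss_group <- 245.2   # between-group sum of squares
ss_resid <- 185.6   # residual (within-group) sum of squares
df_group <- 2
df_resid <- 33

# Mean squares and F ratio
ms_group <- ss_group / df_group
ms_resid <- ss_resid / df_resid
f_value  <- ms_group / ms_resid

# Upper-tail probability of the F distribution gives Pr(>F)
p_value <- pf(f_value, df_group, df_resid, lower.tail = FALSE)

c(MS_group = ms_group, MS_resid = ms_resid, F = f_value, p = p_value)
```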
Interpret the result of one-way ANOVA tests
As the p-value (9.25e-07) is less than the significance level 0.05, we can conclude that there are significant differences among the mean PDSC scores of Groups 1, 2 and 3, as flagged by "***" in the model summary.