A COVID frontliner wants to estimate the average amount that a resident of Maramag would donate to COVID-affected families in Maramag. Twenty residents were randomly selected from the Municipality of Maramag. The 20 randomly selected residents were contacted by telephone and asked how much they would be willing to donate. Their responses are given below.
y <- c(30, 20, 15, 8, 10, 40, 20, 25, 20, 28, 20, 25, 50, 40, 20, 10, 25, 25, 20, 15)
y
[1] 30 20 15 8 10 40 20 25 20 28 20 25 50 40 20 10 25 25 20 15
mean(y)
[1] 23.3
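Null Hypothesis: mean mu = 20
Alternative Hypothesis: mean mu is not equal to 20 (two-sided). The one-sided alternatives, mu < 20 and mu > 20, are tested in turn afterwards.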
t.test(y, alternative = c("two.sided"), mu = 20, conf.level = 0.95)
One Sample t-test
data: y
t = 1.3905, df = 19, p-value = 0.1804
alternative hypothesis: true mean is not equal to 20
95 percent confidence interval:
18.33282 28.26718
sample estimates:
mean of x
23.3
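Since the p-value (0.1804) is greater than the significance level 0.05, we fail to reject the null hypothesis that the mean donation is equal to 20. Consistent with this, the 95 percent confidence interval (18.33, 28.27) contains 20.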
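The t statistic and p-value reported by t.test() can be reproduced by hand from the sample mean, standard deviation, and sample size. The following is a minimal sketch of that calculation; the names `mu0`, `se`, `t_stat`, and `p_two` are illustrative and are not part of the original analysis.

```r
# Donation amounts from the 20 sampled residents (same data as above)
y <- c(30, 20, 15, 8, 10, 40, 20, 25, 20, 28, 20, 25, 50, 40, 20, 10, 25, 25, 20, 15)

mu0 <- 20                        # hypothesized mean under H0 (illustrative name)
n   <- length(y)
se  <- sd(y) / sqrt(n)           # standard error of the sample mean
t_stat <- (mean(y) - mu0) / se   # one-sample t statistic

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_two <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

round(c(t = t_stat, df = n - 1, p.value = p_two), 4)
```

The rounded values should agree with the t, df, and p-value lines of the two-sided t.test() output.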
t.test(y, alternative = c("less"), mu = 20, conf.level = 0.95)
One Sample t-test
data: y
t = 1.3905, df = 19, p-value = 0.9098
alternative hypothesis: true mean is less than 20
95 percent confidence interval:
-Inf 27.40359
sample estimates:
mean of x
23.3
Since the p-value (0.9098) is greater than the significance level 0.05, we fail to reject the null hypothesis: there is insufficient evidence that Maramag residents would donate less than 20 on average.
t.test(y, alternative = c("greater"), mu = 20, conf.level = 0.95)
One Sample t-test
data: y
t = 1.3905, df = 19, p-value = 0.09022
alternative hypothesis: true mean is greater than 20
95 percent confidence interval:
19.19641 Inf
sample estimates:
mean of x
23.3
Since the p-value (0.09022) is greater than the significance level 0.05, we fail to reject the null hypothesis: there is insufficient evidence that Maramag residents would donate more than 20 on average.
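The three p-values are tied together by the symmetry of the t distribution: the two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of them. A short sketch that checks this on the same data is given here; the variable names `p_two`, `p_less`, and `p_greater` are illustrative.

```r
# Donation amounts from the 20 sampled residents (same data as above)
y <- c(30, 20, 15, 8, 10, 40, 20, 25, 20, 28, 20, 25, 50, 40, 20, 10, 25, 25, 20, 15)

# Extract only the p-value from each of the three tests against mu = 20
p_two     <- t.test(y, mu = 20, alternative = "two.sided")$p.value
p_less    <- t.test(y, mu = 20, alternative = "less")$p.value
p_greater <- t.test(y, mu = 20, alternative = "greater")$p.value

# The two one-sided p-values are complementary
all.equal(p_less + p_greater, 1)

# The two-sided p-value doubles the smaller one-sided p-value
all.equal(p_two, 2 * min(p_less, p_greater))
```

Because the sample mean lies above 20, the "greater" alternative gives the smaller one-sided p-value.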
Using the pgviews data, answer the following:
library(readxl)
pgviews <- read_excel("C:/Users/Winelyn/Desktop/pgviews.xlsx")
View(pgviews)
library(rmarkdown)
paged_table(pgviews)
Null Hypothesis: The data is normally distributed.
Alternative Hypothesis: The data is not normally distributed.
wine <- head(pgviews, 20)
paged_table(wine)
shapiro.test(wine$Pages)
Shapiro-Wilk normality test
data: wine$Pages
W = 0.92449, p-value = 0.1209
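Since the p-value (0.1209) is greater than 0.05, we fail to reject the null hypothesis. The data show no significant departure from normality, so we proceed under the normality assumption.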
str(wine)
tibble [20 × 3] (S3: tbl_df/tbl/data.frame)
$ Subject: num [1:20] 1 2 3 4 5 6 7 8 9 10 ...
$ Site : chr [1:20] "B" "B" "A" "B" ...
$ Pages : num [1:20] 2 6 5 7 3 2 6 1 3 4 ...
var.test(Pages ~ Site, wine, alternative = "two.sided")
F test to compare two variances
data: Pages by Site
F = 0.60957, num df = 8, denom df = 10, p-value = 0.4948
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.1581284 2.6181715
sample estimates:
ratio of variances
0.6095679
The p-value (0.4948) is greater than 0.05, so we fail to reject the null hypothesis that the ratio of variances is equal to 1. There is no significant difference between the two variances, so we treat them as equal.
t.test(Pages ~ Site, wine, var.equal = TRUE)
Two Sample t-test
data: Pages by Site
t = -1.5765, df = 18, p-value = 0.1323
alternative hypothesis: true difference in means between group A and group B is not equal to 0
95 percent confidence interval:
-2.8510432 0.4065987
sample estimates:
mean in group A mean in group B
3.777778 5.000000
Since the p-value (0.1323) is greater than the significance level 0.05, we fail to reject the null hypothesis: the mean of Pages does not differ significantly between Site A and Site B.
Two medications for the treatment of panic disorder, Medication 1 and Medication 2, were compared with a placebo. A random sample of 36 patients was obtained from a listing of about 4,000 panic disorder patients who volunteered to participate in clinical trials. The 36 patients were randomly divided into three groups of equal size. The first group received Medication 1 for 10 weeks, the second group received Medication 2 for 10 weeks, and the third group received a placebo pill for 10 weeks. On week 11, all 36 patients were given the 7-item Panic Disorder Severity Scale (PDSC), which is scored on a 0 to 28 scale (lower scores are better). The PDSC scores are given below.
Group 1: 12 10 11 14 15 9 11 12 13 10 15 10
Group 2: 12 13 17 11 16 13 12 14 17 12 16 18
Group 3: 14 21 17 16 17 22 16 22 19 20 18 16
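The scores listed above can also be entered directly in R, which avoids depending on an Excel file at a machine-specific path. The sketch below builds a long-format data frame with the same two columns, Treatment and Observation, that the analysis uses; the object names `group1`, `group2`, `group3`, and `pdsc` are hypothetical.

```r
# PDSC scores by group, copied from the listing above
group1 <- c(12, 10, 11, 14, 15, 9, 11, 12, 13, 10, 15, 10)
group2 <- c(12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18)
group3 <- c(14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)

# Long format: one row per patient, one column for the group label
# and one for the score
pdsc <- data.frame(
  Treatment   = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(group1, group2, group3)
)

str(pdsc)
```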
library(readxl)
winedata <- read_excel("C:/Users/Winelyn/Desktop/winedata.xlsx")
View(winedata)
str(winedata)
tibble [36 × 2] (S3: tbl_df/tbl/data.frame)
$ Treatment : chr [1:36] "Group 1" "Group 1" "Group 1" "Group 1" ...
$ Observation: num [1:36] 12 10 11 14 15 9 11 12 13 10 ...
paged_table(winedata)
shapiro.test(winedata$Observation)
Shapiro-Wilk normality test
data: winedata$Observation
W = 0.95851, p-value = 0.1933
Since the p-value (0.1933) is greater than the significance level 0.05, the distribution of the data is not significantly different from a normal distribution, so we can assume normality.
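One-way ANOVA assumes that the residuals are normally distributed, not the pooled response, whose spread also reflects any differences between group means. A complementary check, which is not part of the original analysis, is to apply the Shapiro-Wilk test to the residuals of the fitted model. The sketch below re-enters the scores from the problem statement; the names `scores`, `fit`, and `res_norm` are hypothetical.

```r
# PDSC scores from the problem statement, in long format
scores <- data.frame(
  Treatment = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15, 9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

# Fit the one-way model, then test its residuals for normality
fit <- aov(Observation ~ Treatment, data = scores)
res_norm <- shapiro.test(residuals(fit))
res_norm
```

A p-value above 0.05 here would support the normality assumption for the ANOVA in the same way as the test on the raw scores.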
res <- bartlett.test(Observation ~ Treatment, data = winedata)
res
Bartlett test of homogeneity of variances
data: Observation by Treatment
Bartlett's K-squared = 0.67864, df = 2, p-value = 0.7123
Since the p-value (0.7123) is greater than the significance level 0.05, there is no significant difference between the group variances, so the homogeneity-of-variance assumption is reasonable.
wine1 <- head(winedata, 36)
paged_table(wine1)
library(dplyr)
Warning: package 'dplyr' was built under R version 4.2.1
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library("ggpubr")
Warning: package 'ggpubr' was built under R version 4.2.1
Loading required package: ggplot2
Warning: package 'ggplot2' was built under R version 4.2.1
library("gplots")
Warning: package 'gplots' was built under R version 4.2.1
Attaching package: 'gplots'
The following object is masked from 'package:stats':
lowess
set.seed(123)
win <- dplyr::sample_n(winedata, 10)
paged_table(win)
group_by(wine1, Treatment) %>%
  summarise(
    count = n(),
    mean = mean(Observation, na.rm = TRUE),
    sd = sd(Observation, na.rm = TRUE)
  )
# A tibble: 3 × 4
  Treatment count  mean    sd
  <chr>     <int> <dbl> <dbl>
1 Group 1      12  11.8  2.04
2 Group 2      12  14.2  2.42
3 Group 3      12  18.2  2.62
wine1$Treatment <- ordered(wine1$Treatment,
                           levels = c("Group 1", "Group 2", "Group 3"))
levels(wine1$Treatment)
[1] "Group 1" "Group 2" "Group 3"
library("ggpubr")
ggboxplot(wine1, x = "Treatment", y = "Observation",
          color = "Treatment", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          order = c("Group 1", "Group 2", "Group 3"),
          ylab = "Observation", xlab = "Treatment")
ggline(wine1, x = "Treatment", y = "Observation",
       add = c("mean_se", "jitter"),
       order = c("Group 1", "Group 2", "Group 3"),
       ylab = "Observation", xlab = "Treatment")
plotmeans(Observation ~ Treatment, data = wine1,
          xlab = "Treatment", ylab = "Observation",
          main = "Mean Plot with 95% CI")
res.aov <- aov(Observation ~ Treatment, data = winedata)
summary(res.aov)
            Df Sum Sq Mean Sq F value   Pr(>F)
Treatment    2  245.2  122.58    21.8 9.25e-07 ***
Residuals   33  185.6    5.62
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since the p-value (9.25e-07) is less than the significance level 0.05, there are significant differences among the treatment group means, as flagged by "***" in the model summary.
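The F value in the ANOVA table is the ratio of the between-group mean square to the within-group mean square. As a rough sketch, it can be rebuilt from the raw scores as follows; the names `scores`, `ss_between`, `ss_within`, `f_stat`, and `p_val` are illustrative and not part of the original analysis.

```r
# PDSC scores from the problem statement, in long format
scores <- data.frame(
  Treatment = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15, 9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

n_i    <- tapply(scores$Observation, scores$Treatment, length)  # group sizes
mean_i <- tapply(scores$Observation, scores$Treatment, mean)    # group means
grand  <- mean(scores$Observation)                              # grand mean

# Between-group and within-group sums of squares
ss_between <- sum(n_i * (mean_i - grand)^2)
ss_within  <- sum((scores$Observation -
                   ave(scores$Observation, scores$Treatment))^2)

df_between <- length(n_i) - 1
df_within  <- nrow(scores) - length(n_i)

# F statistic and its upper-tail p-value
f_stat <- (ss_between / df_between) / (ss_within / df_within)
p_val  <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

round(c(SS_between = ss_between, SS_within = ss_within, F = f_stat), 2)
p_val
```

The sums of squares, degrees of freedom, and F value should match the Treatment and Residuals rows of the summary(res.aov) output.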
TukeyHSD(res.aov)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Observation ~ Treatment, data = winedata)
$Treatment
                    diff        lwr      upr     p adj
Group 2-Group 1 2.416667 0.04105701 4.792276 0.0454940
Group 3-Group 1 6.333333 3.95772367 8.708943 0.0000006
Group 3-Group 2 3.916667 1.54105701 6.292276 0.0008425
All three pairwise differences are significant at the 0.05 level. The difference between Group 2 and Group 1 is significant, with an adjusted p-value of 0.0454940. The difference between Group 3 and Group 1 is significant, with an adjusted p-value of 0.0000006. Lastly, the difference between Group 3 and Group 2 is significant, with an adjusted p-value of 0.0008425.
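The adjusted p-values and interval limits in the Tukey table come from the studentized range distribution. For equal group sizes, the sketch below reproduces the Group 2 versus Group 1 row using ptukey() and qtukey() from base R; the names `scores`, `fit`, `mse`, `se_q`, `q_stat`, `p_adj`, and `half_width` are hypothetical, and the group size of 12 is taken from the problem statement.

```r
# PDSC scores from the problem statement, in long format
scores <- data.frame(
  Treatment = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15, 9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

fit <- aov(Observation ~ Treatment, data = scores)
mse <- deviance(fit) / df.residual(fit)   # residual mean square
n_per_group <- 12                         # equal group sizes
se_q <- sqrt(mse / n_per_group)           # scale of the studentized range

means   <- tapply(scores$Observation, scores$Treatment, mean)
diff_21 <- unname(means["Group 2"] - means["Group 1"])

# Studentized range statistic and its adjusted p-value (3 group means)
q_stat <- abs(diff_21) / se_q
p_adj  <- ptukey(q_stat, nmeans = 3, df = df.residual(fit), lower.tail = FALSE)

# Half-width of the 95% family-wise confidence interval
half_width <- qtukey(0.95, nmeans = 3, df = df.residual(fit)) * se_q

c(diff = diff_21, lwr = diff_21 - half_width,
  upr = diff_21 + half_width, p.adj = p_adj)
```

The same half-width applies to all three comparisons because every group has 12 patients.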