Question 1:

A COVID frontliner wants to estimate the average amount that a resident of Maramag would donate to COVID-affected families in Maramag. Twenty residents were randomly selected from the Municipality of Maramag. The 20 randomly selected residents were contacted by telephone and asked how much they would be willing to donate. Their responses are given below.

30 20 15 8 10 40 20 25 20 28 20 25 50 40 20 10 25 25 20 15

1.1 Test at the 0.05 level of significance using R for a two-tailed test.
1.2 Test at the 0.05 level of significance using R for an appropriate one-tailed test.

Solution:

x <- c(30, 20, 15, 8, 10, 40, 20, 25, 20, 28, 20, 25, 50, 40, 20, 10, 25, 25, 20, 15)
x

 [1] 30 20 15  8 10 40 20 25 20 28 20 25 50 40 20 10 25 25 20 15

mean(x)

[1] 23.3

1.1

t.test(x, alternative = c("two.sided"), mu = 20, conf.level = 0.95)


    One Sample t-test

data:  x
t = 1.3905, df = 19, p-value = 0.1804
alternative hypothesis: true mean is not equal to 20
95 percent confidence interval:
 18.33282 28.26718
sample estimates:
mean of x 
     23.3

Interpretation:

The test uses a hypothesized mean of 20 (mu = 20). The p-value is 0.1804, which is greater than the 0.05 level of significance, so we fail to reject the null hypothesis that the mean donation is equal to 20. There is not enough evidence that the mean differs from 20; consistent with this, the 95 percent confidence interval (18.33, 28.27) contains 20.
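As a cross-check on the t.test() output, the test statistic can be recomputed directly from the sample. This is only a sketch: it assumes the hypothesized mean of 20 used in the call above, and the names mu0, se, t_stat, and p_two are illustrative rather than part of the original solution.

```r
# Donation amounts from the problem statement
x <- c(30, 20, 15, 8, 10, 40, 20, 25, 20, 28, 20, 25, 50, 40, 20, 10, 25, 25, 20, 15)

mu0 <- 20                       # hypothesized mean (assumption taken from the t.test() call)
n   <- length(x)
se  <- sd(x) / sqrt(n)          # standard error of the sample mean

t_stat <- (mean(x) - mu0) / se  # one-sample t statistic
p_two  <- 2 * pt(-abs(t_stat), df = n - 1)  # two-sided p-value

round(c(t = t_stat, df = n - 1, p.value = p_two), 4)
```

The rounded values should agree with the t, df, and p-value lines printed by t.test().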

1.2

t.test(x, alternative = c("greater"), mu = 20, conf.level = 0.95)


    One Sample t-test

data:  x
t = 1.3905, df = 19, p-value = 0.09022
alternative hypothesis: true mean is greater than 20
95 percent confidence interval:
 19.19641      Inf
sample estimates:
mean of x 
     23.3

Interpretation:

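Since the p-value (0.09022) is greater than the 0.05 significance level, we fail to reject the null hypothesis; that is, there is not enough evidence to conclude that Maramag residents would donate more than 20 on average. Because the sample mean (23.3) is above 20, "greater" is the appropriate one-tailed alternative.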

t.test(x, alternative = c("less"), mu = 20, conf.level = 0.95)


    One Sample t-test

data:  x
t = 1.3905, df = 19, p-value = 0.9098
alternative hypothesis: true mean is less than 20
95 percent confidence interval:
     -Inf 27.40359
sample estimates:
mean of x 
     23.3

Interpretation:

Since the p-value (0.9098) is greater than the 0.05 significance level, we fail to reject the null hypothesis; that is, there is not enough evidence to conclude that Maramag residents would donate less than 20 on average.
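The three p-values above come from the same t statistic and differ only in which tail of the t distribution is used. The sketch below illustrates that relationship under the same assumed hypothesized mean of 20; the variable names are illustrative.

```r
# Donation amounts from the problem statement
x <- c(30, 20, 15, 8, 10, 40, 20, 25, 20, 28, 20, 25, 50, 40, 20, 10, 25, 25, 20, 15)

t_stat <- unname(t.test(x, mu = 20)$statistic)  # same statistic for all three tests
df     <- length(x) - 1

p_greater <- pt(t_stat, df, lower.tail = FALSE)  # upper tail: alternative = "greater"
p_less    <- pt(t_stat, df)                      # lower tail: alternative = "less"
p_two     <- 2 * min(p_greater, p_less)          # both tails: alternative = "two.sided"

round(c(greater = p_greater, less = p_less, two.sided = p_two), 4)
```

The two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of them.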

Question 2:

Referring to the pgviews data, answer the following:

2.1 Are the data normally distributed?
2.2 Are the variances equal?
2.3 At the 0.05 level of significance, do Site A and Site B differ statistically?

Solution:

library(readxl)
library(rmarkdown)  # provides paged_table()
pgviews <- read_excel("D:/COLLEGE 4TH YEAR/1st SEMESTER/STAT 55 EXPERIMENTAL DESIGN/MIDTERM/pgviews.xlsx")
pgviews

ian <- head(pgviews, 20)
paged_table(ian)

2.1

shapiro.test(ian$Pages)


    Shapiro-Wilk normality test

data:  ian$Pages
W = 0.92449, p-value = 0.1209

Interpretation:

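Since the p-value = 0.1209 is greater than the 0.05 level of significance, we fail to reject the null hypothesis of normality. There is no evidence against normality, so we proceed under the assumption that Pages is normally distributed.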

2.2

str(ian)

tibble [20 × 3] (S3: tbl_df/tbl/data.frame)
 $ Subject: num [1:20] 1 2 3 4 5 6 7 8 9 10 ...
 $ Site   : chr [1:20] "B" "B" "A" "B" ...
 $ Pages  : num [1:20] 2 6 5 7 3 2 6 1 3 4 ...

var.test(Pages ~ Site, ian, alternative = "two.sided")


    F test to compare two variances

data:  Pages by Site
F = 0.60957, num df = 8, denom df = 10, p-value = 0.4948
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.1581284 2.6181715
sample estimates:
ratio of variances 
         0.6095679

Interpretation:

The p-value = 0.4948 is greater than the 0.05 level of significance, so we fail to reject the null hypothesis that the ratio of variances is 1. There is no significant difference between the two variances, and the equal-variance assumption is reasonable.

2.3

t.test(Pages ~ Site, data = ian, var.equal = TRUE)


    Two Sample t-test

data:  Pages by Site
t = -1.5765, df = 18, p-value = 0.1323
alternative hypothesis: true difference in means between group A and group B is not equal to 0
95 percent confidence interval:
 -2.8510432  0.4065987
sample estimates:
mean in group A mean in group B 
       3.777778        5.000000

Interpretation:

The p-value = 0.1323 is greater than the 0.05 level of significance. Hence, we fail to reject the null hypothesis: there is no significant difference in mean Pages between Site A and Site B.

Question 3:

3-1. Two medications for the treatment of panic disorder, Medication 1 and Medication 2, were compared with a placebo. A random sample of 36 patients was obtained from a listing of about 4,000 panic disorder patients who volunteered to participate in clinical trials. The 36 patients were randomly divided into three groups of equal size. The first group received Medication 1 for 10 weeks, the second group received Medication 2 for 10 weeks, and the third group received a placebo pill for 10 weeks. In week 11, all 36 patients were given the 7-item Panic Disorder Severity Scale (PDSC), which is scored on a 0 to 28 scale (lower scores are better). The PDSC scores are given below.

Group 1: 12 10 11 14 15 9 11 12 13 10 15 10
Group 2: 12 13 17 11 16 13 12 14 17 12 16 18
Group 3: 14 21 17 16 17 22 16 22 19 20 18 16

3.1 Are the data normally distributed?
3.2 Are the variances equal?
3.3 At the 0.05 level of significance, do the PDSC scores differ significantly among the different treatments?
3.4 Use R to conduct a multiple comparison test, provided that the result in 3.3 is significant.

Solution:

library(readxl)
Ian_Data_for_Question_3_Midterm_Exam <- read_excel("D:/COLLEGE 4TH YEAR/1st SEMESTER/STAT 55 EXPERIMENTAL DESIGN/MIDTERM/Ian Data for Question 3 Midterm Exam.xlsx")
Ian_Data_for_Question_3_Midterm_Exam

3.1

str(Ian_Data_for_Question_3_Midterm_Exam)

tibble [36 × 2] (S3: tbl_df/tbl/data.frame)
 $ Treatment  : chr [1:36] "Group 1" "Group 1" "Group 1" "Group 1" ...
 $ Observation: num [1:36] 12 10 11 14 15 9 11 12 13 10 ...

shapiro.test(Ian_Data_for_Question_3_Midterm_Exam$Observation)


    Shapiro-Wilk normality test

data:  Ian_Data_for_Question_3_Midterm_Exam$Observation
W = 0.95851, p-value = 0.1933

Interpretation:

The p-value = 0.1933 is greater than the 0.05 significance level, so we fail to reject the null hypothesis of normality. There is no evidence against normality, and we proceed under the normality assumption.
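The test above pools all 36 observations. For a one-way ANOVA the normality assumption applies to the residuals (equivalently, to the scores within each group), so a complementary check is sketched below. The data frame name pdsc is hypothetical; it is rebuilt inline from the scores listed in the problem so that the example does not depend on the Excel file.

```r
# PDSC scores typed in from the problem statement (12 patients per group)
pdsc <- data.frame(
  Treatment   = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15,  9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

# Normality of the one-way ANOVA residuals
fit       <- aov(Observation ~ Treatment, data = pdsc)
res_check <- shapiro.test(residuals(fit))
res_check

# Shapiro-Wilk p-value within each treatment group
by_group <- tapply(pdsc$Observation, pdsc$Treatment,
                   function(v) shapiro.test(v)$p.value)
by_group
```

Each p-value is read the same way as above: a value greater than 0.05 gives no evidence against normality.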

3.2

res <- bartlett.test(Observation ~ Treatment, data = Ian_Data_for_Question_3_Midterm_Exam)
res


    Bartlett test of homogeneity of variances

data:  Observation by Treatment
Bartlett's K-squared = 0.67864, df = 2, p-value = 0.7123

Interpretation:

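Since the p-value = 0.7123 is greater than the 0.05 level of significance, we fail to reject the null hypothesis of equal variances. There is no significant difference among the three group variances, so the homogeneity-of-variance assumption is reasonable.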
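To see where Bartlett's K-squared comes from, the statistic can be rebuilt from the three group variances. This is a minimal sketch using the standard Bartlett formula; the data frame pdsc and the helper names (s2, n_i, sp2, corr, K2) are hypothetical, and the scores are typed in from the problem statement.

```r
# PDSC scores typed in from the problem statement (12 patients per group)
pdsc <- data.frame(
  Treatment   = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15,  9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

s2  <- tapply(pdsc$Observation, pdsc$Treatment, var)     # group variances
n_i <- tapply(pdsc$Observation, pdsc$Treatment, length)  # group sizes
k   <- length(s2)
N   <- sum(n_i)

sp2  <- sum((n_i - 1) * s2) / (N - k)                    # pooled variance
corr <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - k)) / (3 * (k - 1))  # correction factor
K2   <- ((N - k) * log(sp2) - sum((n_i - 1) * log(s2))) / corr  # Bartlett's K-squared
p    <- pchisq(K2, df = k - 1, lower.tail = FALSE)

round(s2, 4)
round(c(K.squared = K2, df = k - 1, p.value = p), 4)
```

The result should agree with the bartlett.test() output; the pooled variance sp2 is also the residual mean square of the one-way ANOVA.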

3.3

ian1 <- head(Ian_Data_for_Question_3_Midterm_Exam, 36)
paged_table(ian1)

library(dplyr)


Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

library(ggpubr)

Warning: package 'ggpubr' was built under R version 4.2.1

Loading required package: ggplot2

Warning: package 'ggplot2' was built under R version 4.2.1

library(gplots)

Warning: package 'gplots' was built under R version 4.2.1


Attaching package: 'gplots'

The following object is masked from 'package:stats':

    lowess

set.seed(123)
ian2 <- dplyr::sample_n(Ian_Data_for_Question_3_Midterm_Exam, 10)
paged_table(ian2)

ian3 <- group_by(ian1, Treatment) %>%
  summarise(count = n(),
            mean = mean(Observation, na.rm = TRUE),
            sd = sd(Observation, na.rm = TRUE))
paged_table(ian3)

ian1$Treatment <- ordered(ian1$Treatment, levels = c("Group 1", "Group 2", "Group 3"))

levels(ian1$Treatment)

[1] "Group 1" "Group 2" "Group 3"

library("ggpubr")
ggboxplot(ian1, x = "Treatment", y = "Observation",
          color = "Treatment", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          order = c("Group 1", "Group 2", "Group 3"),
          ylab = "Observation", xlab = "Treatment")

ggline(ian1, x = "Treatment", y = "Observation",
       add = c("mean_se", "jitter"),
       order = c("Group 1", "Group 2", "Group 3"),
       ylab = "Observation", xlab = "Treatment")

plotmeans(Observation ~ Treatment, data = ian1,
          xlab = "Treatment", ylab = "Observation",
          main = "Mean Plot with 95% CI")

res.aov <- aov(Observation ~ Treatment, data = Ian_Data_for_Question_3_Midterm_Exam)

summary(res.aov)

            Df Sum Sq Mean Sq F value   Pr(>F)    
Treatment    2  245.2  122.58    21.8 9.25e-07 ***
Residuals   33  185.6    5.62                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation:

Based on the ANOVA table, the p-value = 9.25e-07 is less than the 0.05 significance level (flagged with "***" in the model summary), so we reject the null hypothesis of equal means. The mean PDSC scores differ significantly among the treatments.
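The entries of the ANOVA table can be reproduced by hand from the between-group and within-group sums of squares. The sketch below does this for the scores given in the problem; the data frame pdsc and the names ss_between, ss_within, and F_stat are hypothetical.

```r
# PDSC scores typed in from the problem statement (12 patients per group)
pdsc <- data.frame(
  Treatment   = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15,  9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

grand <- mean(pdsc$Observation)                          # grand mean
means <- tapply(pdsc$Observation, pdsc$Treatment, mean)  # group means
n_i   <- tapply(pdsc$Observation, pdsc$Treatment, length)
k     <- length(means)
N     <- nrow(pdsc)

# Between-group (Treatment) and within-group (Residuals) sums of squares
ss_between <- sum(n_i * (means - grand)^2)
ss_within  <- sum((pdsc$Observation - ave(pdsc$Observation, pdsc$Treatment))^2)

ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (N - k)
F_stat     <- ms_between / ms_within
p_value    <- pf(F_stat, k - 1, N - k, lower.tail = FALSE)

round(c(SS.trt = ss_between, SS.res = ss_within, F = F_stat), 2)
p_value
```

The sums of squares, F value, and p-value should agree with the summary(res.aov) table.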

3.4

TukeyHSD(res.aov)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Observation ~ Treatment, data = Ian_Data_for_Question_3_Midterm_Exam)

$Treatment
                    diff        lwr      upr     p adj
Group 2-Group 1 2.416667 0.04105701 4.792276 0.0454940
Group 3-Group 1 6.333333 3.95772367 8.708943 0.0000006
Group 3-Group 2 3.916667 1.54105701 6.292276 0.0008425

Interpretation:

All three pairwise differences are significant at the 0.05 level. Group 2 differs from Group 1 (adjusted p-value = 0.0454940), Group 3 differs from Group 1 (adjusted p-value = 0.0000006), and Group 3 differs from Group 2 (adjusted p-value = 0.0008425). Since lower scores are better and every difference is positive, Medication 1 (Group 1) has the lowest mean score, followed by Medication 2 (Group 2), with the placebo (Group 3) highest.
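The Tukey interval and adjusted p-value for one comparison can be rebuilt from the ANOVA residual mean square and the studentized range distribution. The sketch below does this for Group 2 versus Group 1, assuming equal group sizes of 12; the data frame pdsc and the helper names are hypothetical.

```r
# PDSC scores typed in from the problem statement (12 patients per group)
pdsc <- data.frame(
  Treatment   = rep(c("Group 1", "Group 2", "Group 3"), each = 12),
  Observation = c(12, 10, 11, 14, 15,  9, 11, 12, 13, 10, 15, 10,
                  12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18,
                  14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)
)

fit    <- aov(Observation ~ Treatment, data = pdsc)
df_res <- df.residual(fit)
mse    <- deviance(fit) / df_res   # residual mean square
n      <- 12                       # patients per group (equal group sizes)
k      <- 3                        # number of groups

means   <- tapply(pdsc$Observation, pdsc$Treatment, mean)
diff_21 <- means[["Group 2"]] - means[["Group 1"]]

se_q       <- sqrt(mse / n)                                  # scale for the studentized range
half_width <- qtukey(0.95, nmeans = k, df = df_res) * se_q   # 95% family-wise half-width
p_adj      <- ptukey(abs(diff_21) / se_q, nmeans = k, df = df_res,
                     lower.tail = FALSE)

round(c(diff = diff_21, lwr = diff_21 - half_width,
        upr = diff_21 + half_width, p.adj = p_adj), 4)
```

The result should agree with the "Group 2-Group 1" row of the TukeyHSD() output; the other two rows follow by swapping in the corresponding pair of means.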

MIDTERM

Ian Duhaylungsod

2022-11-29
