1. A COVID frontliner wants to estimate the average amount that a resident of Maramag would donate to COVID-affected families in Maramag. Twenty residents were randomly selected from the Municipality of Maramag. The 20 randomly selected residents were contacted by telephone and asked how much they would be willing to donate. Their responses are given below.

30 20 15 10 10 60 20 25 20 30 10 5 50 40 20 10 10 0 20 50

  1. Using R, test at the 0.05 level of significance whether the average donation is greater than Php 20.00.
res = c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30, 10, 5, 50, 40, 20, 10, 10, 0, 20, 50)
mean(res)
## [1] 22.75
# One-sided (upper-tailed) hypothesis test
# H0: mu <= 20  vs.  H1: mu > 20
t.test(res, alternative="greater", mu = 20, conf.level = 0.95)
## 
##  One Sample t-test
## 
## data:  res
## t = 0.75633, df = 19, p-value = 0.2294
## alternative hypothesis: true mean is greater than 20
## 95 percent confidence interval:
##  16.4629     Inf
## sample estimates:
## mean of x 
##     22.75

Since the p-value (0.2294) is greater than the significance level of 0.05, we fail to reject the null hypothesis. Hence, there is not enough evidence to conclude that Maramag residents would donate more than Php 20.00 on average.
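
To show where the t statistic and p-value reported by `t.test()` come from, the same one-sample test can be reproduced step by step. This is a minimal sketch, assuming the one-sided alternative (mean greater than 20) used in the call above; the names `donations`, `mu0`, `t_stat`, `p_value`, and `lower_bound` are introduced here for illustration only.

```r
# Donations reported by the 20 sampled residents (same values as `res` above)
donations <- c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30,
               10, 5, 50, 40, 20, 10, 10, 0, 20, 50)

mu0 <- 20                        # hypothesized mean under H0 (taken from the t.test call)
n   <- length(donations)         # sample size
se  <- sd(donations) / sqrt(n)   # standard error of the sample mean

# t statistic: distance of the sample mean from mu0 in standard-error units
t_stat <- (mean(donations) - mu0) / se

# Upper-tail p-value, because H1 states that the mean is greater than 20
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval
lower_bound <- mean(donations) - qt(0.95, df = n - 1) * se

c(t = t_stat, df = n - 1, p = p_value, lower = lower_bound)
```

These values should agree with the `t.test()` output above (t = 0.75633, df = 19, p-value = 0.2294, lower bound 16.4629).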

2. Refer to the pgviews data and answer the following:
   2.1. Using the first 20 observations, manually compute the test for a significant difference between site A and site B in the number of pages viewed. Use the 0.05 level of significance.
   2.2. Repeat the analysis in question 2.1 using R, at the 0.05 level of significance.
library(readxl)
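# Import the page-view data (the path is specific to the author's machine)
pgviews <- read_excel("C:/Users/leocint/Desktop/Leocint/4th Year College(1st Sem.)/Experimental Design/Activities/4/1/pgviews.xlsx")
View(pgviews)
# Keep only the first 20 observations for this analysis
New <- head(pgviews, 20)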
New
## # A tibble: 20 x 3
##    Subject Site  Pages
##      <dbl> <chr> <dbl>
##  1       1 B         2
##  2       2 B         6
##  3       3 A         5
##  4       4 B         7
##  5       5 A         3
##  6       6 B         2
##  7       7 B         6
##  8       8 A         1
##  9       9 A         3
## 10      10 A         4
## 11      11 B         6
## 12      12 B         6
## 13      13 B         4
## 14      14 A         5
## 15      15 A         3
## 16      16 A         6
## 17      17 B         6
## 18      18 B         3
## 19      19 A         4
## 20      20 B         7
library(ggpubr)
## Loading required package: ggplot2
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
group_by(New, Site) %>%
  summarise(
    count = n(),
    mean = mean(Pages, na.rm = TRUE),
    sd = sd(Pages, na.rm = TRUE)
  )
## # A tibble: 2 x 4
##   Site  count  mean    sd
##   <chr> <int> <dbl> <dbl>
## 1 A         9  3.78  1.48
## 2 B        11  5     1.90
# Plot Pages by Site, colored by Site

ggboxplot(New, x = "Site", y = "Pages", 
          color = "Site", palette = c("#00AFBB", "#E7B800"),
        ylab = "Pages", xlab = "Site")

Preliminary tests to check the independent t-test assumptions

Assumption 1: Are the two samples independent? Yes, since the samples from A and B are not related.

# Shapiro-Wilk normality test for A's Pages
with(New, shapiro.test(Pages[Site == "A"]))  # p = 0.6874
## 
##  Shapiro-Wilk normality test
## 
## data:  Pages[Site == "A"]
## W = 0.94976, p-value = 0.6874
# Shapiro-Wilk normality test for B's Pages
with(New, shapiro.test(Pages[Site == "B"]))  # p = 0.0156
## 
##  Shapiro-Wilk normality test
## 
## data:  Pages[Site == "B"]
## W = 0.81668, p-value = 0.0156
# F-test to compare the variances of the two sites
res.ftest <- var.test(Pages ~ Site, data = New)
res.ftest
## 
##  F test to compare two variances
## 
## data:  Pages by Site
## F = 0.60957, num df = 8, denom df = 10, p-value = 0.4948
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1581284 2.6181715
## sample estimates:
## ratio of variances 
##          0.6095679

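The p-value of the F-test is 0.4948, which is greater than the significance level alpha = 0.05. We conclude that there is no significant difference between the variances of the two groups, so we can use the classic t-test, which assumes equal variances. Note, however, that the Shapiro-Wilk test above rejects normality for site B (p = 0.0156) but not for site A (p = 0.6874), so the F-test and t-test results should be interpreted with some caution.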
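
The F statistic reported by `var.test()` is simply the ratio of the two sample variances. The following is a minimal sketch of that computation, assuming the Pages values are typed in directly from the 20-row table above; the names `pages_a`, `pages_b`, `f_stat`, and `p_value` are illustrative and do not appear in the original analysis.

```r
# Pages for the first 20 observations, split by site (copied from the table above)
pages_a <- c(5, 3, 1, 3, 4, 5, 3, 6, 4)
pages_b <- c(2, 6, 7, 2, 6, 6, 6, 4, 6, 3, 7)

# F statistic: ratio of the sample variances (site A over site B)
f_stat <- var(pages_a) / var(pages_b)

# Numerator and denominator degrees of freedom
df_a <- length(pages_a) - 1
df_b <- length(pages_b) - 1

# Two-sided p-value: twice the smaller of the two tail probabilities
lower_tail <- pf(f_stat, df_a, df_b)
p_value <- 2 * min(lower_tail, 1 - lower_tail)

c(F = f_stat, num_df = df_a, denom_df = df_b, p = p_value)
```

The result should match the `var.test()` output above (F = 0.60957 on 8 and 10 degrees of freedom, p-value = 0.4948).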

# Compute t-test
res <- t.test(Pages ~ Site, data = New, var.equal = TRUE, conf.level = 0.95)
res
## 
##  Two Sample t-test
## 
## data:  Pages by Site
## t = -1.5765, df = 18, p-value = 0.1323
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.8510432  0.4065987
## sample estimates:
## mean in group A mean in group B 
##        3.777778        5.000000

Interpretation of the result

The p-value of the test is 0.1323, which is greater than the significance level alpha = 0.05. Hence, we do not have enough evidence to conclude that the average Pages for site A differs significantly from the average Pages for site B.
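
For the manual computation asked for in question 2.1, the pooled-variance t statistic can be worked out step by step. This is a sketch under the equal-variance assumption used above, with the Pages values typed in from the 20-row table; the variable names (`pages_a`, `pages_b`, `sp2`, `se_diff`, and so on) are illustrative only.

```r
# Pages for the first 20 observations, split by site (copied from the table above)
pages_a <- c(5, 3, 1, 3, 4, 5, 3, 6, 4)
pages_b <- c(2, 6, 7, 2, 6, 6, 6, 4, 6, 3, 7)
n_a <- length(pages_a)
n_b <- length(pages_b)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n_a - 1) * var(pages_a) + (n_b - 1) * var(pages_b)) / (n_a + n_b - 2)

# Standard error of the difference in means
se_diff <- sqrt(sp2 * (1 / n_a + 1 / n_b))

# t statistic and its degrees of freedom
mean_diff <- mean(pages_a) - mean(pages_b)
t_stat <- mean_diff / se_diff
df <- n_a + n_b - 2

# Two-sided p-value and 95% confidence interval for the difference
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
ci <- mean_diff + c(-1, 1) * qt(0.975, df) * se_diff

c(t = t_stat, df = df, p = p_value, lower = ci[1], upper = ci[2])
```

These figures should reproduce the `t.test()` output above (t = -1.5765, df = 18, p-value = 0.1323), so the manual and R-based analyses lead to the same decision.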

3. Two medications for the treatment of panic disorder, Medication 1 and Medication 2, were compared with a placebo. A random sample of 36 patients was obtained from a listing of about 4,000 panic disorder patients who volunteered to participate in clinical trials. The 36 patients were randomly divided into three groups of equal size. The first group received Medication 1 for 10 weeks, the second group received Medication 2 for 10 weeks, and the third group received a placebo pill for 10 weeks. In week 11, all 36 patients were given the 7-item Panic Disorder Severity Scale (PDSC), which is scored on a 0 to 28 scale (lower scores are better). The PDSC scores are given below.

Group 1: 12 10 11 14 15 9 11 12 13 10 15 10
Group 2: 12 13 17 11 16 13 12 14 17 12 16 18
Group 3: 14 21 17 16 17 22 16 22 19 20 18

3.2. Use R to carry out the analysis in question 3.1, at the 0.05 level of significance.
3.3. Provide the ANOVA table.

library(readxl)
Data <- read_excel("C:/Users/leocint/Desktop/Leocint/4th Year College(1st Sem.)/Experimental Design/Activities/5/Data.xlsx")
View(Data)
group_by(Data, Group) %>%
  summarise(
    count = n(),
    mean = mean(Score, na.rm = TRUE),
    sd = sd(Score, na.rm = TRUE)
  )
## # A tibble: 3 x 4
##   Group   count  mean    sd
##   <chr>   <int> <dbl> <dbl>
## 1 Group 1    12  11.8  2.04
## 2 Group 2    12  14.2  2.42
## 3 Group 3    12  18.2  2.62
ggboxplot(Data, x = "Group", y = "Score", 
          color = "Group", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          order = c("Group 1", "Group 2", "Group 3"),
          ylab = "Score", xlab = "Groups")

# Compute the analysis of variance
res.aov <- aov(Score ~ Group, data = Data)
# Summary of the analysis
summary(res.aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Group        2  245.2  122.58    21.8 9.25e-07 ***
## Residuals   33  185.6    5.62                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The output includes the columns F value (the test statistic) and Pr(>F) (the p-value of the test).
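
The columns of the ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, the F value is the ratio of the two mean squares, and Pr(>F) is the upper-tail probability of that F value. The sketch below recomputes these quantities from the rounded figures printed in the table, so small rounding differences from the `summary()` output are expected; all variable names are illustrative.

```r
# Values copied from the printed ANOVA table (rounded, so results are approximate)
df_group <- 2
ss_group <- 245.2
df_resid <- 33
ss_resid <- 185.6

# Mean squares: sum of squares divided by degrees of freedom
ms_group <- ss_group / df_group
ms_resid <- ss_resid / df_resid

# F value: between-group mean square over residual mean square
f_value <- ms_group / ms_resid

# Pr(>F): upper-tail probability of the F distribution
p_value <- pf(f_value, df_group, df_resid, lower.tail = FALSE)

# Critical value at the 0.05 level, for comparison with f_value
f_crit <- qf(0.95, df_group, df_resid)

c(MS_group = ms_group, MS_resid = ms_resid, F = f_value, p = p_value, F_crit = f_crit)
```

Comparing `f_value` with `f_crit` gives the same decision as comparing `p_value` with 0.05.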

Interpretation of the one-way ANOVA result

As the p-value (9.25e-07) is less than the significance level 0.05, we conclude that there is a significant difference in mean PDSC scores among groups 1, 2, and 3, as flagged by "***" in the model summary. The ANOVA by itself does not indicate which pairs of groups differ.