This homework has two parts. Part 1 asks you to write your own functions. Part 2 applies hypothesis tests to two real scenarios.


Part 1 — Functions

# Q1. Write a function called calculate_area_of_rectangle that takes two parameters
#     (length, width) and returns the area (area = length * width).
#     Test it with 2 different inputs.
calculate_area_of_rectangle <- function(length, width) {
  # Both dimensions must be positive for the area to be meaningful.
  if (length > 0 && width > 0) {
    area <- length * width
    return(area)
  } else {
    print("dimensions must be positive")
  }
}

test1 <- calculate_area_of_rectangle(5, 4)
print(test1)
## [1] 20
test2 <- calculate_area_of_rectangle(9, 8)
print(test2)
## [1] 72
test3 <- calculate_area_of_rectangle(9, 2)
print(test3)
## [1] 18

# Q2. Write a function called calculate_average that takes a numeric vector and returns
#     its average. Handle the case of an empty vector by printing a message.
#     (Hint: use if/else and length(x) == 0)
calculate_average <- function(x){
  if (length(x) == 0){
    print("the vector is empty")
  } else {
    return(mean(x))
  }
}

test1 <- calculate_average(c(10,20,30,40))
print(test1)
## [1] 25
test2 <- calculate_average(c(5,8,12))
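print(test2)
## [1] 8.333333
# The message appears twice below: once from print() inside the function, and
# once more because print() returns its argument, so test3 stores the message.
test3 <- calculate_average(c())
## [1] "the vector is empty"
print(test3)
## [1] "the vector is empty"
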
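Because print() returns its argument, the empty-vector call above leaves a character string in test3 rather than a number. One possible variant, sketched here under the assumption that a missing value is an acceptable result for an empty vector, reports the problem with message() and returns NA_real_. The name calculate_average_safe is hypothetical and not part of the assignment.

```r
# Hypothetical variant of calculate_average: the name and the NA return
# value are assumptions made for illustration only.
calculate_average_safe <- function(x) {
  if (length(x) == 0) {
    # message() writes to the console without becoming the return value
    message("the vector is empty")
    return(NA_real_)
  }
  mean(x)
}

result_empty <- calculate_average_safe(c())
result_full <- calculate_average_safe(c(10, 20, 30, 40))
```

With this design the caller always gets a numeric result back and can check it with is.na().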
# Q3. Write a function called check_even_odd that takes an integer and prints whether
#     it is "Even" or "Odd".
#     Test it on 14 and 27.
#     (Hint: use the %% modulus operator)

check_even_odd <- function(x) {
  if (x %% 2 == 0) {
    print("Even")
  } else {
    print("Odd")
  }
}

test1 <- check_even_odd(14)
## [1] "Even"
test2 <- check_even_odd(27)
## [1] "Odd"
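The if/else version above handles one integer per call. As a minimal sketch, assuming it is acceptable to return the labels instead of printing them, the same %% check can be applied to a whole vector with ifelse(). The name check_even_odd_vec is hypothetical.

```r
# Hypothetical vectorised variant: returns "Even"/"Odd" labels
# for every element of x instead of printing a single label.
check_even_odd_vec <- function(x) {
  ifelse(x %% 2 == 0, "Even", "Odd")
}

labels <- check_even_odd_vec(c(14, 27))
```

Here labels holds one entry per input value, in the same order as the input.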

Part 2 — Hypothesis Testing

Problem 1 — Two-Proportion z-test

In 2017, of the 144,790 students who took the AP Biology exam, 84,200 were female. That same year, of the 211,693 students who took the AP Calculus AB exam, 102,598 were female.

Is there enough evidence to show that the proportion of female students taking the Biology exam is HIGHER than the proportion taking the Calculus AB exam? Test at the 5% level.

State your hypotheses:

  • H₀: p₁ = p₂
  • H₁: p₁ > p₂

where:

  • p₁ = proportion of AP Biology exam takers who were female
  • p₂ = proportion of AP Calculus AB exam takers who were female

# Q4. Run the appropriate two-proportion test.
#     (Hint: prop.test(c(84200, 102598), c(144790, 211693), alternative = "greater"))
prop.test(c(84200, 102598), 
        c(144790, 211693),
        alternative = "greater")
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(84200, 102598) out of c(144790, 211693)
## X-squared = 3234.9, df = 1, p-value < 2.2e-16
## alternative hypothesis: greater
## 95 percent confidence interval:
##  0.09408942 1.00000000
## sample estimates:
##    prop 1    prop 2 
## 0.5815319 0.4846547

Q5. What is the p-value? At α = 0.05, do you reject H₀?

The p-value is less than 2.2e-16. Since this is below α = 0.05, we reject H₀.

Q6. Write your conclusion in plain English (one or two sentences): There is sufficient evidence at the 5% significance level to conclude that the proportion of female students taking the AP Biology exam is higher than the proportion of female students taking the AP Calculus AB exam.
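To make the test in Q4 less of a black box, the two-proportion z statistic can be computed by hand from the counts in the problem statement. This is a minimal sketch, not the internals of prop.test(): it uses the pooled proportion under H₀ and no continuity correction, and the variable names are chosen for illustration.

```r
# Counts taken from the problem statement.
x <- c(84200, 102598)     # female test takers: Biology, Calculus AB
n <- c(144790, 211693)    # all test takers:    Biology, Calculus AB

p_hat  <- x / n                       # sample proportions
p_pool <- sum(x) / sum(n)             # pooled proportion under H0
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z      <- (p_hat[1] - p_hat[2]) / se  # z statistic, no continuity correction

# One-sided p-value for H1: p1 > p2
p_value <- pnorm(z, lower.tail = FALSE)

# For comparison: X-squared from prop.test() without the continuity correction
chisq_uncorrected <- unname(
  prop.test(x, n, alternative = "greater", correct = FALSE)$statistic
)
```

The square of z should agree with chisq_uncorrected. The X-squared of 3234.9 reported above is slightly smaller because prop.test() applies a continuity correction by default.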


Problem 2 — Paired t-test

A vitamin K shot is given to infants soon after birth. Researchers want to know whether the way infants are handled can reduce the pain. They measured how long (in seconds) each infant cried after the shot. One group received the shot the conventional way; the other group received it while the mother held the infant.

Is there enough evidence to show that infants cried LESS on average when held by their mothers vs. the conventional method? Test at the 5% level.

Old <- c(63, 0, 2, 46, 33, 33, 29, 23, 11, 12, 48, 15, 33, 14, 51,
         37, 24, 70, 63, 0, 73, 39, 54, 52, 39, 34, 30, 55, 58, 18)

New <- c(0, 32, 20, 23, 14, 19, 60, 59, 64, 64, 72, 50, 44, 14, 10,
         58, 19, 41, 17, 5, 36, 73, 19, 46, 9, 43, 73, 27, 25, 18)

State your hypotheses:

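  • H₀: μ_d = 0
  • H₁: μ_d > 0

where μ_d is the mean of the paired differences Old - New in crying time (seconds), so μ_d > 0 means infants cried less with the new method.

# Q7. Run a paired t-test.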
#     (Hint: t.test(Old, New, paired = TRUE))


t.test(Old, New, paired = TRUE, alternative = "greater")
## 
##  Paired t-test
## 
## data:  Old and New
## t = 0.028519, df = 29, p-value = 0.4887
## alternative hypothesis: true mean difference is greater than 0
## 95 percent confidence interval:
##  -9.762971       Inf
## sample estimates:
## mean difference 
##       0.1666667

Q8. What is the p-value? At α = 0.05, do you reject H₀?

The p-value is 0.4887. Since the p-value is greater than α = 0.05, we do not reject H₀.

Q9. Write your conclusion in plain English. Does the data support the claim that the new method reduces crying time? No. There is not enough evidence at the 5% significance level to conclude that the new method reduces infant crying time, so we do not reject the null hypothesis.
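The paired t-test in Q7 reduces to a one-sample t-test on the differences Old - New. The sketch below recomputes the statistic and one-sided p-value by hand from the same data; the variable names d, t_stat, and p_value are chosen for illustration.

```r
Old <- c(63, 0, 2, 46, 33, 33, 29, 23, 11, 12, 48, 15, 33, 14, 51,
         37, 24, 70, 63, 0, 73, 39, 54, 52, 39, 34, 30, 55, 58, 18)

New <- c(0, 32, 20, 23, 14, 19, 60, 59, 64, 64, 72, 50, 44, 14, 10,
         58, 19, 41, 17, 5, 36, 73, 19, 46, 9, 43, 73, 27, 25, 18)

d <- Old - New                 # paired differences
n <- length(d)                 # number of pairs

# t statistic: mean difference divided by its standard error
t_stat <- mean(d) / (sd(d) / sqrt(n))

# One-sided p-value for H1: mu_d > 0, with n - 1 degrees of freedom
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Built-in result for comparison
reference <- t.test(Old, New, paired = TRUE, alternative = "greater")
```

t_stat and p_value should agree with the statistic and p-value stored in reference, which are the values shown in the Q7 output.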