This homework has two parts. Part 1 asks you to write your own functions. Part 2 applies hypothesis tests to two real scenarios.


Part 1 — Functions

# Q1. Write a function called calculate_area_of_rectangle that takes two parameters
#     (length, width) and returns the area (area = length * width).
#     Test it with 2 different inputs.

calculate_area_of_rectangle <- function(length, width) {
  # The last evaluated expression is the function's return value
  length * width
}
calculate_area_of_rectangle(5, 24)
## [1] 120
calculate_area_of_rectangle(3, 11)
## [1] 33
# Check both results by computing the areas directly
area_check_1 <- 5 * 24
area_check_2 <- 3 * 11

area_check_1
## [1] 120
area_check_2
## [1] 33
# Q2. Write a function called calculate_average that takes a numeric vector and returns
#     its average. Handle the case of an empty vector by printing a message.
#     (Hint: use if/else and length(x) == 0)

calculate_average <- function(x) {
  if (length(x) == 0) {
    # Empty input: print a message instead of calling mean()
    print("This vector is empty.")
  } else {
    mean(x)
  }
}

calculate_average(c(10, 20, 30, 40)) # should give 25, from exercise activity
## [1] 25
calculate_average(c())
## [1] "This vector is empty."
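One side effect of the version above is that an empty vector makes the function return the message string itself, so the result is a character value rather than a number. A minimal sketch of an alternative, assuming we would rather keep the return type numeric: the name `calculate_average_safe` is hypothetical and not part of the assignment, and `message()` with an `NA_real_` return is just one reasonable choice.

```r
# Hypothetical variant (not required by the assignment):
# report the problem with message() and return a numeric NA,
# so callers always get a numeric result back.
calculate_average_safe <- function(x) {
  if (length(x) == 0) {
    message("This vector is empty.")
    return(NA_real_)
  }
  mean(x)
}

calculate_average_safe(c(10, 20, 30, 40))
calculate_average_safe(numeric(0))
```

With this design, later arithmetic on the result does not fail on a character value; the cost is that the caller has to check for `NA`.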
# Q3. Write a function called check_even_odd that takes an integer and prints whether
#     it is "Even" or "Odd".
#     Test it on 14 and 27.
#     (Hint: use the %% modulus operator)

check_even_odd <- function(x) {
  if (x %% 2 == 0) {
    print("Even")
  } else if (x %% 2 == 1) {
    # For integer input a plain `else` would work here too,
    # because the remainder after dividing by 2 can only be 0 or 1.
    print("Odd")
  }
}

check_even_odd(14)
## [1] "Even"
check_even_odd(27)
## [1] "Odd"
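The function above handles one integer at a time, because `if` expects a single TRUE or FALSE. As a sketch of how the same `%%` check could be applied to a whole vector, the hypothetical helper `even_or_odd` below (not part of the assignment) uses `ifelse()` and returns the labels instead of printing them.

```r
# Hypothetical vectorized variant: returns "Even"/"Odd" for each element.
# Assumes x contains whole numbers.
even_or_odd <- function(x) {
  ifelse(x %% 2 == 0, "Even", "Odd")
}

even_or_odd(c(14, 27))
```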

Part 2 — Hypothesis Testing

Problem 1 — Two-Proportion z-test

In 2017, of the 144,790 students who took the AP Biology exam, 84,200 were female. That same year, of the 211,693 students who took the AP Calculus AB exam, 102,598 were female.

Is there enough evidence to show that the proportion of female students taking the Biology exam is HIGHER than the proportion taking the Calculus AB exam? Test at the 5% level.

State your hypotheses:

Let p₁ be the proportion of students who took the AP Biology exam that were female. Let p₂ be the proportion of students who took the AP Calculus AB exam that were female.

  • H₀: p₁ = p₂
  • H₁: p₁ > p₂
# Q4. Run the appropriate two-proportion test.
#     (Hint: prop.test(c(84200, 102598), c(144790, 211693), alternative = "greater"))

prop.test(c(84200, 102598), c(144790, 211693), alternative = "greater")
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(84200, 102598) out of c(144790, 211693)
## X-squared = 3234.9, df = 1, p-value < 2.2e-16
## alternative hypothesis: greater
## 95 percent confidence interval:
##  0.09408942 1.00000000
## sample estimates:
##    prop 1    prop 2 
## 0.5815319 0.4846547
# Q5. What is the p-value? At α = 0.05, do you reject H₀?

# The p-value is reported as p-value < 2.2e-16, which is essentially zero and far smaller than 0.05. At α = 0.05, we reject the null hypothesis H₀ that p₁, the proportion of students who took the AP Biology exam that were female, equals p₂, the proportion of students who took the AP Calculus AB exam that were female.
# We reject H₀ in favor of H₁: the proportion of female students taking the AP Biology exam is greater than the proportion of female students taking the AP Calculus AB exam.
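To see where the test result comes from, the same comparison can be worked by hand as a pooled two-proportion z-test. This is a sketch under one stated assumption: it omits the continuity correction that `prop.test()` applies by default, so its squared z statistic matches `prop.test(..., correct = FALSE)` rather than the X-squared value printed above. The variable names are illustrative.

```r
# Counts taken from the problem statement
x <- c(84200, 102598)    # female test takers: Biology, Calculus AB
n <- c(144790, 211693)   # all test takers:    Biology, Calculus AB

p_hat  <- x / n              # sample proportions
p_pool <- sum(x) / sum(n)    # pooled proportion under H0: p1 = p2

# Standard error of the difference under H0
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

# z statistic and one-sided (upper-tail) p-value for H1: p1 > p2
z       <- (p_hat[1] - p_hat[2]) / se
p_value <- pnorm(z, lower.tail = FALSE)

z
p_value

# Without the continuity correction, z^2 equals the chi-squared statistic
uncorrected <- prop.test(x, n, alternative = "greater", correct = FALSE)
all.equal(z^2, unname(uncorrected$statistic))
```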

Q6. Write your conclusion in plain English (one or two sentences):

We have strong evidence that the proportion of AP Biology exam takers who are female is greater than the proportion of AP Calculus AB exam takers who are female. If the two true proportions were exactly equal, the probability of seeing a difference as large as, or larger than, the one we observed (about 58.2% versus 48.5%) would be almost zero.


Problem 2 — Paired t-test

A vitamin K shot is given to infants soon after birth. Researchers want to know whether the way the infants are handled can reduce the pain. They measured how long (in seconds) the infant cried after the shot. One group received the shot the conventional way; the other group received it while the mother held the infant.

Is there enough evidence to show that infants cried LESS on average when held by their mothers vs. the conventional method? Test at the 5% level.

Old <- c(63, 0, 2, 46, 33, 33, 29, 23, 11, 12, 48, 15, 33, 14, 51,
         37, 24, 70, 63, 0, 73, 39, 54, 52, 39, 34, 30, 55, 58, 18)

New <- c(0, 32, 20, 23, 14, 19, 60, 59, 64, 64, 72, 50, 44, 14, 10,
         58, 19, 41, 17, 5, 36, 73, 19, 46, 9, 43, 73, 27, 25, 18)

State your hypotheses:

  • H₀: μ_d = μ_old - μ_new = 0. There is no difference in mean crying time between the conventional method and mothers holding their infants.

  • H₁: μ_d = μ_old - μ_new > 0. The mean crying time under the conventional method minus the mean crying time when mothers hold their infants is greater than zero; that is, infants cry less on average when held.

# Q7. Run a paired t-test.
#     (Hint: t.test(Old, New, paired = TRUE))

t.test(Old, New, paired = TRUE, alternative = "greater")
## 
##  Paired t-test
## 
## data:  Old and New
## t = 0.028519, df = 29, p-value = 0.4887
## alternative hypothesis: true mean difference is greater than 0
## 95 percent confidence interval:
##  -9.762971       Inf
## sample estimates:
## mean difference 
##       0.1666667
# Q8. What is the p-value? At α = 0.05, do you reject H₀?


# The p-value = 0.4887 (using the one-sided test to match our alternative hypothesis). Since 0.4887 is much greater than α = 0.05, we fail to reject H₀.
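The paired t-test above reduces to a one-sample t-test on the pairwise differences. The sketch below recomputes the statistic and the one-sided p-value from that definition and compares them with `t.test()`. The variable names `d`, `t_stat`, and `p_value` are illustrative.

```r
Old <- c(63, 0, 2, 46, 33, 33, 29, 23, 11, 12, 48, 15, 33, 14, 51,
         37, 24, 70, 63, 0, 73, 39, 54, 52, 39, 34, 30, 55, 58, 18)

New <- c(0, 32, 20, 23, 14, 19, 60, 59, 64, 64, 72, 50, 44, 14, 10,
         58, 19, 41, 17, 5, 36, 73, 19, 46, 9, 43, 73, 27, 25, 18)

d <- Old - New     # pairwise differences (conventional minus held)
n <- length(d)

# t statistic: mean difference divided by its standard error
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Upper-tail p-value for H1: mean difference > 0, with n - 1 degrees of freedom
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

mean(d)
t_stat
p_value

# Compare with the built-in test
fit <- t.test(Old, New, paired = TRUE, alternative = "greater")
all.equal(t_stat, unname(fit$statistic))
all.equal(p_value, fit$p.value)
```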

Q9. Write your conclusion in plain English. Does the data support the claim that the new method reduces crying time?

Because the p-value is large (0.4887), the data do not provide enough evidence to reject the null hypothesis, so they do not support the claim that infants cry less when held by their mothers. The observed mean difference of about 0.17 seconds could easily be due to chance alone.

Put another way, if the null hypothesis of no real difference in mean crying time is true, there is about a 49% chance of seeing a mean difference this large or larger in favor of the held infants purely from random sampling variation, far above the 5% threshold.