Homework 4 — Functions + Hypothesis Testing

This homework has two parts. Part 1 asks you to write your own functions. Part 2 applies hypothesis tests to two real scenarios.

Part 1 — Functions

# Q1. Write a function called calculate_area_of_rectangle that takes two parameters
#     (length, width) and returns the area (area = length * width).
#     Test it with 2 different inputs.
calculate_area_of_rectangle <- function(length, width) {
  length * width
}

calculate_area_of_rectangle(5, 3)

## [1] 15

calculate_area_of_rectangle(10, 2)

## [1] 20

# Q2. Write a function called calculate_average that takes a numeric vector and returns
#     its average. Handle the case of an empty vector by printing a message.
#     (Hint: use if/else and length(x) == 0)
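calculate_average <- function(x) {
  # Guard against an empty vector, where sum(x) / length(x) would be 0 / 0.
  if (length(x) == 0) {
    print("Cannot compute the average: the vector is empty.")
  } else {
    # Arithmetic mean: total divided by the number of elements.
    sum(x) / length(x)
  }
}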

calculate_average(c(4, 8, 15, 16, 23, 42))

## [1] 18

calculate_average(numeric(0))

## [1] "Cannot compute the average: the vector is empty."
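
To see why the empty-vector check in calculate_average matters, the sketch below shows what the unguarded formula gives. It also shows one possible variant that signals the problem with message() and returns NA instead of printing. The name calculate_average_na is hypothetical and not part of the assignment.

```r
# Without a guard, an empty vector leads to 0 / 0, which R evaluates to NaN.
x <- numeric(0)
unguarded <- sum(x) / length(x)
is.nan(unguarded)   # TRUE
is.nan(mean(x))     # base R's mean() also gives NaN for an empty vector

# Hypothetical variant (an assumption, not the assigned solution):
# report the problem with message() and return a missing value,
# so code that calls the function still gets a numeric result.
calculate_average_na <- function(x) {
  if (length(x) == 0) {
    message("Cannot compute the average: the vector is empty.")
    return(NA_real_)
  }
  sum(x) / length(x)
}

calculate_average_na(c(4, 8, 15, 16, 23, 42))
calculate_average_na(numeric(0))
```

Returning NA rather than a character string keeps the return type numeric in both branches, which can make the function easier to use inside larger calculations.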

# Q3. Write a function called check_even_odd that takes an integer and prints whether
#     it is "Even" or "Odd".
#     Test it on 14 and 27.
#     (Hint: use the %% modulus operator)
check_even_odd <- function(n) {
  if (n %% 2 == 0) {
    print("Even")
  } else {
    print("Odd")
  }
}

check_even_odd(14)

## [1] "Even"

check_even_odd(27)

## [1] "Odd"
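
check_even_odd handles one integer at a time, because if () expects a single TRUE or FALSE. A minimal sketch of a vectorized alternative, assuming you want to label several integers in one call, is below. The name check_even_odd_vec is hypothetical, and it returns the labels rather than printing them.

```r
# Hypothetical vectorized variant: ifelse() applies the test to each element.
check_even_odd_vec <- function(n) {
  ifelse(n %% 2 == 0, "Even", "Odd")
}

# Both test values from Q3 in one call.
labels <- check_even_odd_vec(c(14, 27))
labels
```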

Part 2 — Hypothesis Testing

Problem 1 — Two-Proportion z-test

In 2017, of the 144,790 students who took the AP Biology exam, 84,200 were female. That same year, of the 211,693 students who took the AP Calculus AB exam, 102,598 were female.

Is there enough evidence to show that the proportion of female students taking the Biology exam is HIGHER than the proportion taking the Calculus AB exam? Test at the 5% level.

State your hypotheses:

H₀: p₁ = p₂ (the proportion of females is the same for both exams)
H₁: p₁ > p₂ (the proportion of females taking Biology is higher than Calculus AB)

where p₁ = proportion of female Biology students and p₂ = proportion of female Calculus AB students.

# Q4. Run the appropriate two-proportion test.
#     (Hint: prop.test(c(84200, 102598), c(144790, 211693), alternative = "greater"))
bio_calc_test <- prop.test(c(84200, 102598), c(144790, 211693),
                           alternative = "greater")
bio_calc_test

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(84200, 102598) out of c(144790, 211693)
## X-squared = 3234.9, df = 1, p-value < 2.2e-16
## alternative hypothesis: greater
## 95 percent confidence interval:
##  0.09408942 1.00000000
## sample estimates:
##    prop 1    prop 2 
## 0.5815319 0.4846547

# Q5. What is the p-value? At α = 0.05, do you reject H₀?
bio_calc_test$p.value

## [1] 0

# p-value is essentially 0 (< 0.05), so we REJECT H0.

Q6. Write your conclusion in plain English (one or two sentences):

Since the p-value is far below 0.05, we reject the null hypothesis. There is strong evidence that the proportion of female students taking the AP Biology exam (about 58.2%) is higher than the proportion taking the AP Calculus AB exam (about 48.5%).
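
The heading calls this a z-test, while prop.test reports an X-squared statistic. The sketch below computes the pooled two-proportion z statistic by hand and checks that its square matches the X-squared from prop.test when the continuity correction is turned off (correct = FALSE). The variable names are illustrative, and the output above uses the default correction, so its X-squared differs slightly.

```r
# Counts from the problem statement.
x1 <- 84200;  n1 <- 144790   # female / total, AP Biology
x2 <- 102598; n2 <- 211693   # female / total, AP Calculus AB

p1 <- x1 / n1
p2 <- x2 / n2

# Under H0 the two proportions are equal, so pool them for the standard error.
p_pool <- (x1 + x2) / (n1 + n2)
se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# One-sided test of H1: p1 > p2.
z <- (p1 - p2) / se
p_value <- pnorm(z, lower.tail = FALSE)

# Without the continuity correction, X-squared equals z squared.
uncorrected <- prop.test(c(x1, x2), c(n1, n2),
                         alternative = "greater", correct = FALSE)
all.equal(z^2, unname(uncorrected$statistic))
```

With samples this large the continuity correction changes the statistic very little, so either version leads to the same decision.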

Problem 2 — Paired t-test

A vitamin K shot is given to infants soon after birth. Researchers want to see whether the way infants are handled during the shot can reduce the pain. They measured how long (in seconds) each infant cried after the shot. One group received the shot the conventional way; the other group received it while the mother held the infant.

Is there enough evidence to show that infants cried LESS on average when held by their mothers vs. the conventional method? Test at the 5% level.

Old <- c(63, 0, 2, 46, 33, 33, 29, 23, 11, 12, 48, 15, 33, 14, 51,
         37, 24, 70, 63, 0, 73, 39, 54, 52, 39, 34, 30, 55, 58, 18)

New <- c(0, 32, 20, 23, 14, 19, 60, 59, 64, 64, 72, 50, 44, 14, 10,
         58, 19, 41, 17, 5, 36, 73, 19, 46, 9, 43, 73, 27, 25, 18)

State your hypotheses:

H₀: μ_d = 0 (holding the infant makes no difference in crying time)
H₁: μ_d > 0 (infants cry less when held, so Old crying time is greater than New)

where d = Old − New (conventional minus held by mother) and μ_d is the mean of these differences.

# Q7. Run a paired t-test.
#     (Hint: t.test(Old, New, paired = TRUE))
cry_test <- t.test(Old, New, paired = TRUE, alternative = "greater")
cry_test

## 
##  Paired t-test
## 
## data:  Old and New
## t = 0.028519, df = 29, p-value = 0.4887
## alternative hypothesis: true mean difference is greater than 0
## 95 percent confidence interval:
##  -9.762971       Inf
## sample estimates:
## mean difference 
##       0.1666667

# Q8. What is the p-value? At α = 0.05, do you reject H₀?
cry_test$p.value

## [1] 0.4887216

# p-value > 0.05, so we FAIL TO REJECT H0.

Q9. Write your conclusion in plain English. Does the data support the claim that the new method reduces crying time?

The p-value is about 0.49, which is much larger than 0.05, so we fail to reject the null hypothesis. The mean crying times are almost identical (35.3 vs 35.1 seconds), so the data do not provide evidence that holding the infant reduces crying time.
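
The paired t statistic can also be reproduced by hand from the differences d = Old − New, which makes it clear where t and the p-value come from. This is a minimal sketch using the same data; the names d, n, t_stat, and p_value are illustrative.

```r
Old <- c(63, 0, 2, 46, 33, 33, 29, 23, 11, 12, 48, 15, 33, 14, 51,
         37, 24, 70, 63, 0, 73, 39, 54, 52, 39, 34, 30, 55, 58, 18)

New <- c(0, 32, 20, 23, 14, 19, 60, 59, 64, 64, 72, 50, 44, 14, 10,
         58, 19, 41, 17, 5, 36, 73, 19, 46, 9, 43, 73, 27, 25, 18)

# Paired differences: conventional minus held by mother.
d <- Old - New
n <- length(d)

# t = mean difference divided by its standard error, with n - 1 degrees of freedom.
t_stat <- mean(d) / (sd(d) / sqrt(n))

# One-sided p-value for H1: mu_d > 0.
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

c(mean_old = mean(Old), mean_new = mean(New), t = t_stat, p = p_value)
```

These values should agree with the t.test output above (t of about 0.0285 and a p-value of about 0.489).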


Dev Narang
