Assignment: Hypothesis Testing in R

Author

Raymond Peck

Problem 1: Student Sleep Analysis

The average nightly sleep time (in hours) of each student in a sample is given by the following data points:

6.5, 7.5, 9.0, 5.0, 6.5, 5.0, 6.0, 8.0, 4.5, 8.5, 6.5, 6.0

We will test the hypothesis:

  • Null hypothesis \(H_0\): Students get an average of 8 hours of sleep per night (\(\mu = 8\)).
  • Alternative hypothesis \(H_a\): Students get less than 8 hours of sleep per night (\(\mu < 8\)).

Significance level: \(\alpha = 0.05\)
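An equivalent way to apply the significance level is the critical-value rule: reject \(H_0\) when the t-statistic falls below the lower-tail cutoff of the t distribution. The following is a minimal sketch, assuming the 12 observations listed above. The helper name `rejects_lower_tail` is hypothetical and introduced only for illustration.

```r
# Critical-value version of the decision rule for a lower-tailed t-test.
# `rejects_lower_tail` is a hypothetical helper, not part of the assignment.
rejects_lower_tail <- function(t_stat, n, alpha = 0.05) {
  t_critical <- qt(alpha, df = n - 1)  # lower-tail cutoff of the t distribution
  t_stat < t_critical                  # TRUE means "reject H0"
}

# Cutoff for n = 12 observations at alpha = 0.05 (roughly -1.80)
qt(0.05, df = 12 - 1)
```

This rule and the p-value rule always agree: the t-statistic is below the cutoff exactly when the lower-tail p-value is below \(\alpha\).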

(done together in class)

# Step 1: Store the data in a vector
sleep_vector <- c(6.5, 7.5, 9.0, 5.0, 6.5, 5.0, 6.0, 8.0, 4.5, 8.5, 6.5, 6.0)


# Step 2: Calculate the sample mean and sample standard deviation
sample_mean <- mean(sleep_vector)

sample_sd <- sd(sleep_vector)

# Step 3: Define the hypothesized mean and sample size
mu_0 <- 8
sample_size <- length(sleep_vector)

# Step 4: Calculate the test statistic
# t = (sample mean - hypothesized mean) / standard error of the mean
test_statistic <- (sample_mean - mu_0) / (sample_sd / sqrt(sample_size))


# Step 5: Compute the p-value for a one-tailed (lower-tail) test
# pt() returns P(T <= t) for a t distribution with n - 1 degrees of freedom
p_value <- pt(test_statistic, df = sample_size - 1)


# Step 6: Make a conclusion based on p-value and significance level

# p_value is less than 0.05, so we reject the null hypothesis and conclude
# that the true mean sleep time is less than 8 hours per night
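
As a cross-check on the manual steps, the same test can be run with R's built-in `t.test()`. This is a sketch rather than part of the in-class solution. The names `sleep_test`, `manual_t`, and `manual_p` are introduced here for illustration.

```r
# Data copied from Step 1 so this block runs on its own
sleep_vector <- c(6.5, 7.5, 9.0, 5.0, 6.5, 5.0, 6.0, 8.0, 4.5, 8.5, 6.5, 6.0)

# One-sample, lower-tailed t-test of H0: mu = 8 against Ha: mu < 8
sleep_test <- t.test(sleep_vector, mu = 8, alternative = "less")

# Manual calculation, repeated from Steps 2-5
manual_t <- (mean(sleep_vector) - 8) / (sd(sleep_vector) / sqrt(length(sleep_vector)))
manual_p <- pt(manual_t, df = length(sleep_vector) - 1)

# Both comparisons should return TRUE
all.equal(unname(sleep_test$statistic), manual_t)
all.equal(sleep_test$p.value, manual_p)
```

Printing `sleep_test` also reports the degrees of freedom and a one-sided confidence interval for the mean.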

Problem 2

The average daily caffeine consumption (in mg) of a group of college students is recorded as follows:

400, 250, 0, 300, 0, 600, 250, 200, 0, 275

Perform a hypothesis test to determine if the students consume more than 260 mg of caffeine on average.

  • State the null and alternative hypotheses.
  • Use \(\alpha = 0.05\) as the significance level.
  • Compute the test statistic, p-value, and make a conclusion.

Use the provided data and fill in the steps based on what you’ve learned from the first scenario.

# State the null and alternative hypotheses
# Null hypothesis H_0: mu = 260 mg of caffeine
# Alternative hypothesis H_a: mu > 260 mg of caffeine

# Step 1: Store the data in a vector
student_caffeine <- c(400, 250, 0, 300, 0, 600, 250, 200, 0, 275)


# Step 2: Calculate the sample mean and sample standard deviation
caffeine_mean <- mean(student_caffeine)

caffeine_sd <- sd(student_caffeine)

# Step 3: Define the hypothesized mean and sample size
mu_0_caffeine <- 260
caffeine_size <- length(student_caffeine)

# Step 4: Calculate the test statistic
# t = (sample mean - hypothesized mean) / standard error of the mean
caffeine_tstat <- (caffeine_mean - mu_0_caffeine) / (caffeine_sd / sqrt(caffeine_size))


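# Step 5: Compute the p-value for a one-tailed (upper-tail) test
# lower.tail = FALSE returns P(T > t), which equals 1 - pt(t, df)
caffeine_p_value <- pt(caffeine_tstat, df = caffeine_size - 1, lower.tail = FALSE)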


# Step 6: Make a conclusion based on p-value and significance level

# caffeine_p_value is greater than 0.05, so we fail to reject the null hypothesis:
# there is not enough evidence to conclude that students consume more than
# 260 mg of caffeine per day on average
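
The upper-tailed test can be checked the same way with `t.test()`, this time using `alternative = "greater"`. This is a sketch for verification only. The name `caffeine_test` is introduced here and is not part of the assignment template.

```r
# Data copied from Step 1 so this block runs on its own
student_caffeine <- c(400, 250, 0, 300, 0, 600, 250, 200, 0, 275)

# One-sample, upper-tailed t-test of H0: mu = 260 against Ha: mu > 260
caffeine_test <- t.test(student_caffeine, mu = 260, alternative = "greater")

caffeine_test$statistic  # t-statistic; negative because the sample mean (227.5) is below 260
caffeine_test$p.value    # upper-tail p-value, to be compared with alpha = 0.05
```

Because the sample mean lies below the hypothesized value, the upper-tail p-value is larger than 0.5, so the data give no support to the alternative.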

Problem 3

A two-sample t-test is used to test the hypothesis that the means of two groups are equal. If the two groups are of equal size \(n\), the test statistic can be found using the formula:

\[ t = \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\frac{2}{n}}} \]

where \(s_p\) is the pooled standard deviation:

\[ s_p = \sqrt{\frac{(n - 1)s_A^2 + (n - 1)s_B^2}{2n - 2}} \]

Here \(\bar{x}_A\) and \(\bar{x}_B\) are the sample means for groups A and B respectively, and \(s_A\) and \(s_B\) are the sample standard deviations of the respective groups.
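
As a side note, when both groups have the same size \(n\), the pooled standard deviation simplifies. The step below is plain algebra on the formulas above, not something stated in the assignment:

\[ s_p = \sqrt{\frac{(n - 1)(s_A^2 + s_B^2)}{2(n - 1)}} = \sqrt{\frac{s_A^2 + s_B^2}{2}}, \qquad t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2 + s_B^2}{n}}} \]

Under \(H_0\), and assuming the two populations have equal variances, this statistic follows a t distribution with \(2n - 2\) degrees of freedom, which is the denominator of the pooled variance.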

The table below provides the test scores of students in two groups:

Group A    Group B
85         88
87         84
83         89
86         87
84         85

For a two-sample t-test, the null and alternative hypotheses are:

  • Null hypothesis \(H_0\): The mean scores of the two groups are equal (\(\mu_A = \mu_B\)).
  • Alternative hypothesis \(H_a\): The mean scores of the two groups are not equal (\(\mu_A \neq \mu_B\)).

We will use the significance level \(\alpha = 0.05\).

Perform a two-sample balanced t-test to compare the means of the two groups. Compute the p-value for a two-tailed test, then compare the p-value with \(\alpha\) and make a conclusion.

# Step 1: Store the data in a vector
group_a <- c(85, 87, 83, 86, 84)

group_b <- c(88, 84, 89, 87, 85)


# Step 2: Calculate the sample means and sample standard deviations
x_a <- mean(group_a)

x_b <- mean(group_b)

s_a <- sd(group_a)

s_b <- sd(group_b)

# Pooled standard deviation, using the group sizes instead of a hard-coded 5
s_p <- sqrt(((length(group_a) - 1) * s_a^2 + (length(group_b) - 1) * s_b^2) /
              (length(group_a) + length(group_b) - 2))

# Step 3: Define the hypothesized difference and the sample size
# Under H_0 the means are equal, so the hypothesized difference mu_A - mu_B is 0
n <- length(group_a)  # common size of the two groups

# Step 4: Calculate the test statistic
# t = (difference in sample means) / (s_p * sqrt(2 / n))
ab_tstat <- (x_a - x_b) / (s_p * sqrt(2 / n))


# Step 5: Compute the p-value for a two-tailed test
# The pooled two-sample t-test has 2n - 2 degrees of freedom, not n - 1
p_ab <- 2 * (1 - pt(abs(ab_tstat), df = 2 * n - 2))


# Step 6: Make a conclusion based on p-value and significance level

# p_ab is greater than 0.05, so we fail to reject the null hypothesis:
# there is not enough evidence to conclude that the mean scores of
# group A and group B differ
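
The pooled two-sample test can also be checked against `t.test()` with `var.equal = TRUE`, which uses the same pooled standard deviation and \(2n - 2\) degrees of freedom. This is a verification sketch. The name `ab_test` is introduced here for illustration.

```r
# Data copied from Step 1 so this block runs on its own
group_a <- c(85, 87, 83, 86, 84)
group_b <- c(88, 84, 89, 87, 85)

# Two-sided pooled (equal-variance) two-sample t-test of H0: mu_A = mu_B
ab_test <- t.test(group_a, group_b, var.equal = TRUE, alternative = "two.sided")

ab_test$parameter  # degrees of freedom: 2n - 2 = 8 for two groups of 5
ab_test$statistic  # pooled t-statistic
ab_test$p.value    # two-tailed p-value, to be compared with alpha = 0.05
```

Note that `t.test()` defaults to `var.equal = FALSE` (Welch's test), which does not pool the variances and generally gives different degrees of freedom, so the argument has to be set explicitly to match the formula used in this problem.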