Lab 03

Author: Moisieiev Vasyl

Affiliation: Kyiv School of Economics

Part 1: IMS Exercises

Exercise 1

  1. Proportion - because it deals with the percentage of a population exhibiting a certain characteristic.

  2. Mean - because it deals with changes in average revenue.

  3. Proportion - because it deals with the percentage of people.

  4. Proportion - because it deals with the percentage of people.

  5. Mean - because it deals with the average number of times a user used the service.

Exercise 2

  1. Null Hypothesis: The average calorie intake is the same whether or not calories are displayed. \(H_0: \mu = 1100\)
    Alternative Hypothesis: The average calorie intake differs when calories are displayed. \(H_1: \mu \neq 1100\)

  2. Null Hypothesis: The average GRE Verbal Reasoning score in 2021 is the same as in 2004-2007. \(H_0: \mu = 462\)
    Alternative Hypothesis: The average GRE Verbal Reasoning score in 2021 differs from that of 2004-2007. \(H_1: \mu \neq 462\)
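
Both pairs of hypotheses concern a single mean, so one possible way to quantify the evidence against \(H_0\) is a one-sample t statistic. This is only a sketch: the symbols \(\bar{x}\), \(s\), and \(n\) (sample mean, sample standard deviation, sample size) are assumptions, since the answers above do not report them.

```latex
T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad \mu_0 = 1100 \ \text{(calories)}
\quad \text{or} \quad \mu_0 = 462 \ \text{(GRE Verbal Reasoning)}
```

Under \(H_0\), values of \(T\) far from zero in either direction count as evidence for the two-sided alternative.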

Part 2: Paul the Octopus

Exercise 1

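check_paul_criterion <- function(n, k, alpha) {
  # Two-sided exact binomial test of H0: p = 0.5, given k correct out of n.
  # Double the smaller tail probability and cap the result at 1.
  p_value <- min(1, 2 * min(
    pbinom(k, n, 0.5),          # lower tail: P(X <= k)
    1 - pbinom(k - 1, n, 0.5)   # upper tail: P(X >= k)
  ))

  is_rejected <- p_value < alpha

  list(p_value = p_value, is_rejected = is_rejected)
}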

check_paul_criterion(14, 12, 0.05)
$p_value
[1] 0.01293945

$is_rejected
[1] TRUE
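
As a sanity check, the doubled-tail p-value can be compared with base R's `binom.test()`. For a null probability of 0.5 the binomial distribution is symmetric, so the two should agree. The variable names `manual_p` and `builtin_p` are illustrative and not part of the lab.

```r
# Cross-check the hand-computed two-sided p-value against stats::binom.test().
n <- 14
k <- 12

# Doubled smaller tail, as in check_paul_criterion()
manual_p <- 2 * min(pbinom(k, n, 0.5), 1 - pbinom(k - 1, n, 0.5))

# Exact two-sided binomial test from base R
builtin_p <- binom.test(k, n, p = 0.5, alternative = "two.sided")$p.value

print(c(manual = manual_p, builtin = builtin_p))
print(isTRUE(all.equal(manual_p, builtin_p)))
```

The agreement relies on the symmetry at p = 0.5; for other null values `binom.test()` uses a different two-sided rule, so the two numbers may differ.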

Exercise 2

The p-value (about 0.013) is below 0.05, so we reject the null hypothesis that Paul guesses at random with a 50% success rate. He got 12 of 14 predictions right, well above the 7 expected under random guessing, so the evidence points to Paul predicting outcomes with more than 50% accuracy.
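
The p-value behind this conclusion can be worked out by hand. With \(X \sim \mathrm{Binomial}(14, 0.5)\) under the null hypothesis (the symbol \(X\) for the number of correct predictions is introduced here for illustration):

```latex
P(X \ge 12) = \frac{\binom{14}{12} + \binom{14}{13} + \binom{14}{14}}{2^{14}}
            = \frac{91 + 14 + 1}{16384}
            = \frac{106}{16384} \approx 0.00647,
\qquad
p\text{-value} = 2 \cdot P(X \ge 12) \approx 0.01294 < 0.05
```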

Exercise 3

library(tibble)
library(ggplot2)

n <- 14
k <- 12
alpha <- 0.05

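# Smallest c with P(X <= c) >= 1 - alpha under H0: p = 0.5
critical_value <- qbinom(1 - alpha, n, 0.5)

x <- 0:n
prob <- dbinom(x, n, 0.5)

# Reject only for counts strictly above c, so that P(X > c) <= alpha
crit_reg <- x > critical_value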

data <- tibble(x = x, prob = prob, crit_reg = crit_reg)

ggplot(data, aes(x = x, y = prob)) +
  geom_col(aes(fill = as.factor(crit_reg)), alpha = 0.6) +
  scale_fill_manual(values = c("TRUE" = "red", "FALSE" = "blue")) +
  labs(title = "Critical Region of the Paul the Octopus Test",
       x = "Number of Correct Predictions",
       y = "Probability") +
  theme_minimal()
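
A quick way to sanity-check a shaded region is to compute its probability under the null hypothesis and compare it with alpha. The sketch below does this for a few candidate upper-tail thresholds; the names `thresholds`, `tail_prob`, and `sizes` are illustrative and not part of the lab.

```r
# Size of the upper-tail region {X >= c} under H0: p = 0.5, for several c.
n <- 14
alpha <- 0.05
thresholds <- 10:12

# P(X >= c) = P(X > c - 1)
tail_prob <- pbinom(thresholds - 1, n, 0.5, lower.tail = FALSE)

sizes <- data.frame(
  threshold = thresholds,
  size = tail_prob,
  within_alpha = tail_prob <= alpha
)
print(sizes)
```

A region is a valid level-alpha critical region only if its size does not exceed alpha, which is what the `within_alpha` column reports.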

Exercise 4

library(tidyverse)

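two_sided_criterion_nonsym <- function(n, mu, alpha) {
  # Rejection thresholds for H0: p = mu; reject when x <= c1 or x >= c2.

  # Upper threshold: one above the (1 - alpha / 2) quantile
  c2 <- qbinom(1 - alpha / 2, size = n, prob = mu) + 1

  # Lower threshold: one below the alpha / 2 quantile
  c1 <- qbinom(alpha / 2, size = n, prob = mu) - 1

  c(c1, c2)
}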

mu_grid <- seq(0, 1, by = 0.001)

# Keep every mu_h0 for which the observed count k = 12 is not rejected
mu_no_rejection <- tibble(mu_h0 = mu_grid) %>%
  rowwise() %>%
  filter({
    crit <- two_sided_criterion_nonsym(14, mu_h0, alpha = 0.05)
    k > crit[1] && k < crit[2]
  }) %>%
  pull(mu_h0)

cat(sprintf("95%% confidence interval: %.3f -- %.3f", min(mu_no_rejection), max(mu_no_rejection)))
95% confidence interval: 0.572 -- 0.982
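
The grid search amounts to inverting an exact two-sided binomial test, which is the same idea behind the Clopper-Pearson interval that `binom.test()` reports. As a cross-check for a given count of successes, here is a sketch for 12 successes out of 14, Paul's record from Exercise 1. The helper name `exact_ci` is hypothetical.

```r
# Exact (Clopper-Pearson) interval for x successes out of n trials,
# taken from the conf.int component of stats::binom.test().
exact_ci <- function(x, n, conf_level = 0.95) {
  as.numeric(binom.test(x, n, conf.level = conf_level)$conf.int)
}

ci <- exact_ci(12, 14)
cat(sprintf("Exact 95%% interval for 12 of 14: %.3f -- %.3f\n", ci[1], ci[2]))
```

The grid moves in steps of 0.001, so its endpoints can differ from the exact interval by up to about one grid step.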