Suppose \(Y_1,...,Y_n \stackrel{i.i.d.}{\sim} N(\mu,\sigma^2 = 36)\). We want to test:
\[H_0: \mu = 3\]\[H_a: \mu > 3\]
We will reject \(H_0\) using the test statistic:
\[T = \frac{\bar Y -3}{\sigma/\sqrt{n}}\]
and will reject when \(T > c\).
A. Suppose we want to set \(c\) such that \(\alpha = 0.05\). Find \(c\).
B. Find an expression for the power as a function of \(n\) and the “effect size”, \(d=(\mu_a - \mu_0)/\sigma\).
C. Consider an effect size of \(d=1\). Determine the sample size necessary to achieve a power of at least 0.8.
D. Plot the analytic power function as a function of the effect size \(d\) for \(n = 20\).
library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.5.2
Warning: package 'ggplot2' was built under R version 4.5.2
Warning: package 'readr' was built under R version 4.5.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.2.0
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 4.0.1 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.1.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(purrrfect)
Attaching package: 'purrrfect'
The following objects are masked from 'package:base':
replicate, tabulate
ggplot()+geom_function(fun = \(x) 1-pnorm(1.645-sqrt(20)*x))+labs(x='effect size', y ='rejection rate')+geom_hline(aes(yintercept =0.05), linetype=2)
E. Plot the analytic power function as a function of \(n\) if \(d = 1\).
ggplot()+geom_function(fun = \(n) 1-pnorm(1.645-sqrt(n)*1))+labs(x='sample size', y ='power')+xlim(c(1,20))
F. Although not necessary, use a simulation study to approximate the values of \(c\) for each \(n\in \{5,10,...,50\}\). Compare the simulated values of \(c\) to their analytic counterparts.
# sim dist of test stat under null where mu=3mu0 <-3(null_study <-parameters(~n, seq(5,50, by =5) ) %>%add_trials(10000) %>%mutate(y_sample =map(n, .f = \(n) rnorm(n, mean = mu0, sd =6)),T =pmap_dbl(list(n,y_sample), \(n,y) ((mean(y)- mu0) / (6/sqrt(n)))) )) %>%head()
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.
G. Simulate the rejection rates of this test using the analytic rejection regions for \(\mu\in\{3,3.5,4,4.5\}\). Plot the rejection rates as a function of \(n\) faceted by the effect size \(d\).
(various_test_study <-parameters(~n, ~mu_true,seq(5,50, by =5), c(3,3.5,4,4.5) ) %>%add_trials(10000) %>%mutate(y_sample =pmap(list(mu_true,n), .f = \(m,n) rnorm(n, mean = m, sd =6)),T =pmap_dbl(list(n,y_sample), \(n,y) ((mean(y)-3) / (6/sqrt(n)))) ) %>%inner_join(., crit_values, by ='n') %>%mutate( reject =ifelse(T > c, 1,0))) %>%head()
ggplot(data = rejection_rates)+geom_line(aes(x=n, y = prob_reject))+facet_wrap(~mu_true)+geom_hline(aes(yintercept=0.05), linetype =2)
Problem 1
Suppose \(Y_1,...,Y_n\) are an i.i.d. \(UNIF(0,\theta)\) sample. Consider two different tests of \(H_0: \theta = 1\) versus \(H_a: \theta > 1\).
Test 1 rejects \(H_0\) if \(Y_{(n)}> c_1\).
Test 2 rejects \(H_0\) if \(\bar Y> c_2\).
We want both tests to have 5% type-I error rate.
A. Show that \(c_1\) can be found analytically to be \(0.95^{1/n}\).
Find a general expression for the power function of Test 1 as a function of \(n\) and \(\theta_a\). Plot curve as a function of \(n\) for \(\theta = 1.2.\)
# Plotting the analytic power curve as a function of n for theta = 1.2ggplot() +geom_function(fun = \(n) 1- (0.95/ (1.2^n)), color ="blue") +xlim(c(1, 50)) +labs(x ="Sample Size (n)",y ="Power", )+theme_classic(base_size =14)
C. Explain why an exact value of \(c_2\) cannot be found analytically and must be simulated.
We cannot find c2 analytically because the exact sampling distribution of the sample mean for uniform variables is very complex.
D. Simulate values of \(c_2\) that yield 5% type-I error rates for Test 2, for \(n \in \{5,10,...,50\}\).
theta_0 <-1(c2_sim_data <-parameters(~n, seq(5, 50, by =5)) %>%add_trials(10000) %>%mutate(Ysample =map(n, .f = \(n) runif(n, min =0, max = theta_0)),T2 =map_dbl(Ysample, mean) ) %>%summarize(c2_simulated =quantile(T2, 0.95),.by = n )) %>%head()
E. Use a simulation study to compare the sizes and the power curves of the two tests as a function of \(n\) for \(\theta_{true} \in \{1,1.1,1.2,1.3\}.\) Use analytic rejection regions for \(Y_{(n)}\) and simulated rejection regions for \(\bar Y\), but simulate the rejection probabilities for both. Plot the simulated Type-I error rate/power as a function of \(n\) for each \(\theta_{true}\). Discuss the properties of these tests: does one seem preferable?
ggplot(data = simulated_power) +geom_line(aes(x = n, y = power_T1, color ="Test 1 (Max)")) +geom_line(aes(x = n, y = power_T2, color ="Test 2 (Mean)")) +geom_hline(yintercept =0.05, linetype =2) +facet_wrap(~true_theta, labeller =label_bquote(theta == .(true_theta))) +labs(x ="Sample Size",y ="Simulated Rejection Probability",color ="Test Statistic" ) +theme_classic(base_size =14)
We can see both tests fall on the 5% rejection line when theta = 1, regardless of sample size, which makes sense, it is our null hypothesis value. For all other thetas (1.1, 1.2, and 1.3), we can see that the Test 1, the maximum order stat does much better than test 2, the mean. i.e. Test 1 goes to 100% power much quicker when increasing sample size, regardless of the theta value. Also, the larger the value of theta, the quicker the power goes to 100%.