Type 1 & Type 2 error

The null hypothesis, H0, is the accepted wisdom. We propose an alternative, HA, which says that H0 is not true.
If you reject H0 when it is actually true, that is a Type 1 error (a false alarm). If you fail to reject H0 when it is actually false, that is a Type 2 error.

There is a trade-off between the two. We need to set a threshold for the acceptable risk of a Type 1 error. This threshold is called the significance level, alpha: the maximum acceptable risk of rejecting a correct H0. The decision to reject or not reject is based on the significance level. Note: you can never say that H0 is correct. We can only reject or not reject H0, and failing to reject it is not the same as accepting it. We don’t know that H0 is right; we just don’t have the evidence to say it is wrong.
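To make the meaning of alpha concrete, here is a minimal simulation sketch. It draws many samples from a population where H0 really is true and counts how often a two-sided z test rejects anyway. The values of `mu0`, `sigma`, `n` and `n_sims` are illustrative assumptions, not numbers from these notes.

```r
# Simulate samples under a TRUE H0 and count false alarms (Type 1 errors).
# mu0, sigma, n and n_sims are made-up illustrative values.
set.seed(1)
alpha  <- 0.05
mu0    <- 100
sigma  <- 15
n      <- 50
n_sims <- 10000

# Two-sided critical value for the chosen significance level
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)

# Each replicate: draw a sample under H0 and compute its z statistic
z_stats <- replicate(n_sims, {
  x <- rnorm(n, mean = mu0, sd = sigma)
  (mean(x) - mu0) / (sigma / sqrt(n))
})

# Share of replicates that landed in the rejection zone
type1_rate <- mean(abs(z_stats) > z_crit)
print(paste("share of false alarms:", round(type1_rate, 3)))
```

The share of false alarms should land close to alpha, which is what "acceptable risk of a Type 1 error" means in practice.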

Hypothesis Testing

Goal of Hypothesis Testing

Method of Hypothesis Testing

Assuming H0 is true, deduce what the sample result should look like. Two options:

The rejection zone is an interval within the sampling distribution. If the sample result falls within the rejection zone, we reject H0. Depending on how you specify H0 and HA, the test is two-tailed or one-tailed (left-tailed or right-tailed).
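As a small sketch of how the rejection zone depends on the kind of test, the cutoffs on the standard normal scale can be computed with `qnorm`. The significance level of 0.05 is assumed here for illustration.

```r
alpha <- 0.05

# Two-tailed: alpha is split in half, one cutoff on each side
two_tailed <- qnorm(c(alpha / 2, 1 - alpha / 2))

# Left-tailed: all of alpha sits in the lower tail
left_tailed <- qnorm(alpha)

# Right-tailed: all of alpha sits in the upper tail
right_tailed <- qnorm(1 - alpha)

print(round(two_tailed, 3))    # reject if z is below the first or above the second
print(round(left_tailed, 3))   # reject if z is below this
print(round(right_tailed, 3))  # reject if z is above this
```

Because a one-tailed test puts the whole 5% on one side, its cutoff is closer to the center than the two-tailed cutoffs.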

Two approaches: the critical value approach and the p value approach.

Example: Glow Toothpaste

The machines are designed to fill tubes of toothpaste with a mean weight of 6 ounces. The population standard deviation is sigma = 0.2 (estimated from previous studies). We take a sample of 40 tubes of toothpaste for quality control. If the sample results show that we are overfilling or underfilling, we need to stop the production line and adjust the machines.

Develop a hypothesis test. The management has decided to set the significance level at 0.05.

H0: mean = mu = 6 = everything is fine

HA: mean = mu != 6 = we are overfilling or underfilling

We’ve accepted a significance level = maximum tolerance of Type 1 error = rejecting H0 when H0 is true = stopping production when everything is fine = 0.05. In other words, when everything is fine, about 5 out of every 100 times you run this test you will get a false alarm.

Suppose that the sample of 40 gives a sample mean of 6.07. Should the line be stopped?

What is the sampling distribution? It is the sampling distribution of the sample mean, assuming that H0 is correct. Since n = 40 > 30, we can assume a roughly normal distribution with mean = 6 and std deviation = sigma/sqrt(n) = 0.2/sqrt(40).

a: critical value approach: our sample mean is 6.07. Is the difference of 0.07 big enough that we can reject H0? Or is it small enough that we fail to reject H0?

Let’s solve it with the critical value approach. This is a two-tailed test because the tubes could be too full or not full enough. Spread the risk over both sides by splitting our 5% in half: 0.025 on the left and 0.025 on the right. What is the cutoff for the rejection zone? Calculate the z value.

z_critical = qnorm(0.025, lower.tail=FALSE)

z_sample = (6.07 - 6)/ (0.2/sqrt(40))

print(paste("sample z value is", round(z_sample, digits=2), "at a 0.05 level of significance the critical z value is", round(z_critical, digits=2)))
## [1] "sample z value is 2.21 at a 0.05 level of significance the critical z value is 1.96"
print("given a significance level of 0.05, we have exceeded our critical value so we reject H0 = there is a problem = we need to stop the production line")
## [1] "given a significance level of 0.05, we have exceeded our critical value so we reject H0 = there is a problem = we need to stop the production line"

If our sample z value was over 1.96 or under -1.96, we would be in the rejection zone.
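The same rejection zone can also be expressed in ounces rather than z units, which may be easier to act on at the production line. This is a sketch using the numbers from this example; the variable names are my own.

```r
mu0   <- 6      # mean weight under H0
sigma <- 0.2    # population standard deviation
n     <- 40
alpha <- 0.05

# Standard deviation of the sample mean
se <- sigma / sqrt(n)

# Two-tailed critical z value, converted back to the ounce scale
z_crit       <- qnorm(alpha / 2, lower.tail = FALSE)
lower_cutoff <- mu0 - z_crit * se
upper_cutoff <- mu0 + z_crit * se

# The observed sample mean is in the rejection zone if it is outside the cutoffs
xbar   <- 6.07
reject <- xbar < lower_cutoff || xbar > upper_cutoff

print(paste("reject H0 if the sample mean is below", round(lower_cutoff, 3),
            "or above", round(upper_cutoff, 3)))
print(paste("sample mean of", xbar, "is in the rejection zone:", reject))
```

Comparing 6.07 with these cutoffs is the same decision as comparing the sample z value with the critical z value.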

b: p value approach: instead of finding the rejection zone (the area where our sample mean is very unlikely to fall) and checking whether 6.07 is within it, calculate the p value and compare it to the significance level.

The p value answers the question: if H0 is correct, how likely is it that we would observe a result at least this extreme?

p = 2*pnorm(abs(z_sample), lower.tail=FALSE)
# we multiplied by 2 because it is a 2-sided test
print(paste("p value is", round(p, digits=3)))
## [1] "p value is 0.027"
print(paste("our p value < our significance level of 0.05 so we reject H0 and the production line needs to be stopped"))
## [1] "our p value < our significance level of 0.05 so we reject H0 and the production line needs to be stopped"
print(paste("if H0 is true, there is only a 0.027 probability of seeing a sample mean this far from 6"))
## [1] "if H0 is true, there is only a 0.027 probability of seeing a sample mean this far from 6"
print("since that is less than our accepted risk of 5%, stop the line")
## [1] "since that is less than our accepted risk of 5%, stop the line"

a shortcut

library(TeachingDemos)
# sample mean = 6.07, H0 mean is 6.0, population std deviation = 0.2, n=40
# since this is a 2-sided test, we can accept the default 2-sided alternative
z.test(6.07, mu=6, sd=0.2, n=40)
## 
##  One Sample z-test
## 
## data:  6.07
## z = 2.2136, n = 40.000000, Std. Dev. = 0.200000, Std. Dev. of the
## sample mean = 0.031623, p-value = 0.02686
## alternative hypothesis: true mean is not equal to 6
## 95 percent confidence interval:
##  6.00802 6.13198
## sample estimates:
## mean of 6.07 
##         6.07

Note that the p-value = 0.02686 is the probability of seeing a sample mean at least this far from 6 if H0 is true. It is less than our accepted risk of a false alarm (0.05), so stop the production line.

Example 2: Glow Toothpaste Overfilling

What if we are only concerned with overfilling?

H0: mu = mean = 6

HA: mu = mean > 6

this is an upper-tail test– we are only concerned with overfilling

we do not split the risk– all 0.05 risk is on the right side of the curve

method 1: critical value approach

critical_z = round(qnorm(0.05, lower.tail = FALSE), digits=3)
z = round((6.07 - 6) / (0.2/sqrt(40)), digits=3)
print(paste("Our sample z is", z, "and our critical value is", critical_z))
## [1] "Our sample z is 2.214 and our critical value is 1.645"
print("given our significance level of 5%, we reject H0 because our sample z exceeds our critical value z")
## [1] "given our significance level of 5%, we reject H0 because our sample z exceeds our critical value z"

method 2: p-value approach

Remember: if the true mean is 6, what is the probability that we will have a sample mean as large as or larger than this value? Again, we are only looking at the right side.

z = (6.07-6)/(0.2/sqrt(40))
p=pnorm(z, lower.tail=F)
print(paste('p value=', round(p, digits=4)))
## [1] "p value= 0.0134"
print(paste("since our p value", round(p, digits=4), "is less than our significance level of 5%, we reject H0 and should stop the production line" ))
## [1] "since our p value 0.0134 is less than our significance level of 5%, we reject H0 and should stop the production line"
z.test(6.07, 6, sd=0.2, n=40, alternative = "greater")
## 
##  One Sample z-test
## 
## data:  6.07
## z = 2.2136, n = 40.000000, Std. Dev. = 0.200000, Std. Dev. of the
## sample mean = 0.031623, p-value = 0.01343
## alternative hypothesis: true mean is greater than 6
## 95 percent confidence interval:
##  6.017985      Inf
## sample estimates:
## mean of 6.07 
##         6.07

What if we don’t know the population standard deviation?

We can no longer use the normal curve. We use the t distribution instead, with the sample standard deviation s in place of sigma, which means we need one more parameter: degrees of freedom = n - 1.
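A quick sketch of why the degrees of freedom matter: the two-sided 5% critical value from the t distribution is larger than the normal one for small samples and approaches it as the degrees of freedom grow. The list of degrees of freedom below is an arbitrary illustrative choice.

```r
alpha <- 0.05
dfs   <- c(5, 10, 29, 100, 1000)   # illustrative degrees of freedom

# Upper two-sided critical values from the t and normal distributions
t_crit <- qt(1 - alpha / 2, df = dfs)
z_crit <- qnorm(1 - alpha / 2)

print(data.frame(df = dfs,
                 t_crit = round(t_crit, 3),
                 z_crit = round(z_crit, 3)))
```

With few degrees of freedom the t distribution has heavier tails, so a sample result has to be further from the H0 mean before we reject.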

Example 3: US Household Spending

The average US household spends $90 per day. In Corning, NY, a sample of 30 households has sample mean = 84.50 and sample std deviation = 14.50.

Test hypothesis:

H0: mean = 90

HA: mean != 90

This is a 2-sided test at the standard significance level of 5%.

method 1: critical value approach

t = round((84.50 - 90) / (14.50/sqrt(30)), digits=3)
tcrit = round(qt(0.025,29), digits=3)
print(paste("sample t value is", t, ", critical t value is", tcrit))
## [1] "sample t value is -2.078 , critical t value is -2.045"
print("our t value is lower than our lower critical t value, so it is in the rejection zone")
## [1] "our t value is lower than our lower critical t value, so it is in the rejection zone"
print("Reject H0 at a 5% level of significance.  The population mean in Corning NY differs from the US mean at a 5% level of significance")
## [1] "Reject H0 at a 5% level of significance.  The population mean in Corning NY differs from the US mean at a 5% level of significance"

method 2: p value approach

# the two-sided p value is twice the tail area beyond |t|
# using -abs(t) keeps this correct whether t is negative or positive

t = (84.50 - 90)/(14.50/sqrt(30))
p = 2 * pt(-abs(t), 29) # multiply by 2 because it's 2 sided
# |t| exceeds the critical value of 2.045, so p is below 0.05

print("reject H0 at a 5% level of significance. The population mean in Corning NY differs from the US mean at a 5% level of significance")
## [1] "reject H0 at a 5% level of significance. The population mean in Corning NY differs from the US mean at a 5% level of significance"
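As a cross-check, the two methods should always agree: the sample t value is beyond the critical value exactly when the p value is below the significance level. A minimal sketch with the Corning numbers (the variable names are my own):

```r
xbar  <- 84.50
mu0   <- 90
s     <- 14.50
n     <- 30
alpha <- 0.05
df    <- n - 1

t_stat <- (xbar - mu0) / (s / sqrt(n))

# Method 1: compare |t| with the upper two-sided critical value
t_crit             <- qt(1 - alpha / 2, df)
reject_by_critical <- abs(t_stat) > t_crit

# Method 2: compare the two-sided p value with alpha
p_value     <- 2 * pt(-abs(t_stat), df)
reject_by_p <- p_value < alpha

print(paste("two-sided p value:", round(p_value, 4)))
print(paste("both methods agree:", reject_by_critical == reject_by_p))
```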

Example 4: Coca-Cola

Coca-Cola reported mean per capita annual sales in the US of 423 8-ounce servings. Question: is the consumption of Coca-Cola higher in Atlanta, the company’s corporate headquarters? A sample of 36 people showed a mean of 460.4 8-ounce servings with a sample standard deviation of s = 101.9. Using 5% significance, do the sample results support the conclusion that the mean in Atlanta is higher?

hypothesis test:

H0: mean <= 423

HA: mean > 423

we don’t know the population standard deviation, so use the t distribution

upper tail test

method 1: critical value method

xbar = 460.4
mu0 = 423 # hypothesized mean (named mu0 so it does not mask R's mean function)
n = 36
dof = n-1
s = 101.9

t = round((xbar - mu0) / (s/sqrt(n)), digits=3)
tcrit = round(qt(0.95,dof), digits=3)
# could also say qt(0.05, dof, lower.tail=F)
print(paste("the sample t value is", t, "which is larger than the critical t value of", tcrit))
## [1] "the sample t value is 2.202 which is larger than the critical t value of 1.69"
print(paste("reject H0 at 5% significance level = accept HA"))
## [1] "reject H0 at 5% significance level = accept HA"
print(paste("At the 5% significance level, the sample results support the conclusion that Atlanta residents have a higher mean consumption of Coca-Cola beverages"))
## [1] "At the 5% significance level, the sample results support the conclusion that Atlanta residents have a higher mean consumption of Coca-Cola beverages"

method 2: p value approach

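xbar = 460.4
mu0 = 423 # hypothesized mean (named mu0 so it does not mask R's mean function)
n = 36
dof = n-1
s = 101.9
t = (xbar - mu0)/(s/sqrt(n))
p = pt(t, dof, lower.tail=F) # upper tail only because HA is mean > 423
print(paste("our p value of", p, "is less than our significance level of 5% so we reject H0"))
## [1] "our p value of 0.0171673742785304 is less than our significance level of 5% so we reject H0"
print("At the 5% significance level, the sample results support the conclusion that Atlanta residents have a higher mean consumption of Coca-Cola beverages")
## [1] "At the 5% significance level, the sample results support the conclusion that Atlanta residents have a higher mean consumption of Coca-Cola beverages"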

steps for test statistics:

  1. determine null and alt hypotheses: if an initial assumption is being made, H0 is that assumption; if no initial assumption is given, then what you want to prove is HA

  2. what kind of test? two-sided, upper or lower?

  3. do you have the population std deviation? this tells you whether to calculate z values or t values

  4. compare the sample z value to the critical z, or the sample t value to the critical t

steps for p value

  1. determine hypotheses

  2. what kind of test? two-sided, upper or lower?

  3. t or z

  4. determine your p value and compare it to the significance level
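The steps for the p value approach can be collected into one small function. This is a sketch only: `mean_test` and its arguments are hypothetical names of my own, not part of base R or TeachingDemos. It works from summary statistics, uses the normal distribution when the population std deviation is known and the t distribution otherwise.

```r
# Hypothetical helper: one-sample test of a mean from summary statistics.
# sd is the population std deviation if sigma_known is TRUE,
# otherwise the sample std deviation.
mean_test <- function(xbar, mu0, sd, n, sigma_known,
                      alternative = c("two.sided", "greater", "less"),
                      alpha = 0.05) {
  alternative <- match.arg(alternative)

  # Test statistic: z if sigma is known, t otherwise (same formula)
  stat <- (xbar - mu0) / (sd / sqrt(n))

  # Reference distribution: normal for z, t with n - 1 df for t
  cdf <- if (sigma_known) pnorm else function(q, ...) pt(q, df = n - 1, ...)

  # Tail area depends on the kind of test
  p_value <- switch(alternative,
    two.sided = 2 * cdf(-abs(stat)),
    greater   = cdf(stat, lower.tail = FALSE),
    less      = cdf(stat))

  list(statistic = stat, p_value = p_value, reject = p_value < alpha)
}

# Glow Toothpaste: sigma known, two-sided
toothpaste <- mean_test(6.07, 6, sd = 0.2, n = 40, sigma_known = TRUE)

# Coca-Cola: sigma unknown, upper-tail
coke <- mean_test(460.4, 423, sd = 101.9, n = 36, sigma_known = FALSE,
                  alternative = "greater")

print(round(toothpaste$p_value, 4))
print(round(coke$p_value, 4))
```

Both calls should reproduce the p values worked out by hand in the examples above.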

Population Proportion

You do not need to decide between the normal and t distributions here. If the sample is large enough, we always approximate with the normal distribution.
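What counts as "large enough" is not spelled out in these notes. A common rule of thumb, assumed here, is that both n * p0 and n * (1 - p0) should be at least 5 (some textbooks ask for 10). A minimal check, using n = 200 and p0 = 0.276 from the General Mills example that follows:

```r
n  <- 200     # sample size
p0 <- 0.276   # proportion under H0

# Expected number of "yes" and "no" answers if H0 is true
expected_yes <- n * p0
expected_no  <- n * (1 - p0)

# Rule of thumb (an assumption, thresholds vary by textbook)
large_enough <- expected_yes >= 5 && expected_no >= 5

print(paste("expected yes:", expected_yes, "expected no:", expected_no))
print(paste("normal approximation reasonable:", large_enough))
```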

Example: General Mills

At the close of 2000, General Mills was reported to be the market leader for cereals with a share of 27.6%.

Because this is a yes/no question, this is a proportion.

A recent random sample of 200 found that 60 preferred General Mills cereals.

Hypotheses:

H0: P = 27.6%

HA: P != 27.6%

n = 200
yes = 60
pbar = yes/ n
p0 = .276
z = round((pbar - p0)/sqrt(p0 * (1-p0)/n), digits = 3)
zcrit = round(qnorm(0.025, lower.tail=F), digits = 3)
print(paste("our critical value is", zcrit, "and our sample z value is", z))
## [1] "our critical value is 1.96 and our sample z value is 0.759"
print("conclusion: do not reject H0 at a 5% level of significance.  there is not sufficient evidence to support that the proportion has changed from the 2000 market share")
## [1] "conclusion: do not reject H0 at a 5% level of significance.  there is not sufficient evidence to support that the proportion has changed from the 2000 market share"

p value approach

pbar = 60/ 200
z = (pbar - 0.276)/ sqrt(0.276 * (1-0.276)/200)
p = round(2 * pnorm(abs(z), lower.tail=F),digits=3)
print(paste("if H0 is true, the probability of a sample proportion at least this far from 27.6% is", p, "which is greater than 0.05, so do not reject H0"))
## [1] "if H0 is true, the probability of a sample proportion at least this far from 27.6% is 0.448 which is greater than 0.05, so do not reject H0"
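As a shortcut, base R’s `prop.test` can reproduce this result. With `correct = FALSE` (no continuity correction) it reports a chi-squared statistic equal to the square of our z value, and the same two-sided p value. This is a sketch for cross-checking the hand calculation; the variable names are my own.

```r
n   <- 200
yes <- 60
p0  <- 0.276

# Hand calculation, as in the example above
z        <- (yes / n - p0) / sqrt(p0 * (1 - p0) / n)
p_manual <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Built-in test without continuity correction
res <- prop.test(yes, n, p = p0, alternative = "two.sided", correct = FALSE)

print(paste("z squared:", round(z^2, 4),
            "prop.test statistic:", round(unname(res$statistic), 4)))
print(paste("manual p value:", round(p_manual, 3),
            "prop.test p value:", round(res$p.value, 3)))
```

By default `prop.test` applies a continuity correction, so leaving out `correct = FALSE` gives a slightly different p value from the hand calculation.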

Example: Superbowl

i got confused about this– need to check her solution

Before the 2003 Super Bowl, ABC predicted that 22% of the Super Bowl audience would be interested in watching one of their upcoming TV shows. ABC ran commercials for these shows during the Super Bowl. The day after the Super Bowl, a sample of 1532 viewers who saw the commercials found that 414 said that they would watch one of the shows.

At a significance level of 1%, determine whether the intent to watch these shows significantly increased after seeing the commercials.

Hypotheses:

H0: p <= 22%

HA: p > 22%

upper-tail problem
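One way to finish this example is to follow the same steps as the General Mills example, but with an upper-tail rejection zone and a 1% significance level. This is my own sketch, not the instructor’s solution, and the variable names are assumptions.

```r
n     <- 1532
yes   <- 414
p0    <- 0.22    # proportion under H0
alpha <- 0.01

# Sample proportion and z value, using p0 in the standard error
pbar <- yes / n
z    <- (pbar - p0) / sqrt(p0 * (1 - p0) / n)

# Upper-tail test: all of alpha is on the right side
z_crit  <- qnorm(alpha, lower.tail = FALSE)
p_value <- pnorm(z, lower.tail = FALSE)

print(paste("sample z value is", round(z, 2),
            "and the critical z value is", round(z_crit, 2)))
print(paste("reject H0 at the 1% level:", z > z_crit))
```

Under these assumptions the sample z value is well above the critical value, so H0 is rejected: the sample supports the conclusion that the intent to watch increased after viewers saw the commercials.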