Nabisco claims that 60% of the bags of chocolate chip cookies they manufacture have more than 1150 chocolate chips in them. Are they lying?
Our strategy is to gather data and then determine whether the data is inconsistent with Nabisco's claim.
So the first thing to imagine is the population of all bags of chocolate chip cookies. At this point we have no reason to believe Nabisco is lying, so you might have a picture like this.
p <- .6
q <- 1- p
n <- 50
(n*p*q)
## [1] 12
(phat <- 24/50)
## [1] 0.48
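se <- sqrt(p * q / n)   # standard error of the sample proportion
z <- (phat - p) / se    # z score of the observed sample proportion
yes <- rep("YES", round(p * 10000))
no <- rep("NO", 10000 - round(p * 10000))
bags <- c(yes, no)      # stand-in population of 10,000 bags
counts <- table(bags) / 10000
barplot(counts, main = "Population of bags of cookies with more than 1150 chips",
xlab = "Yes or No",
col = c("red", "darkgreen")
)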

From a random sample of 50 bags we find that 24 of them have more than 1150 chips.
Statistical theory allows us to know the probability of every possible outcome of our experiment.
Statistical theory says:
\(\Large\hat{P}_n \sim \mathcal{N}\left(p,\sqrt{p(1-p)/n}\right)\)
The caveat is that \(np(1-p)\ge10\).
We can see the population has p = \(0.6\) and our npq = \(12\), so the caveat is satisfied.
shadenorm(mu = p, sig = sqrt(p*q/n), below = -1, col = "blue", dens = 0)
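
The claim about the sampling distribution can be checked with a quick simulation. This is only a minimal sketch, not part of the original analysis: it assumes each bag independently has probability p of having more than 1150 chips, and the seed, `reps`, and `phat_sim` are illustrative choices.

```r
# Simulate many samples of n = 50 bags under Nabisco's claim (p = 0.6).
set.seed(1)            # arbitrary seed, only for reproducibility
p <- 0.6
n <- 50
reps <- 10000          # number of simulated samples (an assumption)

# Each draw is the number of "YES" bags in one sample; divide by n to get phat.
phat_sim <- rbinom(reps, size = n, prob = p) / n

mean(phat_sim)         # should be close to p
sd(phat_sim)           # should be close to the theoretical standard error
sqrt(p * (1 - p) / n)  # theoretical standard error
```

If the theory holds, the first value should land near 0.6 and the second and third values should roughly agree.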

Armed with this information we are now able to determine if our data is likely or unlikely under the assumption that Nabisco is telling the truth.
Our logic will be as follows.
If data like ours would be unlikely when Nabisco's claim is true, that is evidence that Nabisco is lying.
shadenorm(mu = p, sig = sqrt(p*q/n), below = phat, col = "blue", dens = 200)

This shaded area is the probability of obtaining a sample proportion less than 24/50 under the assumption that Nabisco is telling the truth.
Our strategy for obtaining this probability is to find the Z value such that the area to the left of it is exactly the area we are looking for.
\(z = \Huge\frac{\hat{p}-p}{\sqrt{p(1-p)/n}} = -1.73\)
shadenorm(mu = 0, sig = 1, below = z, col = "blue", dens = 200)

pnorm(z,mean = 0, sd = 1)
## [1] 0.04163226
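
The steps above can be collected into one small function. `left_tail_prob` is a hypothetical helper introduced here for illustration, not something from the original code. It assumes the normal approximation and stops with an error if the caveat is not met.

```r
# Hypothetical helper: left-tail probability of a sample proportion
# under the normal approximation to the sampling distribution.
left_tail_prob <- function(phat, p, n) {
  stopifnot(n * p * (1 - p) >= 10)   # caveat for the normal approximation
  se <- sqrt(p * (1 - p) / n)        # standard error of the sample proportion
  z  <- (phat - p) / se              # z score of the observed proportion
  pnorm(z)                           # area to the left of z
}

left_tail_prob(phat = 24 / 50, p = 0.6, n = 50)
```

Called with the values from this example, it should reproduce the probability shown above.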
Backwards problem.
Consumer Reports magazine says that if the probability of your data is less than .05, that is evidence that Nabisco is lying.
According to Consumer Reports, what is the maximum proportion you could observe and still claim they are lying? Assume a sample size of 50.
The strategy is to find the Z value that corresponds to .05, then convert that Z value to a \(\hat{p}\).
From the table the corresponding Z value is -1.64.
\(z = \Huge\frac{\hat{p}-p}{\sqrt{p(1-p)/n}} = -1.64\)
Plugging in .6 for p and 50 for n and solving gives \(\hat{p} \approx 0.486\). With 50 bags, the largest sample proportion at or below this cutoff is 24/50 = \(0.48\).
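
The backwards problem can also be sketched in R, using `qnorm` in place of the table lookup. The names `alpha`, `z_crit`, `cutoff`, and `max_yes` are illustrative and not from the original code. Note that `qnorm` returns a slightly more precise Z value than the rounded table value.

```r
p     <- 0.6
n     <- 50
alpha <- 0.05                   # Consumer Reports threshold from the text

se      <- sqrt(p * (1 - p) / n)
z_crit  <- qnorm(alpha)         # Z value with area alpha to its left
cutoff  <- p + z_crit * se      # phat whose left-tail area equals alpha
max_yes <- floor(cutoff * n)    # largest whole number of "YES" bags at or below the cutoff

cutoff                          # cutoff on the proportion scale
max_yes / n                     # largest attainable sample proportion with n = 50
```

Because a sample of 50 bags can only produce proportions in steps of 1/50, the last line rounds the cutoff down to a proportion that can actually be observed.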
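Concept: A sample proportion of .46 could be more likely than a sample proportion of .48. The n matters. A smaller sample has a larger standard error, so the same distance below p covers more area. The second calculation below uses a standard error of .099, which corresponds to a sample of roughly 24 or 25 bags.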

pnorm(phat, mean = p, sd = se)
## [1] 0.04163226
pnorm(phat - .02, mean = p, sd = .099)  # phat of .46 with a larger standard error
## [1] 0.07866019