Testing of Hypothesis

In many situations, we are called upon to make a decision about a population characteristic. We may need to decide whether a new medicine is effective in curing a disease, or to compare a new brand of vaccine with a more popular brand.

To reach a decision, it is useful to make speculations or guesses about the population parameter. Such a guess is called a statistical hypothesis.

The process of testing a statistical hypothesis is one of the most powerful and popular tools in statistical analysis. With the advent of computers and statistical software, its use has become routine in almost all areas of scientific investigation.

Proper use of the scientific method will allow you to test one of these alternative positions through a sampling process. Remember you can choose only one to test. How do you decide?

Typical Steps in a Statistical Test of Hypothesis

  1. State the Problem
  2. Formulate the null and the appropriate alternative hypothesis

A statistical hypothesis is a guess or conjecture about the numerical value of some unknown population parameter. A null hypothesis is denoted by \(H_0\), such as

  (a) \(H_0: \mu=120\)
  (b) \(H_0: p=0.5\)
  (c) \(H_0: \mu \le 50\)

When a hypothesis expresses a single value for the unknown parameter, as in (a) and (b), it is called a simple hypothesis. Otherwise, as in (c), it is called a composite hypothesis.

  3. Specify the level of significance

This means to choose the probability of rejecting a true null hypothesis. It is denoted by \(\alpha\).

  4. Determine the appropriate test statistic
  5. Compute the actual value of the test statistic from the sample

6.1. Determine the critical values for the sampling distribution and appropriate level of significance

The set of all possible values of the test statistic is divided into two regions:

  • rejection/critical region
  • nonrejection region

How the range of possible values is divided into the rejection or acceptance region will depend upon the alternative hypothesis.

  • If the computed test statistic is in the interval defining the critical region, the null hypothesis is rejected.
  • If the computed test statistic is in the interval defining the nonrejection region, the null hypothesis is not rejected.

There are two types of statistical tests:

  1. A one-sided test is a test whose critical region lies in one direction only.

If \(\mu_0\) is some specific constant and the null hypothesis is of the form \(H_0: \mu=\mu_0\), then for the one-sided alternative hypothesis \(H_a: \mu>\mu_0\) the critical region is on the right tail of the distribution.

Similarly, for an alternative hypothesis \(H_a: \mu<\mu_0\), the critical region is on the left tail of the distribution.

  2. A two-sided test has an alternative hypothesis of the form \(H_a: \mu\ne \mu_0\). Its critical region is on the two tails of the probability distribution.
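
As a minimal sketch of how the alternative hypothesis determines the critical region of a z statistic, the helper below returns the cut-off value(s) for each case. The function name `critical_region` and its argument names are illustrative choices, not part of any package; `qnorm` is the standard normal quantile function in base R.

```r
# Critical value(s) of a z-test for a given alternative hypothesis.
# `critical_region` is a hypothetical helper written for illustration.
critical_region <- function(alpha,
                            alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = c(left  = -qnorm(1 - alpha)),       # reject if z <= left
    greater   = c(right =  qnorm(1 - alpha)),       # reject if z >= right
    two.sided = c(left  = -qnorm(1 - alpha / 2),    # reject if z <= left
                  right =  qnorm(1 - alpha / 2))    #     or if z >= right
  )
}

critical_region(0.05, "less")
critical_region(0.05, "greater")
critical_region(0.05, "two.sided")
```

Note that the two-sided case splits \(\alpha\) equally between the tails, which is why its cut-offs lie farther from zero than the one-sided ones.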

A hypothesis can be also tested by constructing a \((1-\alpha)100\%\) confidence interval for the parameter of interest if the test is two-sided. If the hypothesized parameter value is contained in the confidence interval, then \(H_0\) is not rejected. Otherwise, if the hypothesized parameter value is not contained in the interval, \(H_0\) is rejected.
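
The confidence-interval route can be sketched as follows for a two-sided test on a mean with known population standard deviation. The numbers are illustrative assumptions chosen for the sketch, and the variable names are arbitrary.

```r
# Two-sided z-test carried out through a confidence interval.
# All inputs below are illustrative assumptions.
xbar  <- 14.6   # sample mean
mu0   <- 15.4   # hypothesized mean under H0
sigma <- 2.5    # population standard deviation (assumed known)
n     <- 35     # sample size
alpha <- 0.05   # level of significance

margin <- qnorm(1 - alpha / 2) * sigma / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)
ci

# H0 is rejected only when mu0 falls outside the interval
reject_h0 <- mu0 < ci["lower"] || mu0 > ci["upper"]
reject_h0
```

With these inputs the hypothesized value lies inside the interval, so \(H_0\) is not rejected.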

6.2. Determining the \(p\)-value of the test statistic

Alternatively, we can find the probability of obtaining the observed result, or one more extreme, if \(H_0\) is true, and use this so-called \(p\)-value to choose between the two hypotheses. Decision making will be as follows:

  • If the \(p\)-value is less than the significance level, \(p<\alpha\), the null hypothesis, \(H_0\), is rejected.
  • If the \(p\)-value is greater than the significance level, \(p>\alpha\), the null hypothesis, \(H_0\), is not rejected.

  7. Make a statistical decision
  • The null hypothesis is rejected if the computed value of the test statistic is within the critical region, otherwise \(H_0\) is not rejected.

  • The null hypothesis will be rejected if the \(p\)-value obtained is less than the level of significance \(\alpha\), \(p<\alpha\).

If the null hypothesis is not rejected, it does not follow that \(H_0\) is true. It may be true, but evidence compatible with the null hypothesis is never conclusive. An appropriate conclusion is to state that “there is no substantial evidence to reject the null hypothesis” rather than to conclude that the null hypothesis is true.
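
The \(p\)-value form of the decision rule can be sketched in a few lines of R. The helpers `z_p_value` and `decide` are hypothetical names introduced here for illustration, and the statistic \(z=-1.2\) is an arbitrary value, not taken from any data set.

```r
# p-value of a z statistic for each form of the alternative hypothesis.
# `z_p_value` and `decide` are hypothetical helpers, not library functions.
z_p_value <- function(z, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pnorm(z),                       # lower tail
    greater   = pnorm(z, lower.tail = FALSE),   # upper tail
    two.sided = 2 * pnorm(-abs(z))              # both tails
  )
}

# Wording follows the text: "do not reject" rather than "accept"
decide <- function(p, alpha = 0.05) {
  if (p < alpha) "reject H0" else "do not reject H0"
}

p <- z_p_value(-1.2, "less")   # arbitrary illustrative statistic
p
decide(p)
```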

  8. State the appropriate conclusion

Testing a Hypothesis on the Population Mean

Problem

Suppose the manufacturer claims that the mean lifetime of a light bulb is more than 10,000 hours. In a sample of 30 light bulbs, it was found that they only last 9,900 hours on average. Assume the population standard deviation is 120 hours. At .05 significance level, can we reject the claim by the manufacturer?

Solution

The null hypothesis is that \(\mu ≥ 10000\).

The alternative hypothesis is that \(\mu < 10000\).

We begin with computing the test statistic.

xbar = 9900            # sample mean 
mu0 = 10000            # hypothesized value 
sigma = 120            # population standard deviation 
n = 30                 # sample size 
z = (xbar-mu0)/(sigma/sqrt(n)) 
z                      # test statistic 
## [1] -4.564355

Then the null hypothesis of the lower tail test is to be rejected if \(z ≤−z_{\alpha}\).

We then compute the critical value at .05 significance level.

alpha = .05 
z.alpha = qnorm(1-alpha) 
-z.alpha               # critical value 
## [1] -1.644854

The test statistic -4.5644 is less than the critical value of -1.6449. Hence, at .05 significance level, we reject the claim that the mean lifetime of a light bulb is above 10,000 hours.

Alternative Solution

Instead of using the critical value, we apply the pnorm function to compute the lower tail p-value of the test statistic.

pval = pnorm(z) 
pval                   # lower tail p−value 
## [1] 2.505166e-06

As it turns out to be less than the .05 significance level, we reject the null hypothesis that \(\mu ≥ 10000\).

Problem

Suppose the food label on a cookie bag states that there is at most 2 grams of saturated fat in a single cookie. In a sample of 35 cookies, it is found that the mean amount of saturated fat per cookie is 2.1 grams. Assume that the population standard deviation is 0.25 grams. At .05 significance level, can we reject the claim on food label?

Solution

The null hypothesis is that \(\mu ≤ 2\). We begin with computing the test statistic.

xbar = 2.1             # sample mean 
mu0 = 2                # hypothesized value 
sigma = 0.25           # population standard deviation 
n = 35                 # sample size 
z = (xbar-mu0)/(sigma/sqrt(n)) 
z                      # test statistic 
## [1] 2.366432

Then the null hypothesis of the upper tail test is to be rejected if \(z ≥ z_{\alpha}\).

We then compute the critical value at .05 significance level.

alpha = .05 
z.alpha = qnorm(1-alpha) 
z.alpha                # critical value 
## [1] 1.644854

The test statistic 2.3664 is greater than the critical value of 1.6449. Hence, at .05 significance level, we reject the claim that there is at most 2 grams of saturated fat in a cookie.

Alternative Solution

Instead of using the critical value, we apply the pnorm function to compute the upper tail p-value of the test statistic.

pval = pnorm(z, lower.tail=FALSE) 
pval                   # upper tail p−value
## [1] 0.008980239

As it turns out to be less than the .05 significance level, we reject the null hypothesis that μ ≤ 2.

Problem

Let’s say we need to determine whether the average score of students in the exam differs from 610. We know that the standard deviation of the students’ scores is 100. We collect the scores of 30 students by random sampling and get the following data:

670, 730, 540, 670, 480, 800, 690, 560, 590, 620, 700, 660, 640, 710, 650, 490, 800, 600, 560, 700, 680, 550, 580, 700, 705, 690, 520, 650, 660, 790

Assume that the score follows a normal distribution. Test at 5% level of significance.

Null hypothesis: the mean score is equal to 610.

Alternative hypothesis: the mean score is not equal to 610.

Calculate the test statistic using the z.test() function

#install.packages("BSDA")
library(BSDA)
dataset <- c(670, 730, 540, 670, 480, 800, 690, 560, 590, 620, 700, 660, 640, 710, 650, 490, 800, 600, 560, 700, 680, 550, 580, 700, 705, 690, 520, 650, 660, 790)

#Perform one-sample z-test

z.test(x=dataset,mu=610,alternative = "two.sided",sigma.x = 100)
## 
##  One-sample z-Test
## 
## data:  dataset
## z = 1.9809, p-value = 0.0476
## alternative hypothesis: true mean is not equal to 610
## 95 percent confidence interval:
##  610.3828 681.9505
## sample estimates:
## mean of x 
##  646.1667

Since the p-value (0.0476) is less than the level of significance 0.05, we reject the null hypothesis.

This means we have sufficient evidence to say that the mean score for the students is not equal to 610.
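
As a cross-check that does not depend on the BSDA package, the same statistic and \(p\)-value can be reproduced with base R alone. This is a sketch under the same assumptions as above (known \(\sigma=100\), two-sided alternative); the variable names are arbitrary.

```r
# Reproduce the one-sample z-test by hand with base R only.
scores <- c(670, 730, 540, 670, 480, 800, 690, 560, 590, 620,
            700, 660, 640, 710, 650, 490, 800, 600, 560, 700,
            680, 550, 580, 700, 705, 690, 520, 650, 660, 790)
sigma <- 100              # population standard deviation (given)
mu0   <- 610              # hypothesized mean
n     <- length(scores)   # number of observations

z    <- (mean(scores) - mu0) / (sigma / sqrt(n))
pval <- 2 * pnorm(-abs(z))   # two-sided p-value

c(n = n, z = z, p.value = pval)
```

The values agree with the `z.test` output shown above (z = 1.9809, p-value = 0.0476).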

Problem

Suppose the mean weight of King Penguins found in an Antarctic colony last year was 15.4 kg. In a sample of 35 penguins taken at the same time this year in the same colony, the mean penguin weight is 14.6 kg. Assume the population standard deviation is 2.5 kg. At .05 significance level, can we reject the null hypothesis that the mean penguin weight does not differ from last year?

Solution

The null hypothesis is that μ = 15.4. We begin with computing the test statistic.

xbar = 14.6            # sample mean 
mu0 = 15.4             # hypothesized value 
sigma = 2.5            # population standard deviation 
n = 35                 # sample size 
z = (xbar-mu0)/(sigma/sqrt(n)) 
z                      # test statistic 
## [1] -1.893146

Then the null hypothesis of the two-tailed test is to be rejected if \(z ≤−z_{\alpha/2}\) or \(z ≥ z_{\alpha/2}\).

We then compute the critical values at .05 significance level.

alpha = .05 
z.half.alpha = qnorm(1-alpha/2) 
c(-z.half.alpha, z.half.alpha) 
## [1] -1.959964  1.959964

The test statistic -1.8931 lies between the critical values -1.9600 and 1.9600. Hence, at .05 significance level, we do not reject the null hypothesis that the mean penguin weight does not differ from last year.

Alternative Solution

Instead of using the critical value, we apply the pnorm function to compute the two-tailed p-value of the test statistic. It doubles the lower tail p-value as the sample mean is less than the hypothesized value.

pval = 2*pnorm(z)    # lower tail 
pval                   # two−tailed p−value 
## [1] 0.05833852

Since it turns out to be greater than the .05 significance level, we do not reject the null hypothesis that μ = 15.4.

Problem

A certain restaurant advertises that it puts 0.25 pound of beef in its burgers. A customer who frequents the restaurant thinks the burgers actually contain less than 0.25 pound of beef. With permission from the owner, the customer selected a random sample of 60 burgers and found the mean and standard deviation to be 0.22 and 0.07 pound, respectively.

\[\begin{align*} H_0:& \mu\ge 0.25\\ H_a:& \mu<0.25 \end{align*}\]

xbar = 0.22            # sample mean 
mu0 = 0.25             # hypothesized value 
sigma = 0.07           # sample standard deviation, used in place of sigma since n is large 
n = 60                 # sample size 
z = (xbar-mu0)/(sigma/sqrt(n)) 
z                      # test statistic 
## [1] -3.3197

Then the null hypothesis of the lower tail test is to be rejected if \(z ≤−z_{\alpha}\).

We then compute the critical value at .01 significance level.

alpha = .01 
z.alpha = qnorm(1-alpha) 
-z.alpha               # critical value 
## [1] -2.326348

Since \(z=-3.3197<-2.33=-z_{\alpha}\), \(H_0\) is rejected. At \(\alpha=0.01\) level of significance, the customer has sufficient evidence to claim that the mean amount of beef in the burgers the restaurant makes is less than 0.25 lbs.

Compute the \(p\)-value. Will you reject \(H_0\) at 0.01 level of significance?

pval = pnorm(z) 
pval                   # lower tail p−value 
## [1] 0.0004505711

Since the \(p\)-value is 0.0005 which is less than the level of significance \(\alpha=0.01\), \(H_0\) is rejected. This further confirms the previous results.

Problem

Suppose the manufacturer claims that the mean lifetime of a light bulb is more than 10,000 hours. In a sample of 30 light bulbs, it was found that they only last 9,900 hours on average. Assume the sample standard deviation is 125 hours. At .05 significance level, can we reject the claim by the manufacturer?

Solution

The null hypothesis is that μ ≥ 10000. We begin with computing the test statistic.

xbar = 9900            # sample mean 
mu0 = 10000            # hypothesized value 
s = 125                # sample standard deviation 
n = 30                 # sample size 
t = (xbar-mu0)/(s/sqrt(n)) 
t                      # test statistic 
## [1] -4.38178

Then the null hypothesis of the lower tail test is to be rejected if \(t ≤−t_{\alpha}\).

We then compute the critical value at .05 significance level.

alpha = .05 
t.alpha = qt(1-alpha, df=n-1) 
-t.alpha               # critical value 
## [1] -1.699127

The test statistic -4.3818 is less than the critical value of -1.6991. Hence, at .05 significance level, we can reject the claim that the mean lifetime of a light bulb is above 10,000 hours.

Alternative Solution

Instead of using the critical value, we apply the pt function to compute the lower tail p-value of the test statistic.

pval = pt(t, df=n-1) 
pval                   # lower tail p−value
## [1] 7.035026e-05

As it turns out to be less than the .05 significance level, we reject the null hypothesis that μ ≥ 10000.
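
Base R's `t.test` expects raw observations, so when only summary statistics are available the calculation has to be done by hand, as above. A small wrapper can keep the three forms of the alternative in one place. The function `t_test_summary` is a hypothetical helper written for this sketch, not part of any package.

```r
# One-sample t-test from summary statistics.
# `t_test_summary` is a hypothetical helper written for illustration.
t_test_summary <- function(xbar, mu0, s, n,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  t_stat <- (xbar - mu0) / (s / sqrt(n))   # test statistic
  df <- n - 1                              # degrees of freedom
  p <- switch(alternative,
    less      = pt(t_stat, df),
    greater   = pt(t_stat, df, lower.tail = FALSE),
    two.sided = 2 * pt(-abs(t_stat), df)
  )
  list(statistic = t_stat, df = df, p.value = p)
}

# Light bulb problem: H0: mu >= 10000 against Ha: mu < 10000
res <- t_test_summary(xbar = 9900, mu0 = 10000, s = 125, n = 30,
                      alternative = "less")
res$statistic
res$p.value
```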

Problem

Suppose the food label on a cookie bag states that there is at most 2 grams of saturated fat in a single cookie. In a sample of 35 cookies, it is found that the mean amount of saturated fat per cookie is 2.1 grams. Assume that the sample standard deviation is 0.3 gram. At .05 significance level, can we reject the claim on food label?

Solution

The null hypothesis is that μ ≤ 2. We begin with computing the test statistic.

xbar = 2.1             # sample mean 
mu0 = 2                # hypothesized value 
s = 0.3                # sample standard deviation 
n = 35                 # sample size 
t = (xbar-mu0)/(s/sqrt(n)) 
t                      # test statistic 
## [1] 1.972027

Then the null hypothesis of the upper tail test is to be rejected if \(t ≥ t_{\alpha}\).

We then compute the critical value at .05 significance level.

alpha = .05 
t.alpha = qt(1-alpha, df=n-1) 
t.alpha                # critical value 
## [1] 1.690924

The test statistic 1.9720 is greater than the critical value of 1.6909. Hence, at .05 significance level, we can reject the claim that there is at most 2 grams of saturated fat in a cookie.

Alternative Solution

Instead of using the critical value, we apply the pt function to compute the upper tail p-value of the test statistic.

pval = pt(t, df=n-1, lower.tail=FALSE) 
pval                   # upper tail p−value 
## [1] 0.02839295

As it turns out to be less than the .05 significance level, we reject the null hypothesis that μ ≤ 2.

Problem

Suppose we want to know whether or not the mean weight of a certain species of turtle is equal to 310 pounds. We go out and collect a simple random sample of turtles with the following weights:

Weights: 300, 315, 320, 311, 314, 309, 300, 308, 305, 303, 305, 301, 303

The following code shows how to perform this one sample t-test in R:

#define vector of turtle weights
turtle_weights <- c(300, 315, 320, 311, 314, 309, 300, 308, 305, 303, 305, 301, 303)

#perform one sample t-test
t.test(x = turtle_weights, mu = 310)
## 
##  One Sample t-test
## 
## data:  turtle_weights
## t = -1.5848, df = 12, p-value = 0.139
## alternative hypothesis: true mean is not equal to 310
## 95 percent confidence interval:
##  303.4236 311.0379
## sample estimates:
## mean of x 
##  307.2308

Since the p-value of the test (0.139) is not less than .05, we fail to reject the null hypothesis.

This means we do not have sufficient evidence to say that the mean weight of this species of turtle is different from 310 pounds.
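
To see where the numbers in the `t.test` output come from, the statistic, two-sided \(p\)-value and 95% confidence interval can be recomputed from the sample mean and standard deviation. This is only a sketch of the standard formulas; the variable names are arbitrary.

```r
# Recompute the one-sample t-test for the turtle weights by hand.
turtle_weights <- c(300, 315, 320, 311, 314, 309, 300, 308, 305, 303,
                    305, 301, 303)
mu0  <- 310
n    <- length(turtle_weights)
xbar <- mean(turtle_weights)
s    <- sd(turtle_weights)          # sample standard deviation

t_stat <- (xbar - mu0) / (s / sqrt(n))
pval   <- 2 * pt(-abs(t_stat), df = n - 1)                   # two-sided
ci     <- xbar + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)

c(t = t_stat, df = n - 1, p.value = pval)
ci
```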

Problem

Suppose the mean weight of King Penguins found in an Antarctic colony last year was 15.4 kg. In a sample of 35 penguins taken at the same time this year in the same colony, the mean penguin weight is 14.6 kg. Assume the sample standard deviation is 2.5 kg. At .05 significance level, can we reject the null hypothesis that the mean penguin weight does not differ from last year?

Solution

The null hypothesis is that μ = 15.4. We begin with computing the test statistic.

xbar = 14.6            # sample mean 
mu0 = 15.4             # hypothesized value 
s = 2.5                # sample standard deviation 
n = 35                 # sample size 
t = (xbar-mu0)/(s/sqrt(n)) 
t                      # test statistic
## [1] -1.893146

Then the null hypothesis of the two-tailed test is to be rejected if \(t ≤−t_{\alpha/2}\) or \(t ≥ t_{\alpha/2}\).

We then compute the critical values at .05 significance level.

alpha = .05 
t.half.alpha = qt(1-alpha/2, df=n-1) 
c(-t.half.alpha, t.half.alpha) 
## [1] -2.032245  2.032245

The test statistic -1.8931 lies between the critical values -2.0322, and 2.0322. Hence, at .05 significance level, we do not reject the null hypothesis that the mean penguin weight does not differ from last year.

Alternative Solution

Instead of using the critical value, we apply the pt function to compute the two-tailed p-value of the test statistic. It doubles the lower tail p-value as the sample mean is less than the hypothesized value.

pval = 2*pt(t, df=n-1)  # lower tail 
pval                      # two−tailed p−value 
## [1] 0.06687552

Since it turns out to be greater than the .05 significance level, we do not reject the null hypothesis that μ = 15.4.

Testing a Hypothesis on the Two Population Means

Two data samples are independent if they come from unrelated populations and the samples do not affect each other. Here, we assume that the data populations follow the normal distribution. Using the unpaired t-test, we can obtain an interval estimate of the difference between two population means.

Example

In the data frame column mpg of the data set mtcars, there are gas mileage data of various 1974 U.S. automobiles.

mtcars$mpg 
##  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
## [16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
## [31] 15.0 21.4

Meanwhile, another data column in mtcars, named am, indicates the transmission type of the automobile model (0 = automatic, 1 = manual).

mtcars$am 
##  [1] 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1

In particular, the gas mileage for manual and automatic transmissions are two independent data populations.

Problem

Assuming that the data in mtcars follows the normal distribution, find the 95% confidence interval estimate of the difference between the mean gas mileage of manual and automatic transmissions.

Solution

As mentioned in the tutorial Data Frame Row Slice, the gas mileage for automatic transmission can be listed as follows:

L = mtcars$am == 0 
mpg.auto = mtcars[L,]$mpg 
mpg.auto                    # automatic transmission mileage
##  [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7 21.5
## [16] 15.5 15.2 13.3 19.2

By applying the negation of L, we can find the gas mileage for manual transmission.

mpg.manual = mtcars[!L,]$mpg 
mpg.manual                  # manual transmission mileage 
##  [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0 21.4

We can now apply the t.test function to compute the difference in means of the two sample data.

t.test(mpg.auto, mpg.manual) 
## 
##  Welch Two Sample t-test
## 
## data:  mpg.auto and mpg.manual
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean of x mean of y 
##  17.14737  24.39231

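In mtcars, the mean mileage of automatic transmission is 17.147 mpg and that of manual transmission is 24.392 mpg. The 95% confidence interval of the difference in mean gas mileage (automatic minus manual) is between -11.2802 and -3.2097 mpg. In other words, manual transmission cars average between 3.2097 and 11.2802 mpg more.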

Alternative Solution

We can model the response variable mtcars$mpg by the predictor mtcars$am, and then apply the t.test function to estimate the difference of the population means.

t.test(mpg ~ am, data=mtcars)
## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean in group 0 mean in group 1 
##        17.14737        24.39231

Note

Some textbooks round the degrees of freedom down to an integer, in which case the result differs slightly from that of t.test.
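
The fractional degrees of freedom reported by `t.test` come from the Welch-Satterthwaite approximation, which `t.test` uses by default when `var.equal` is left at `FALSE`. The sketch below recomputes the statistic, the degrees of freedom and the confidence interval step by step; the variable names are arbitrary.

```r
# Welch two-sample t-test computed step by step from mtcars.
x <- mtcars$mpg[mtcars$am == 0]   # automatic transmission
y <- mtcars$mpg[mtcars$am == 1]   # manual transmission

vx <- var(x) / length(x)          # squared standard error of mean(x)
vy <- var(y) / length(y)          # squared standard error of mean(y)
se <- sqrt(vx + vy)               # standard error of the difference

t_stat <- (mean(x) - mean(y)) / se
# Welch-Satterthwaite approximation of the degrees of freedom
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df) * se

c(t = t_stat, df = df)
ci
```

Replacing `df` with `floor(df)` in the `qt` call gives the slightly wider interval that a textbook using integer degrees of freedom would report.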

Matched Samples

Two data samples are matched if they come from repeated observations of the same subject. Here, we assume that the data populations follow the normal distribution. Using the paired t-test, we can obtain an interval estimate of the difference of the population means.

Example

In the built-in data set named immer, the barley yield in years 1931 and 1932 of the same field are recorded. The yield data are presented in the data frame columns Y1 and Y2.

library(MASS)         # load the MASS package 
head(immer) 
##   Loc Var    Y1    Y2
## 1  UF   M  81.0  80.7
## 2  UF   S 105.4  82.3
## 3  UF   V 119.7  80.4
## 4  UF   T 109.7  87.2
## 5  UF   P  98.3  84.2
## 6   W   M 146.6 100.4

Problem

Assuming that the data in immer follows the normal distribution, find the 95% confidence interval estimate of the difference between the mean barley yields between years 1931 and 1932.

Solution

We apply the t.test function to compute the difference in means of the matched samples. As it is a paired test, we set the “paired” argument as TRUE.

t.test(immer$Y1, immer$Y2, paired=TRUE) 
## 
##  Paired t-test
## 
## data:  immer$Y1 and immer$Y2
## t = 3.324, df = 29, p-value = 0.002413
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   6.121954 25.704713
## sample estimates:
## mean of the differences 
##                15.91333

Between years 1931 and 1932 in the data set immer, the 95% confidence interval of the difference in means of the barley yields is the interval between 6.122 and 25.705.
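
A paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below checks this on the same data; it assumes the MASS package is installed (it normally ships with R), and the variable names are arbitrary.

```r
# A paired t-test equals a one-sample t-test on the differences.
library(MASS)                      # provides the immer data set

d <- immer$Y1 - immer$Y2           # yield difference per field, 1931 minus 1932

one_sample <- t.test(d)            # H0: mean difference is 0
paired     <- t.test(immer$Y1, immer$Y2, paired = TRUE)

c(one_sample = unname(one_sample$statistic),
  paired     = unname(paired$statistic))
one_sample$conf.int
```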

Testing a Hypothesis on the Population Proportion

Problem

Suppose 60% of citizens voted in the last election. 85 out of 148 people in a telephone survey said that they voted in the current election. At .05 significance level, can we reject the null hypothesis that the proportion of voters in the population is above 60% this year?

Solution

The null hypothesis is that p ≥ 0.6. We begin with computing the test statistic.

pbar = 85/148          # sample proportion 
p0 = .6                # hypothesized value 
n = 148                # sample size 
z = (pbar-p0)/sqrt(p0*(1-p0)/n) 
z                      # test statistic 
## [1] -0.6375983

Then the null hypothesis of the lower tail test is to be rejected if \(z ≤−z_{\alpha}\).

We then compute the critical value at .05 significance level.

alpha = .05 
z.alpha = qnorm(1-alpha) 
-z.alpha               # critical value 
## [1] -1.644854

The test statistic -0.6376 is not less than the critical value of -1.6449. Hence, at .05 significance level, we do not reject the null hypothesis that the proportion of voters in the population is above 60% this year.

Alternative Solution 1

Instead of using the critical value, we apply the pnorm function to compute the lower tail p-value of the test statistic.

pval = pnorm(z) 
pval                   # lower tail p−value 
## [1] 0.2618676

As it turns out to be greater than the .05 significance level, we do not reject the null hypothesis that p ≥ 0.6.

Alternative Solution 2

We apply the prop.test function to compute the p-value directly. The Yates continuity correction is disabled for pedagogical reasons.

prop.test(85, 148, p=.6, alt="less", correct=FALSE) 
## 
##  1-sample proportions test without continuity correction
## 
## data:  85 out of 148, null probability 0.6
## X-squared = 0.40653, df = 1, p-value = 0.2619
## alternative hypothesis: true p is less than 0.6
## 95 percent confidence interval:
##  0.0000000 0.6392527
## sample estimates:
##         p 
## 0.5743243
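
The `X-squared` value reported by `prop.test` is the square of the z statistic computed by hand, which is why both approaches give the same \(p\)-value when the continuity correction is off. A short check, with arbitrary variable names:

```r
# Relate the hand-computed z statistic to the prop.test output.
pbar <- 85 / 148   # sample proportion
p0   <- 0.6        # hypothesized value
n    <- 148        # sample size
z    <- (pbar - p0) / sqrt(p0 * (1 - p0) / n)

fit <- prop.test(85, 148, p = 0.6, alternative = "less", correct = FALSE)

c(z_squared = z^2, x_squared = unname(fit$statistic))
c(by_hand = pnorm(z), prop_test = fit$p.value)
```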

Problem

Suppose that 12% of apples harvested in an orchard last year were rotten. 30 out of 214 apples in a harvest sample this year turn out to be rotten. At .05 significance level, can we reject the null hypothesis that the proportion of rotten apples in harvest stays below 12% this year?

Solution

The null hypothesis is that p ≤ 0.12. We begin with computing the test statistic.

pbar = 30/214          # sample proportion 
p0 = .12               # hypothesized value 
n = 214                # sample size 
z = (pbar-p0)/sqrt(p0*(1-p0)/n) 
z                      # test statistic 
## [1] 0.908751

Then the null hypothesis of the upper tail test is to be rejected if \(z ≥ z_{\alpha}\).

We then compute the critical value at .05 significance level.

alpha = .05 
z.alpha = qnorm(1-alpha) 
z.alpha                # critical value 
## [1] 1.644854

The test statistic 0.90875 is not greater than the critical value of 1.6449. Hence, at .05 significance level, we do not reject the null hypothesis that the proportion of rotten apples in harvest stays below 12% this year.

Alternative Solution 1

Instead of using the critical value, we apply the pnorm function to compute the upper tail p-value of the test statistic.

pval = pnorm(z, lower.tail=FALSE) 
pval                   # upper tail p−value 
## [1] 0.1817408

As it turns out to be greater than the .05 significance level, we do not reject the null hypothesis that p ≤ 0.12.

Alternative Solution 2

We apply the prop.test function to compute the p-value directly. The Yates continuity correction is disabled for pedagogical reasons.

prop.test(30, 214, p=.12, alt="greater", correct=FALSE)
## 
##  1-sample proportions test without continuity correction
## 
## data:  30 out of 214, null probability 0.12
## X-squared = 0.82583, df = 1, p-value = 0.1817
## alternative hypothesis: true p is greater than 0.12
## 95 percent confidence interval:
##  0.1056274 1.0000000
## sample estimates:
##         p 
## 0.1401869

Problem

Suppose a coin toss turns up 12 heads out of 20 trials. At .05 significance level, can one reject the null hypothesis that the coin toss is fair?

Solution

The null hypothesis is that p = 0.5. We begin with computing the test statistic.

pbar = 12/20           # sample proportion 
p0 = .5                # hypothesized value 
n = 20                 # sample size 
z = (pbar-p0)/sqrt(p0*(1-p0)/n) 
z                      # test statistic 
## [1] 0.8944272

Then the null hypothesis of the two-tailed test is to be rejected if \(z ≤−z_{\alpha/2}\) or \(z ≥ z_{\alpha/2}\).

We then compute the critical values at .05 significance level.

alpha = .05 
z.half.alpha = qnorm(1-alpha/2) 
c(-z.half.alpha, z.half.alpha)
## [1] -1.959964  1.959964

The test statistic 0.89443 lies between the critical values -1.9600 and 1.9600. Hence, at .05 significance level, we do not reject the null hypothesis that the coin toss is fair.

Alternative Solution 1

Instead of using the critical value, we apply the pnorm function to compute the two-tailed p-value of the test statistic. It doubles the upper tail p-value as the sample proportion is greater than the hypothesized value.

pval = 2*pnorm(z, lower.tail=FALSE)  # upper tail 
pval                   # two−tailed p−value 
## [1] 0.3710934

Since it turns out to be greater than the .05 significance level, we do not reject the null hypothesis that p = 0.5.

Alternative Solution 2

We apply the prop.test function to compute the p-value directly. The Yates continuity correction is disabled for pedagogical reasons.

prop.test(12, 20, p=0.5, correct=FALSE) 
## 
##  1-sample proportions test without continuity correction
## 
## data:  12 out of 20, null probability 0.5
## X-squared = 0.8, df = 1, p-value = 0.3711
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.3865815 0.7811935
## sample estimates:
##   p 
## 0.6
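
With only 20 tosses the normal approximation is fairly rough. As a cross-check that is not part of the solutions above, the exact binomial test in base R (`binom.test`) can be applied to the same counts.

```r
# Exact binomial test for the coin toss, as a small-sample cross-check.
exact <- binom.test(12, 20, p = 0.5)   # two-sided by default
exact$p.value
exact$conf.int
```

The exact \(p\)-value is larger than the one from the normal approximation, and the conclusion is unchanged: \(H_0: p = 0.5\) is not rejected at .05 significance level.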

Comparison of Two Population Proportions

A survey conducted in two distinct populations will produce different results. It is often necessary to compare the survey response proportion between the two populations. Here, we assume that the data populations follow the normal distribution.

Example

In the built-in data set named quine, children from an Australian town are classified by ethnic background, gender, age, learning status and the number of days absent from school.

library(MASS)         # load the MASS package 
head(quine) 
##   Eth Sex Age Lrn Days
## 1   A   M  F0  SL    2
## 2   A   M  F0  SL   11
## 3   A   M  F0  SL   14
## 4   A   M  F0  AL    5
## 5   A   M  F0  AL    5
## 6   A   M  F0  AL   13

In effect, the data frame column Eth indicates whether the student is Aboriginal or Not (“A” or “N”), and the column Sex indicates Male or Female (“M” or “F”).

In R, we can tally the student ethnicity against the gender with the table function. As the result shows, within the Aboriginal student population, 38 students are female, whereas within the Non-Aboriginal student population, 42 are female.

table(quine$Eth, quine$Sex) 
##    
##      F  M
##   A 38 31
##   N 42 35

Problem

Assuming that the data in quine follows the normal distribution, find the 95% confidence interval estimate of the difference between the female proportion of Aboriginal students and the female proportion of Non-Aboriginal students, each within their own ethnic group.

Solution

We apply the prop.test function to compute the difference in female proportions. Yates’s continuity correction is disabled for pedagogical reasons.

prop.test(table(quine$Eth, quine$Sex), correct=FALSE) 
## 
##  2-sample test for equality of proportions without continuity
##  correction
## 
## data:  table(quine$Eth, quine$Sex)
## X-squared = 0.0040803, df = 1, p-value = 0.9491
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.1564218  0.1669620
## sample estimates:
##    prop 1    prop 2 
## 0.5507246 0.5454545

The 95% confidence interval estimate of the difference between the female proportion of Aboriginal students and the female proportion of Non-Aboriginal students is between -15.6% and 16.7%.
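
The interval can also be recomputed from the counts in the table using the usual normal-approximation formula for a difference of two proportions. This is a sketch with arbitrary variable names; the counts are entered by hand so that the MASS package is not needed.

```r
# Confidence interval for a difference of two proportions, by hand.
x1 <- 38; n1 <- 38 + 31   # Aboriginal: female count, group size
x2 <- 42; n2 <- 42 + 35   # Non-Aboriginal: female count, group size

p1 <- x1 / n1
p2 <- x2 / n2
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled standard error

ci <- (p1 - p2) + c(-1, 1) * qnorm(0.975) * se
ci
```

Since the interval contains 0, the data give no evidence of a difference between the two female proportions, in line with the large \(p\)-value (0.9491) above.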