Basic chi-square test

Two exercises:

1: Uniform probabilities

Here we investigate a case where under a null hypothesis we would expect data to be uniformly distributed among a number of available categories.

Do certain star signs favour success in life?

(we suspect not!)

In a survey of 246 chief executives of Fortune 500 companies, the numbers with each of the 12 star signs were found to be:

Aries: 23
Taurus: 20
Gemini: 18
Cancer: 23
Leo: 20
Virgo: 19
Libra: 18
Scorpio: 21
Sagittarius: 19
Capricorn: 22
Aquarius: 24
Pisces: 19

(If the signs were equally likely we would expect 246/12 = 20.5 executives per sign.)

stars<-c(23,20,18,23,20,19,18,21,19,22,24,19)

Null Hypothesis

A suitable null hypothesis here is (delete as applicable): H0: Star sign plays a role/no role in success in life, hence the probability that a chief executive has any particular sign is (insert a value).

The Chi-Square test

We can use a chi-square test to tell us how likely it is that we would have obtained counts at least as far from the expected counts as those above, if the null hypothesis were true.

chisq.test(stars)

## 
##  Chi-squared test for given probabilities
## 
## data:  stars
## X-squared = 2.2927, df = 11, p-value = 0.9972
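To see where these numbers come from, the statistic can be rebuilt by hand. The following is a minimal sketch, assuming the same observed counts as in `stars`; the names `expected`, `x2`, `dof` and `p_value` are illustrative and are not part of the original exercise.

```r
# Observed counts, repeated here so the block runs on its own
stars <- c(23, 20, 18, 23, 20, 19, 18, 21, 19, 22, 24, 19)

# Under the null hypothesis every sign has the same expected count
expected <- rep(sum(stars) / length(stars), length(stars))

# Chi-square statistic: sum over categories of (observed - expected)^2 / expected
x2 <- sum((stars - expected)^2 / expected)

# Degrees of freedom: number of categories minus one
dof <- length(stars) - 1

# p-value: upper-tail probability of the chi-square distribution
p_value <- pchisq(x2, df = dof, lower.tail = FALSE)

round(c(X2 = x2, df = dof, p = p_value), 4)
```

The three values should agree with the X-squared, df and p-value lines printed by `chisq.test(stars)`.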

Do we reject the null hypothesis or do we fail to reject it?

2: Non-uniform probabilities

Now we investigate a case where, under the null hypothesis, we would expect the data to be distributed among the available categories not uniformly, but according to some other known distribution.

Are selected jurors representative of the population from which they are drawn?

This exercise is adapted from: Diez, D. M., Cetinkaya-Rundel, M. and Barr, C. D. (2019) OpenIntro Statistics. 4th edn. Available at: https://www.openintro.org/stat/textbook.php?stat_book=os.

In a certain US locality, the numbers of jurors of different ethnicities are counted. The numbers are found to be:

white: 205
coloured: 26
asian: 25
other: 19

The question is: Do these numbers fairly reflect the proportions of the population from each of these ethnicities?

Null hypothesis

A suitable null hypothesis would be…

In the population as a whole, the proportions of each ethnicity are:

white: 0.72
coloured: 0.07
asian: 0.12
other: 0.09
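Before running a test, it can help to put the jurors' shares next to the population shares. The following is a rough sketch, assuming the proportions listed above describe the population and the counts are those given earlier; the names `observed_prop` and `comparison` are illustrative only.

```r
# Juror counts and population proportions, named by category
jurors <- c(white = 205, coloured = 26, asian = 25, other = 19)
p0     <- c(white = 0.72, coloured = 0.07, asian = 0.12, other = 0.09)

# Share of the jurors falling in each category
observed_prop <- jurors / sum(jurors)

# Side-by-side comparison of juror shares and population shares
comparison <- rbind(jurors = observed_prop, population = p0)
round(comparison, 3)
```

Differences between the two rows are what the chi-square test weighs against the variation expected by chance.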

Solution

jurors <- c(205, 26, 25, 19)
p0 <- c(0.72, 0.07, 0.12, 0.09)
# We have to specify the p argument because, under the null, we do not have
# the default case of uniform probabilities for each category
chisq.test(x = jurors, p = p0)

## 
##  Chi-squared test for given probabilities
## 
## data:  jurors
## X-squared = 5.8896, df = 3, p-value = 0.1171
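The object returned by `chisq.test()` also holds the expected counts and the Pearson residuals, which show which categories contribute most to the statistic. This is a minimal sketch using the same counts and proportions; the category names attached to the vectors and the variable `result` are added here for readability, and the threshold of 5 is the usual rule of thumb rather than something stated in the exercise.

```r
# Same data as in the solution, with category names added for readable output
jurors <- c(white = 205, coloured = 26, asian = 25, other = 19)
p0     <- c(white = 0.72, coloured = 0.07, asian = 0.12, other = 0.09)

result <- chisq.test(x = jurors, p = p0)

# Expected counts under the null: total number of jurors times p0
result$expected

# Rule-of-thumb check that every expected count is at least 5
all(result$expected >= 5)

# Pearson residuals: (observed - expected) / sqrt(expected)
round(result$residuals, 2)
```

A positive residual means the category is over-represented among the jurors relative to the null, and a negative one means it is under-represented.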

Do we reject the null hypothesis or do we fail to reject it?