Two exercises:

1: Uniform probabilities

Here we investigate a case where under a null hypothesis we would expect data to be uniformly distributed among a number of available categories.

Do certain star signs favour success in life?

(we suspect not!)

In a survey of 246 chief executives of Fortune 500 companies, the numbers with each of the 12 star signs were found to be:

  • Aries: 23
  • Taurus: 20
  • Gemini: 18
  • Cancer: 23
  • Leo: 20
  • Virgo: 19
  • Libra: 18
  • Scorpio: 21
  • Sagittarius: 19
  • Capricorn: 22
  • Aquarius: 24
  • Pisces: 19

(Expected count per sign if all signs are equally likely: 246/12 = 20.5)

stars <- c(23, 20, 18, 23, 20, 19, 18, 21, 19, 22, 24, 19)

Null hypothesis

A suitable null hypothesis here is (delete as applicable): H0: Star sign plays a role/no role in success in life, hence the probability that a chief executive has any particular sign is (insert a value).

The Chi-Square test

We can use a chi-square test to tell us how likely it is that we would obtain counts deviating at least this much from the expected counts if the null hypothesis were true.

chisq.test(stars)
## 
##  Chi-squared test for given probabilities
## 
## data:  stars
## X-squared = 2.2927, df = 11, p-value = 0.9972
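
To see where these numbers come from, the statistic can be rebuilt by hand. The following is a minimal sketch, assuming the usual Pearson form (the sum over categories of (observed - expected)^2 / expected) with equal expected counts under the null; the names `k`, `expected`, `x2`, `df` and `p_value` are introduced here for illustration only.

```r
# Observed counts, as defined above
stars <- c(23, 20, 18, 23, 20, 19, 18, 21, 19, 22, 24, 19)

k <- length(stars)                  # number of categories (12 star signs)
expected <- rep(sum(stars) / k, k)  # equal expected count for every sign under H0

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
x2 <- sum((stars - expected)^2 / expected)

# Degrees of freedom: number of categories minus one
df <- k - 1

# Upper-tail probability of a chi-square(df) variable exceeding x2
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

x2       # should agree with X-squared in the chisq.test output
df
p_value  # should agree with the p-value in the chisq.test output
```

Working through the calculation this way makes it clear that the p-value is the probability, under the null hypothesis, of a statistic at least as large as the one observed.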

Do we reject the null hypothesis or do we fail to reject it?

2: Non-uniform probabilities

Now we investigate a case where, under the null hypothesis, we would expect the data to be distributed among the available categories not uniformly, but according to some other known distribution.

Are selected jurors representative of the population from which they are drawn?

This exercise is adapted from: Diez, D. M., Cetinkaya-Rundel, M. and Barr, C. D. (2019) OpenIntro Statistics. 4th edn. Available at: https://www.openintro.org/stat/textbook.php?stat_book=os.

In a certain US locality, the numbers of jurors of different ethnicities are counted. The numbers are found to be:

  • white: 205
  • coloured: 26
  • asian: 25
  • other: 19

The question is: Do these numbers fairly reflect the proportions of the population from each of these ethnicities?

Null hypothesis

A suitable null hypothesis would be…

The proportions of each ethnicity in the population as a whole are:

  • white: 0.72
  • coloured: 0.07
  • asian: 0.12
  • other: 0.09
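
Before running a test, it can help to put the juror counts and the proportions above on the same scale. The sketch below is one way to do that, assuming the categories are kept in the order listed; `observed_prop` and `expected_counts` are illustrative names, not part of the original exercise.

```r
# Juror counts and the proportions listed above, in the same category order
jurors <- c(205, 26, 25, 19)
p0 <- c(0.72, 0.07, 0.12, 0.09)

n <- sum(jurors)  # total number of jurors

# Share of jurors in each category, for comparison with p0
observed_prop <- jurors / n

# Counts we would expect in each category if jurors followed p0 exactly
expected_counts <- n * p0

round(observed_prop, 3)
expected_counts
```

The chi-square test compares `jurors` with `expected_counts`, so large gaps between the two are what drive the statistic upwards.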

Solution

jurors <- c(205, 26, 25, 19)
p0 <- c(0.72, 0.07, 0.12, 0.09)
chisq.test(x = jurors, p = p0) # we have to specify the p argument because, under this null, the categories do not have the default uniform probabilities
## 
##  Chi-squared test for given probabilities
## 
## data:  jurors
## X-squared = 5.8896, df = 3, p-value = 0.1171
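
The object returned by `chisq.test` holds more than the printed summary. As a hedged sketch, the code below stores the result in a variable (here called `res`, a name chosen for illustration) and looks at the expected counts and Pearson residuals, which show which categories contribute most to the statistic.

```r
jurors <- c(205, 26, 25, 19)
p0 <- c(0.72, 0.07, 0.12, 0.09)

# Store the test result instead of only printing it
res <- chisq.test(x = jurors, p = p0)

res$expected   # expected counts under H0, i.e. sum(jurors) * p0
res$residuals  # Pearson residuals: (observed - expected) / sqrt(expected)

# The squared residuals add up to the chi-square statistic
sum(res$residuals^2)
res$statistic

# A common rule of thumb asks for expected counts of at least 5 in each cell
all(res$expected >= 5)
```

Looking at the residuals alongside the p-value gives a sense of where any discrepancy lies, not just whether there is one.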

Do we reject the null hypothesis or do we fail to reject it?