May 27, 2016

Abstract

A study was conducted using Leigh High School as the population of interest to determine the following:

  • Which pet (dog or cat) do Leigh High School students prefer?
  • Is there a significant difference in the proportion of male and female students who prefer dogs?
  • Among students who selected dogs as their preferred pet, which size of dog is preferred?
  • Is there a significant difference in the proportion of male and female students who prefer large dogs?

Two experiments were conducted which collected data from Leigh High School students as follows:

  • Preference for dogs
  • Preference for large dogs

Each experiment is explained in turn in the sections below.

Hypothesis - Experiment 1 - Preference for Dogs

Null Hypothesis

The proportion of male students, \(p^{dog}_{male}\), who prefer dogs is the same as the proportion of female students, \(p^{dog}_{female}\), who prefer dogs:

\[H_{0}: p^{dog}_{male} = p^{dog}_{female}\]

Alternative Hypothesis

The proportion of male students, \(p^{dog}_{male}\), who prefer dogs is not the same as the proportion of female students, \(p^{dog}_{female}\), who prefer dogs:

\[H_{a}: p^{dog}_{male} \neq p^{dog}_{female}\]

Hypothesis - Experiment 2 - Preference for Large Dogs

Null Hypothesis

The proportion of male students that prefer large dogs, \(p^{large}_{male}\), is the same as the proportion of female students that prefer large dogs, \(p^{large}_{female}\):

\[H_{0}: p^{large}_{male} = p^{large}_{female}\]

Alternative Hypothesis

The proportion of male students that prefer large dogs, \(p^{large}_{male}\), is not the same as the proportion of female students that prefer large dogs, \(p^{large}_{female}\):

\[H_{a}: p^{large}_{male} \neq p^{large}_{female}\]

Description of Study

  • Four English classrooms were selected at random for the survey: Duffy, Gill, Leah-Martin, and Moseley.
  • For each classroom visited, a tally was recorded, separately for males and females, of those:
    • who preferred dogs versus cats.
    • who preferred large dogs versus small dogs.
  • Data was collected by a show of hands for each option.
  • The tallies were then transcribed into a CSV file to make them machine readable.
  • The data was loaded into the R statistical environment so that the computations are repeatable.
  • R Markdown was used to render a PDF, which was then converted to PowerPoint.

Improvements for the Study

  • Rather than collecting a tally for each combination in each class, have each student write down their preferences individually, which would give more granular data.
  • This could also avoid miscounting raised hands.

Experiment #1 - Collected Data

The following data was collected for dog preference:

teacher dog cat gender
moseley 12 0 male
moseley 9 2 female
duffy 17 3 male
duffy 7 1 female
gill 11 3 male
gill 13 1 female
leah-martin 9 0 male
leah-martin 8 6 female
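
The summary code in the following sections refers to a data frame named `pref`, whose loading step is not shown. A minimal sketch of one way to build it, assuming the CSV uses the column names from the table above (the inline text stands in for the actual file, whose name is not given):

```r
# Assumption: the CSV columns match the table above; the inline text
# stands in for the real file, whose name is not given in this report.
pref_csv <- "teacher,dog,cat,gender
moseley,12,0,male
moseley,9,2,female
duffy,17,3,male
duffy,7,1,female
gill,11,3,male
gill,13,1,female
leah-martin,9,0,male
leah-martin,8,6,female
"
pref <- read.csv(text = pref_csv, stringsAsFactors = FALSE)

# Sanity check: total respondents and total dog preferences
total_students <- sum(pref$dog) + sum(pref$cat)
total_dogs <- sum(pref$dog)
```

Reading from a real file would only change the `read.csv` call to take a path instead of `text =`.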

Experiment #1 - Summarize Data

Summarize the data on male and female students' preference for dogs.

pref_male <- pref[pref$gender=='male',]
pref_female <- pref[pref$gender=='female',]
pref_num_males <- sum(pref_male$dog) +
                  sum(pref_male$cat)
pref_num_females <- sum(pref_female$dog) +
                    sum(pref_female$cat)
dogs_num_male <- sum(pref_male$dog)
dogs_num_female <- sum(pref_female$dog)
cats_num_male <- sum(pref_male$cat)
cats_num_female <- sum(pref_female$cat)

\(\eta_{male}=\) 55

\(\eta_{female}=\) 47

\(X^{dog}_{male}=\) 49

\(X^{dog}_{female}=\) 37

Experiment #1 - Data By Gender

Male

teacher dog cat gender
1 moseley 12 0 male
3 duffy 17 3 male
5 gill 11 3 male
7 leah-martin 9 0 male

Female

teacher dog cat gender
2 moseley 9 2 female
4 duffy 7 1 female
6 gill 13 1 female
8 leah-martin 8 6 female

Experiment #1 - Bar Plots

Experiment #1 - Proportions

Compute our proportions based on the collected data.

phat_dog_male = dogs_num_male / pref_num_males
phat_dog_female = dogs_num_female / pref_num_females

\(\hat{p}_{male}=\) 0.891

\(\hat{p}_{female}=\) 0.787

Experiment #1 - Significance Test

Pooled Sample Proportion

\[\hat{p_{C}}=\frac{X_{1} + X_{2}}{\eta_{1}+\eta_{2}}\]

p_dog_diff = phat_dog_male - phat_dog_female

\((\hat{p}_{male} - \hat{p}_{female})=\) 0.104

p_dog_pooled = (dogs_num_male + dogs_num_female)/
               (pref_num_males + pref_num_females)

\(\hat{p_{C}}=\) 0.843
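
As a check, substituting the counts from the data table (49 of 55 males and 37 of 47 females preferred dogs) into the pooled formula gives:

\[\hat{p_{C}}=\frac{49 + 37}{55+47}=\frac{86}{102}\approx 0.843\]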

Experiment #1 - Significance Test - Continued

  • Two-sample z test for the difference between two proportions

\[z=\frac{(\hat{p}_{male} - \hat{p}_{female}) - 0}{\sqrt{\frac{\hat{p_{C}}(1-\hat{p_{C}})}{\eta_{male}}+\frac{\hat{p_{C}}(1-\hat{p_{C}})}{\eta_{female}}}}\]

# denominators are the group sample sizes (eta), not the dog counts
z_dog = (p_dog_diff - 0)/
  (sqrt((p_dog_pooled*(1-p_dog_pooled))/pref_num_males +
        (p_dog_pooled*(1-p_dog_pooled))/pref_num_females))

\(z_{dog}=\) 1.435

Experiment #1 - Significance Test - Continued

  • \(\mbox{p-value}_{dog}\) is computed by: \(\mbox{p-value}=2(1 - P(Z \le z))\)
p_dog = pnorm(z_dog)

\(P_{dog}=\) 0.924

  • \(2(1 - P)\)
dog_p_value = 2 * (1 - p_dog)
  • \(\mbox{p-value}_{dog}=\) 0.151

Because our \(\mbox{p-value}_{dog}\) is greater than \(\alpha=0.05\), we fail to reject \(H_{0}\).
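
The steps of the pooled two-sample z test can be gathered into a single function so the same computation is reused for both experiments. This is a sketch, not the code used to produce the slides: `two_prop_z` and its argument names are hypothetical, with `x1`, `x2` the success counts and `n1`, `n2` the group sample sizes.

```r
# Hypothetical helper (not part of the original analysis): pooled
# two-sample z test for a difference between two proportions.
two_prop_z <- function(x1, n1, x2, n2) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  p_pooled <- (x1 + x2) / (n1 + n2)
  # Standard error under H0 uses the pooled proportion and the group sizes
  se <- sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
  z <- (p1 - p2) / se
  # Two-sided p-value; abs() keeps it valid when z is negative
  p_value <- 2 * (1 - pnorm(abs(z)))
  list(diff = p1 - p2, pooled = p_pooled, z = z, p_value = p_value)
}

# Experiment 1 counts from the data table: 49 of 55 males, 37 of 47 females
res_dog <- two_prop_z(49, 55, 37, 47)
```

Taking `abs(z)` before calling `pnorm` keeps the two-sided p-value correct even when the second group has the larger proportion; `2 * (1 - pnorm(z))` alone would exceed 1 in that case.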

Experiment #2 - Collected Data

The following data was collected for large dog preference:

teacher large small gender
moseley 11 1 male
moseley 8 1 female
duffy 15 0 male
duffy 5 2 female
gill 9 2 male
gill 8 5 female
leah-martin 8 1 male
leah-martin 8 6 female

Experiment #2 - Summarize Data

Summarize the data on male and female students' preference for large dogs.

size_male <- size[size$gender=='male',]
size_female <- size[size$gender=='female',]
size_num_males <- sum(size_male$large) + sum(size_male$small)
size_num_females <- sum(size_female$large) + sum(size_female$small)
size_male_num_large <- sum(size_male$large)
size_female_num_large <- sum(size_female$large)
size_male_num_small <- sum(size_male$small)
size_female_num_small <- sum(size_female$small)

\(\eta^{size}_{male}=\) 47

\(\eta^{size}_{female}=\) 43
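
The per-gender totals can also be obtained in one step with base R's `aggregate`. A sketch, assuming `size` holds the Experiment 2 table shown earlier (it is rebuilt inline here so the block runs on its own; `by_gender` is a name introduced for illustration):

```r
# Experiment 2 tallies, copied from the data table
size <- data.frame(
  teacher = rep(c("moseley", "duffy", "gill", "leah-martin"), each = 2),
  large   = c(11, 8, 15, 5, 9, 8, 8, 8),
  small   = c(1, 1, 0, 2, 2, 5, 1, 6),
  gender  = rep(c("male", "female"), times = 4),
  stringsAsFactors = FALSE
)

# Sum the large and small tallies within each gender
by_gender <- aggregate(cbind(large, small) ~ gender, data = size, FUN = sum)

# Group sample size and sample proportion preferring large dogs
by_gender$n <- by_gender$large + by_gender$small
by_gender$phat_large <- by_gender$large / by_gender$n
```

This gives one row per gender, which makes it easy to confirm the sample sizes before they are used in the test statistic.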

Experiment #2 - Data By Gender

Male

teacher large small gender
1 moseley 11 1 male
3 duffy 15 0 male
5 gill 9 2 male
7 leah-martin 8 1 male

Female

teacher large small gender
2 moseley 8 1 female
4 duffy 5 2 female
6 gill 8 5 female
8 leah-martin 8 6 female

Experiment #2 - Bar Plots

Experiment #2 - Proportions

Compute our proportions based on the collected data.

phat_large_male = size_male_num_large / size_num_males
phat_large_female = size_female_num_large / size_num_females

\(\hat{p}^{large}_{male}=\) 0.915

\(\hat{p}^{large}_{female}=\) 0.674

Experiment #2 - Significance Test

Pooled Sample Proportion

\[\hat{p_{C}}=\frac{X_{1} + X_{2}}{\eta_{1}+\eta_{2}}\]

p_large_diff = phat_large_male - phat_large_female

\((\hat{p}^{large}_{male} - \hat{p}^{large}_{female})=\) 0.24

p_large_pooled =
  (size_male_num_large + size_female_num_large)/
  (size_num_males + size_num_females)

\(\hat{p_{C}}=\) 0.8

Experiment #2 - Significance Test - Continued

  • Two-sample z test for the difference between two proportions

\[z=\frac{(\hat{p}_{male} - \hat{p}_{female}) - 0}{\sqrt{\frac{\hat{p_{C}}(1-\hat{p_{C}})}{\eta_{male}}+\frac{\hat{p_{C}}(1-\hat{p_{C}})}{\eta_{female}}}}\]

z_large = (p_large_diff - 0)/
  (
    sqrt(
    p_large_pooled*(1-p_large_pooled)/size_num_males +
    p_large_pooled*(1-p_large_pooled)/size_num_females
    )
  )

\(z_{large}=\) 2.849
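
Substituting the Experiment 2 values (\(\hat{p}^{large}_{male}=43/47\), \(\hat{p}^{large}_{female}=29/43\), \(\hat{p_{C}}=0.8\), \(\eta_{male}=47\), \(\eta_{female}=43\)) into the formula reproduces this statistic:

\[z_{large}=\frac{0.9149 - 0.6744}{\sqrt{\frac{0.8(0.2)}{47}+\frac{0.8(0.2)}{43}}}=\frac{0.2405}{\sqrt{0.003404+0.003721}}\approx\frac{0.2405}{0.08441}\approx 2.849\]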

Experiment #2 - Significance Test - Continued

  • \(\mbox{p-value}_{large}\) is computed by: \(\mbox{p-value}=2(1 - P(Z \le z))\)
p_large = pnorm(z_large)

\(P_{large}=\) 0.998

  • \(2(1 - P)\)
p_value_large = 2 * (1 - p_large)
  • \(\mbox{p-value}_{large}=\) 0.004

Because our \(\mbox{p-value}_{large}\) is smaller than \(\alpha=0.05\), we reject \(H_{0}\).
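
As an independent check, base R's `prop.test` with the continuity correction turned off performs the equivalent chi-squared test; for two groups its statistic is the square of the pooled z statistic, and its two-sided p-value should match the hand computation. A sketch using the Experiment 2 counts (43 of 47 males and 29 of 43 females preferred large dogs); `check_large`, `z_from_chisq`, and `p_from_chisq` are names introduced here:

```r
# Counts from the Experiment 2 table: large-dog preferences and group sizes
check_large <- prop.test(x = c(43, 29), n = c(47, 43), correct = FALSE)

# For a two-group comparison, X-squared equals z^2 of the pooled z test
z_from_chisq <- sqrt(unname(check_large$statistic))
p_from_chisq <- check_large$p.value
```

By default `prop.test` applies Yates' continuity correction, which would give a somewhat larger p-value than the hand computation, so `correct = FALSE` is needed for a like-for-like comparison.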

Conclusion

This section will address the originally posed questions in the abstract.

  • Which pet (dog or cat) do Leigh High School students prefer?

The collected data indicates a strong preference for dogs over cats: of the 102 students sampled, 86 (84%) preferred dogs.

  • Is there a significant difference in the proportion of male and female students who prefer dogs?

In the collected data, the difference between the male and female proportions who preferred dogs was 0.104, which is small relative to the overall preference proportion of 0.843 (12.3% of it). The two-sample z test does not give sufficient evidence to reject the null hypothesis that the male and female proportions are the same: \(\mbox{p-value}_{dog}\) is greater than \(\alpha=0.05\). Failing to reject \(H_{0}\) does not prove that the proportions are equal; it means the observed difference is consistent with sampling variation under a normal approximation.

Conclusion - Continued

  • Among students who selected dogs as their preferred pet, which size of dog is preferred?

Of the 90 students who answered the dog-size question, 72 (80%) preferred a large dog over a small one.

  • Is there a significant difference in the proportion of male and female students who prefer large dogs?

The difference in proportions between male and female students who preferred large dogs was 0.24, or 30.1% of the overall proportion of students who preferred large dogs (0.8). The two-sample z test found this difference statistically significant (\(\mbox{p-value}_{large}=\) 0.00439 is much less than \(\alpha=0.05\)), so we reject the null hypothesis in favor of the alternative hypothesis that the proportions of male and female students who prefer large dogs are different.

Appendix I - Raw Data