Contingency Tables and Two Sample Tests of Proportions
2025-10-27
Today’s plan 📋
Comments and Questions about Previous Lecture from Engagement Questions
Upcoming Dates
A few minutes for R Questions 🪄
Review Question - Two-sided test
Review of One Sample Proportion Hypothesis Tests
Contingency Tables
Tests of Two proportions
Format of Hypothesis Tests
HW 7 is now posted and is due 11/5 (Grace period ends 11/6).
Test 2 is on November 11th and will include material up through Lecture 20 (HW 7).
Lecture 21 - Intro to Portfolio Management will be on the Final Exam, not on Test 2.
In this course we will use R and RStudio to understand statistical concepts.
You will access R and RStudio through Posit Cloud.
I will post R/RStudio files on Posit Cloud that you can access in provided links.
I will also provide demo videos that show how to access files and complete exercises.
NOTE: The free Posit Cloud account is limited to 25 hours per month.
For those who want to go further with R/RStudio:
If you are interested in downloading R and RStudio to your own computer, I can guide you through the process.
The software is completely free but it does have to be updated a couple times each year.
Poll Everywhere - My User Name: penelopepoolereisenbies685
Question of Interest polled by YouGov:
987 adults in the US were asked:
Would you like to see the changing of the clocks eliminated, so people no longer change their clocks twice per year?
YouGov Polled 987 US adults
612 said YES, we should eliminate the practice of changing our clocks.
375 said NO or they were unsure. We group these two categories together.
If we test these data, what are the null and alternative hypotheses?
Specify alpha (\(\alpha\)) as 0.05 unless we have a specific reason to choose a different alpha.
1-sample proportions test without continuity correction
data: 612 out of 987, null probability 0.5
X-squared = 56.909, df = 1, p-value = 0.00000000000004565
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.5893699 0.6498207
sample estimates:
p
0.6200608
Hypotheses being tested:
\(H_{0}: P_{YES} = 0.5\)
\(H_{A}: P_{YES} \neq 0.5\)
P-value from hypothesis test: < 0.0001
Conclusion: P-value is much less than 0.05 so we REJECT \(H_{0}\).
Interpretation: See Polling Question on next slide
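The output above can be reproduced with a few lines of R. This is a minimal sketch that assumes the test was run with `prop.test()` from base R and the continuity correction turned off, which is what the output header suggests. The variable names (`yes`, `n`, `alpha`, `result`, `decision`) are illustrative and are not taken from the course files.

```r
# Counts from the YouGov poll quoted above
yes <- 612   # respondents who said YES
n   <- 987   # total respondents polled

# Two-sided one-sample test of H0: P_YES = 0.5, without continuity correction
alpha  <- 0.05
result <- prop.test(x = yes, n = n, p = 0.5,
                    alternative = "two.sided", correct = FALSE)
print(result)

# Compare the p-value to alpha to reach a decision
decision <- if (result$p.value < alpha) "Reject H0" else "Fail to reject H0"
decision
```

The `result` object also stores the pieces of the printout separately, for example `result$estimate` for the sample proportion and `result$conf.int` for the confidence interval.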
Given our stated hypotheses and our p-value < 0.0001
\(H_{0}: P_{YES} = 0.5\)
\(H_{A}: P_{YES} \neq 0.5\)
How do we interpret the outcome of this hypothesis test?
Question: Are these disparities in opinions about daylight savings consistent among age groups?
We can examine this question using tables, plots, and hypothesis tests.
A Contingency Table is 2 x 2 or larger and allows us to subdivide count data by categories
Commonly used in market research to understand opinions by category
Example: How do Gen Z (18-29) and Millennial (30-44) adults feel about daylight savings?
| | Yes | No/Not Sure |
|---|---|---|
| Ages 18-29 | 99 | 98 |
| Ages 30-44 | 129 | 105 |
Contingency tables and bar plots are two effective ways to examine these data
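One way to build this table and a matching bar plot in base R is sketched below. The object name `opinions` and the axis labels are illustrative choices, and the plot style used in the lecture files may differ.

```r
# Counts from the table above: rows are age groups, columns are responses
opinions <- matrix(c(99,  98,
                     129, 105),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Age = c("18-29", "30-44"),
                                   Response = c("Yes", "No/Not Sure")))
opinions

# Add row and column totals for a quick overview
addmargins(as.table(opinions))

# Side-by-side bar plot: one cluster of bars per age group
barplot(t(opinions), beside = TRUE, legend.text = TRUE,
        xlab = "Age group", ylab = "Number of respondents")
```

The table is transposed with `t()` before plotting so that each age group forms one cluster and the two responses appear as the bars within it.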
Hypotheses being tested (usually a two-sided test):
\(H_{0}: P_{18-29} = P_{30-44}\)
\(H_{A}: P_{18-29} \neq P_{30-44}\)
2-sample test for equality of proportions without continuity correction
data: x out of n
X-squared = 1.0199, df = 1, p-value = 0.3125
alternative hypothesis: two.sided
95 percent confidence interval:
-0.14327319 0.04578523
sample estimates:
prop 1 prop 2
0.5025381 0.5512821
Hypotheses being tested:
\(H_{0}: P_{18-29} = P_{30-44}\)
\(H_{A}: P_{18-29} \neq P_{30-44}\)
Questions we will answer:
What is the p-value from this test?
Do we Reject or Fail to Reject the Null Hypothesis?
What do we conclude about the opinions of these two age groups?
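The questions above can be worked through in R with a sketch like the following. It assumes the two-sample output came from `prop.test()` with `correct = FALSE`, and it takes each group size as the sum of the Yes and No/Not Sure counts in the Gen Z and Millennial table. The names `yes_counts`, `group_sizes`, and `test_young` are illustrative.

```r
# "Yes" counts and group sizes (Yes + No/Not Sure) for ages 18-29 and 30-44
yes_counts  <- c(99, 129)
group_sizes <- c(99 + 98, 129 + 105)

# Two-sided test of H0: the two proportions are equal, no continuity correction
alpha <- 0.05
test_young <- prop.test(x = yes_counts, n = group_sizes,
                        alternative = "two.sided", correct = FALSE)

test_young$p.value           # p-value of the test
test_young$p.value < alpha   # TRUE means reject H0; FALSE means fail to reject
test_young$conf.int          # 95% CI for the difference (prop 1 - prop 2)
```

Printing `test_young` itself should reproduce the full output block shown earlier for these two age groups.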
Question 3:
Question 4:
Question 5:
Question 6:
Original Data
| | Yes | No/Not Sure |
|---|---|---|
| Ages 18-44 | 228 | 205 |
| Ages 45-64 | 201 | 118 |
Row Percentages: Percentages of each age group that said ‘Yes’ or ‘No’.
| | Yes | No/Not Sure |
|---|---|---|
| Ages 18-44 | 52.66 | 47.34 |
| Ages 45-64 | 63.01 | 36.99 |
Column percentages: Percentages of each opinion (Yes or No/Not Sure) that came from each age group.
| | Yes | No/Not Sure |
|---|---|---|
| Ages 18-44 | 53.15 | 63.47 |
| Ages 45-64 | 46.85 | 36.53 |
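Row and column percentages like these can be computed with `prop.table()` in base R. The sketch below is one possible approach; the object names `age_opinions`, `row_pct`, and `col_pct` are illustrative.

```r
# Original counts: rows are age groups, columns are responses
age_opinions <- matrix(c(228, 205,
                         201, 118),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(Age = c("18-44", "45-64"),
                                       Response = c("Yes", "No/Not Sure")))

# Row percentages: margin = 1 divides by row totals, so each row sums to 100
row_pct <- round(100 * prop.table(age_opinions, margin = 1), 2)

# Column percentages: margin = 2 divides by column totals, so each column sums to 100
col_pct <- round(100 * prop.table(age_opinions, margin = 2), 2)

row_pct
col_pct
```

Row percentages answer "within an age group, how were opinions split?", while column percentages answer "among people with a given opinion, how were the age groups split?".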
Hypotheses being tested:
\(H_{0}: P_{18-44} = P_{45-64}\)
\(H_{A}: P_{18-44} \neq P_{45-64}\)
2-sample test for equality of proportions without continuity correction
data: x out of n
X-squared = 8.0355, df = 1, p-value = 0.004587
alternative hypothesis: two.sided
95 percent confidence interval:
-0.17437592 -0.03269438
sample estimates:
prop 1 prop 2
0.5265589 0.6300940
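To see where the X-squared value comes from, the test statistic can be rebuilt by hand. The sketch below uses the pooled-proportion z statistic for two proportions, whose square equals the chi-squared statistic when no continuity correction is applied. The variable names are illustrative, and the group sizes are the row totals of the original data table.

```r
# Counts from the 18-44 vs. 45-64 table above
x <- c(228, 201)               # "Yes" responses in each age group
n <- c(228 + 205, 201 + 118)   # group sizes (Yes + No/Not Sure)

p_hat  <- x / n                # sample proportions
p_pool <- sum(x) / sum(n)      # pooled proportion, assuming H0 is true

# Standard error of the difference under H0, then the z statistic
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z       <- (p_hat[1] - p_hat[2]) / se_pool

chi_sq  <- z^2                                    # compare with X-squared above
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value
c(chi_sq = chi_sq, p_value = p_value)

# Cross-check against R's built-in test
prop.test(x, n, correct = FALSE)
```

The negative sign of `z` reflects that the first group (ages 18-44) has the smaller sample proportion, which is also why both ends of the confidence interval are negative.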
Question 7:
The protocol for conducting and interpreting hypothesis tests is the same, regardless of how they are specified.
This is true for quantitative data and for categorical proportion data.
For two sample tests of proportions, it is helpful to examine the data using contingency tables.
By default, it is common for two sample tests of proportions to be conducted as two-sided tests.
These same methods can be used with larger contingency tables that are iteratively analyzed.
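As a rough illustration of analyzing a larger table iteratively, the sketch below treats the three age groups from this lecture as a 3 x 2 table, runs one overall test, and then tests each pair of groups in turn. Reading "iteratively" as "pair by pair" is an assumption, as are the variable names; the approach used in the course files may differ.

```r
# Three age groups from the tables above: "Yes" counts and group sizes
groups <- c("18-29", "30-44", "45-64")
x <- c(99, 129, 201)
n <- c(197, 234, 319)

# Overall test of H0: all three proportions are equal
overall <- prop.test(x, n, correct = FALSE)
overall$p.value

# Then test each pair of groups in turn
pairs <- combn(length(groups), 2)   # each column holds one pair of indices
pairwise_p <- apply(pairs, 2, function(idx) {
  prop.test(x[idx], n[idx], correct = FALSE)$p.value
})
names(pairwise_p) <- apply(pairs, 2, function(idx) {
  paste(groups[idx], collapse = " vs ")
})
pairwise_p
```

Running several tests on the same data raises the chance of a false positive, so in practice the pairwise p-values are often adjusted, for example with `p.adjust()` from base R.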
To submit an Engagement Question or Comment about material from Lecture 19: Submit it by midnight today (day of lecture).