(to temper expectations)
Pearson’s correlation measures how strongly two vectors are linearly related, on a scale from \(r = -1\) to \(r = 1\).
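As a minimal sketch of the definition, the correlation can be computed directly from deviations about the means. The function name `pearson_r` and the data below are made up for illustration; they are not from the course materials.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and the two sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant sequence has zero variance, so r is undefined there
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]          # made-up data
scores = [52, 55, 61, 64, 70]    # made-up data
print(round(pearson_r(hours, scores), 3))
```

A perfectly increasing linear relationship gives \(r = 1\), and a perfectly decreasing one gives \(r = -1\).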
“In statistical hypothesis testing, the p-value or probability value is the probability of obtaining test results at least as extreme as the results actually observed, assuming that the null hypothesis is correct” (Wikipedia). Many scientists are taught to seek p-values < 0.05, and many research journals seek them as well.
(image credit: “It’s the Effect Size, Stupid!”)
We are going to simulate a comparison of garbanzo beans and chickpeas.
Suppose we compare the two samples. What happens if we repeatedly use those two samples?
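One way to read the repeated comparison is as a simulation: draw two samples from the same population many times and count how often the difference looks "significant". The sketch below makes several assumptions that are not in the original: a normal population with mean 10 and standard deviation 2, samples of size 100, and a normal approximation to the two-sample test in place of an exact t-test.

```python
import math
import random
from statistics import NormalDist, mean, variance

def approx_two_sample_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(trials=2000, n=100, seed=32):
    """Fraction of trials with p < 0.05 when both samples share one population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both "foods" come from the same hypothetical distribution
        garbanzo = [rng.gauss(10, 2) for _ in range(n)]
        chickpea = [rng.gauss(10, 2) for _ in range(n)]
        if approx_two_sample_p(garbanzo, chickpea) < 0.05:
            hits += 1
    return hits / trials

print(false_positive_rate())
```

Because the two populations are identical by construction, every p-value below 0.05 in this simulation is a false positive.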
In 1977, Jacob Cohen suggested the following measure (which does not depend on sample size) to compare means:
\[d = \frac{|\mu_{1} - \mu_{2}|}{s_{p}}\]
where \(s_{p}\) is the pooled standard deviation.
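A minimal sketch of this formula follows. It assumes the usual definition of the pooled standard deviation (sample variances weighted by their degrees of freedom), which the text does not spell out; the function name `cohens_d` and the data are illustrative.

```python
import math
from statistics import mean, variance

def cohens_d(a, b):
    """|mean(a) - mean(b)| divided by the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    # Pooled variance: sample variances weighted by degrees of freedom
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return abs(mean(a) - mean(b)) / math.sqrt(pooled_var)

group_a = [2, 4, 6, 8]   # made-up data
group_b = [1, 3, 5, 7]   # made-up data
print(round(cohens_d(group_a, group_b), 4))
```

Repeating every observation in both groups many times leaves the means unchanged and the pooled standard deviation nearly unchanged, so the measure barely moves as the sample grows.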
The p-value from a t-test will be less than 0.05, but it is still concerning that the p-value tends to decrease as sample size increases.
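The dependence on sample size can be made concrete with a little arithmetic. For two equal groups of size \(n\), the test statistic is \(t = d\sqrt{n/2}\), so a fixed effect size produces an ever larger statistic as \(n\) grows. The sketch below assumes equal group sizes and uses a normal approximation to the t distribution (reasonable for large \(n\)); the effect size 0.1 is an arbitrary illustrative value.

```python
import math
from statistics import NormalDist

def approx_p_value(d, n_per_group):
    """Approximate two-sided p-value for effect size d with equal group sizes."""
    # With equal groups, t = (mean difference) / (s_p * sqrt(2 / n)) = d * sqrt(n / 2)
    t = d * math.sqrt(n_per_group / 2)
    # Normal approximation to the t distribution (an assumption for large n)
    return 2 * (1 - NormalDist().cdf(abs(t)))

for n in (100, 1_000, 10_000):
    print(n, approx_p_value(0.1, n))
```

The effect size is the same in every row; only the sample size changes, yet the p-value moves from well above 0.05 to far below it.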
In 1981, Hedges suggested the following updated measure to compare means:
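\[g = \frac{|\mu_{1} - \mu_{2}|}{s_{p}} \cdot \frac{N - 3}{N - 2.25} \sqrt{\frac{N-2}{N}} \]
where \(s_{p}\) is the pooled standard deviation and \(N\) is the total sample size. The result is usually called Hedges’ \(g\), to distinguish it from Cohen’s \(d\). This correction is said to be better when handling sample sizes < 20.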
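The size of the correction is easy to check numerically. This sketch applies the two correction factors to an uncorrected effect size, assuming \(N\) denotes the total sample size across both groups; the function name `hedges_correction` and the input values are illustrative.

```python
import math

def hedges_correction(d, n_total):
    """Apply the factor (N - 3)/(N - 2.25) * sqrt((N - 2)/N) to an effect size d."""
    if n_total <= 3:
        raise ValueError("the correction needs a total sample size above 3")
    shrink = (n_total - 3) / (n_total - 2.25)
    scale = math.sqrt((n_total - 2) / n_total)
    return d * shrink * scale

# Same uncorrected effect size, different total sample sizes
for n_total in (8, 20, 200):
    print(n_total, round(hedges_correction(0.5, n_total), 4))
```

The corrected value is always smaller than the uncorrected one, and the gap closes as \(N\) grows, which is consistent with the correction mattering mostly for small samples.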
In Spring 2020,
Goal: detect gender bias in final semester grades (if any)
Idea: Use the R gender package to automate gender classification.
Setup:
Students in Math 32 did a homework assignment where one task was to perform a hypothesis test on the results of the survey question, “On a scale from 0 = Democrat to 100 = Republican, where are your political leanings?”
Instructor’s solution: “Since the p-value < 0.05, we reject the claim of an unbiased student population at the alpha = 0.05 significance level.”
Task: measure how “similar” each student’s response was to the instructor’s solution, and then assign grades.
Let \(x\) and \(y\) be two “sentence vectors” in a “sentence vector space”. The cosine similarity is
\[\cos(x,y) = \frac{x \cdot y}{\|x\| \, \|y\|}\]
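To make the quantity \(\cos(x,y)\) concrete, here is a minimal sketch that turns sentences into word-count vectors and compares them. The word-count representation is an assumption chosen for simplicity, not necessarily how the sentence vectors in this example were built, and the student response is hypothetical.

```python
import math
import re
from collections import Counter

def cosine(x, y):
    """cos(x, y) = (x . y) / (||x|| ||y||) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)

def tokens(sentence):
    """Lowercase alphanumeric tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def word_counts(sentence, vocabulary):
    """Word-count vector of a sentence over a fixed vocabulary."""
    counts = Counter(tokens(sentence))
    return [counts[word] for word in vocabulary]

solution = "Since the p-value < 0.05, we reject the claim of an unbiased student population"
response = "We reject the claim because the p-value is small"  # hypothetical student answer

vocabulary = sorted(set(tokens(solution)) | set(tokens(response)))
x_vec = word_counts(solution, vocabulary)
y_vec = word_counts(response, vocabulary)
print(round(cosine(x_vec, y_vec), 3))
```

Vectors pointing in the same direction score 1 and vectors with no shared words score 0, so a grading rule would need a cutoff somewhere in between.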
Disclaimers:
In this example, the computer algorithm matched my grading on 76% (38/50) of the observations.