We are studying a family of tests in statistics called the t tests. This time, we introduce more than one group of individuals to study, and we learn how to compare groups statistically. This chapter builds on chapters 6, 7, and 8.
Remember when I mentioned that we should be able to recognize which equation we need for the data that we have? The equation we use is always decided by the data!
Independent Samples (Independent Drawn Samples) [PAGE 273]
Dependent Samples: Matched Pairs [PAGE 275]
Dependent Samples: Matched Pairs (paired difference) t test [PAGE 275]
SAMPLE 1 | SAMPLE 2 |
---|---|
Size \(n_1\) | Size \(n_2\) |
Mean \(\bar{x_1}\) | Mean \(\bar{x_2}\) |
Variance \(s_1^2\) | Variance \(s_2^2\) |
\(\bigg\uparrow\) | \(\bigg\uparrow\) |
POP1 | POP 2 |
\(\mu_1\) unknown | \(\mu_2\) unknown |
\(\sigma_1^2\) unknown | \(\sigma_2^2\) unknown |
THESE ARE NOT MATCHED AT ALL!!!!
PLACEBO | MEDICATION |
---|---|
Size \(n_1=6\) | Size \(n_2=5\) |
Mean \(\bar{x_1}\) | Mean \(\bar{x_2}\) |
Variance \(s_1^2\) | Variance \(s_2^2\) |
\(\bigg\uparrow\) | \(\bigg\uparrow\) |
POPULATION 1 | POPULATION 2 |
\(\mu_1\) is unknown | \(\mu_2\) is unknown |
\(\sigma_1^2\) is unknown | \(\sigma_2^2\) is unknown |
STEP ONE: Write out \(H_0\) and \(H_1\)
* WE USE DIRECTIONAL BECAUSE OF PREVIOUS INFORMATION FROM THE MARKETING DEPARTMENT.
* Experimental group saw the commercial
* Control group did not see the commercial
* \(H_0: \mu_{experimental} = \mu_{control}\)
* \(H_1: \mu_{experimental} > \mu_{control}\)
STEP TWO: Determine \(n\), \(\bar{x}\), and \(s^2\) for each group
Experimental Group | Control Group |
---|---|
\(x_{experimental}\) | \(x_{control}\) |
10 | NA |
6 | 8 |
8 | 3 |
7 | 5 |
9 | 6 |
7 | 7 |
STEP THREE: Calculate the Variance (from scratch; these may or may not be provided to you on an exam or homework. Know this skill even if you don't need it for a particular problem.)
* Square the raw scores
Experimental Group | Control Group |
---|---|
\(x_{experimental}^2\) | \(x_{control}^2\) |
100 | NA |
36 | 64 |
64 | 9 |
49 | 25 |
81 | 36 |
49 | 49 |
* For Experimental Variance
\(\Sigma x_1^2 = 379\)
\(s_1^2 = \frac{\Sigma x_1^2 - \frac{(\Sigma x_1)^2}{n_1}}{n_1} = \frac{379 - \frac{(47)^2}{6}}{6} = \frac{379 - \frac{2209}{6}}{6} = \frac{379 - 368.17}{6} = \frac{10.83}{6} = 1.81\)
* For Control Variance
\(\Sigma x_2^2= 183\)
\(s_2^2 = \frac{\Sigma x_2^2 - \frac{(\Sigma x_2)^2}{n_2}}{n_2} = \frac{183 - \frac{(29)^2}{5}}{5} = \frac{183 - \frac{841}{5}}{5} = \frac{183 - 168.2}{5} = \frac{14.8}{5} = 2.96\)
Experimental | Control |
---|---|
\(n_1=6\) | \(n_2=5\) |
\(\bar{x_1}=7.83\) | \(\bar{x_2}=5.80\) |
\(s_1^2 = 1.81\) | \(s_2^2 = 2.96\) |
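The hand calculation above can be double-checked with a short Python sketch. The function name `summarize` is only an illustration (it is not from the textbook), and the variance divides by \(n\) because that is the formula this chapter uses.

```python
def summarize(scores):
    """Return (n, mean, variance) using the chapter's formula, which divides by n."""
    n = len(scores)
    total = sum(scores)                   # sum of x
    sum_sq = sum(x * x for x in scores)   # sum of x squared
    variance = (sum_sq - total ** 2 / n) / n
    return n, total / n, variance


# Raw scores from the tables above (the NA in the control column is simply left out)
experimental = [10, 6, 8, 7, 9, 7]
control = [8, 3, 5, 6, 7]

for name, scores in (("experimental", experimental), ("control", control)):
    n, mean, var = summarize(scores)
    print(f"{name}: n={n}, mean={mean:.2f}, variance={var:.2f}")
```

Rounded to two decimals, this should match the summary table: 7.83 and 1.81 for the experimental group, 5.80 and 2.96 for the control group.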
\(H_0: \sigma_1^2 = \sigma_2^2\) and \(H_1: \sigma_1^2 \neq \sigma_2^2\)
\(F = \frac{s^2_{larger}}{s^2_{smaller}} = \frac{2.96}{1.81} = 1.64\) with
\(df_{numerator}\) and \(df_{denominator}\) ==> This is 2 steps!
\(df_{numerator}=n_2 - 1 = 5-1 = 4\) (the control group has the larger variance) and \(df_{denominator}=n_1 - 1 = 6-1 = 5\)
Result
\(F_{OBTAINED} = 1.64\) and \(F_{CRITICAL} = 5.19\) with \(df=(4,5)\) and our final decision for the equation to use is:
\(F_{OBTAINED} = 1.64 < F_{CRITICAL, 0.05} = 5.19\) (\(df = 4\) and \(5\))
We do NOT REJECT THE Null and conclude that we use the
t test for EQUAL Population Variances
rarely do we have to change our test statistic because F is calculated from the sample or the population
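As a rough sketch of the F decision just described, the Python below puts the larger variance on top and takes each \(df\) from the matching sample. The function name `f_ratio` is made up for this example, and the critical value 5.19 is copied from the text rather than computed.

```python
def f_ratio(var_a, n_a, var_b, n_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


F_CRITICAL = 5.19  # table value quoted in the text for df = (4, 5)

# Experimental: variance 1.81, n = 6; control: variance 2.96, n = 5
f_obtained, df_num, df_den = f_ratio(1.81, 6, 2.96, 5)
use_equal_variances = f_obtained < F_CRITICAL

print(f"F = {f_obtained:.2f}, df = ({df_num}, {df_den})")
print("equal-variance t test" if use_equal_variances else "unequal-variance t test")
```

Because the function sorts the variances itself, the order in which the two groups are passed in does not matter.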
\(t = \frac{\bar{x_1} - \bar{x_2}}{\sqrt{\bigg(\frac{n_1s_1^2 +n_2s_2^2}{n_1+n_2-2}\bigg)\bigg(\frac{1}{n_1}+\frac{1}{n_2}\bigg)}}\)
Where \(df = n_1 + n_2 - 2\)
PRO TIP: "Solve this big equation by finding the components and then plugging them into the equation at the end."
\(\bar{x}_1 - \bar{x}_2 = 7.83 - 5.80 = 2.03\)
\(\frac{n_1s_1^2 + n_2s_2^2}{n_1+n_2-2}\) = \(\frac{6(1.81)+5(2.96)}{6+5-2}\) = \(\frac{10.86+14.80}{11-2}\)=\(\frac{25.66}{9}=2.85\)
\(\frac{1}{n_1}+\frac{1}{n_2}\)=\(\frac{1}{6}+\frac{1}{5}\)= \(0.17+0.20 = 0.37\)
\(\sqrt{\bigg( \frac{n_1s_1^2 + n_2s_2^2 }{n_1+n_2-2}\bigg) \bigg( \frac{1}{n_1} + \frac{1}{n_2} \bigg)} = \sqrt{(2.85)(0.37)} = \sqrt{1.0545} = 1.0269\)
\(t = \frac{\bar{x_1} - \bar{x_2}}{\sqrt{\bigg(\frac{n_1s_1^2 +n_2s_2^2}{n_1+n_2-2}\bigg)\bigg(\frac{1}{n_1}+\frac{1}{n_2}\bigg)}} = \frac{2.03}{1.0269} = 1.977\), so \(t_{obtained}=1.977\)
Degrees of Freedom \(df=n_1+n_2-2 = 6+5-2 =11-2 = 9\)
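The "find the components, then plug them in" approach can be sketched in Python as below. The function name `pooled_t` is an illustration only, and the inputs are the rounded summary values from the table.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance t from the chapter's formula (variances computed by dividing by n)."""
    pooled = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)         # pooled variance term
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))   # denominator of t
    df = n1 + n2 - 2
    return (mean1 - mean2) / standard_error, df


t_obtained, df = pooled_t(7.83, 1.81, 6, 5.80, 2.96, 5)
print(f"t = {t_obtained:.3f}, df = {df}")
```

The code keeps full precision at every step, so its t will differ slightly in the second decimal from the hand calculation, which rounds each component (for example, \(\frac{1}{6}\) to 0.17) along the way.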
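COMPARE the calculated t to the critical t to decide whether to reject the null hypothesis. For a one-tailed test at \(\alpha = 0.05\) with \(df = 9\), the t table gives \(t_{critical} = 1.833\), and our obtained t is larger than that.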
Decisions: remember we are using DIRECTIONAL here
We use the One Tail ONLY FOR THIS LONG EXAMPLE. We conclude the following:
We reject \(H_0\) in favor of the alternative \(H_1\) such that \(\mu_1 > \mu_2\). This particular commercial does increase favorability ratings of the product! We tell the marketing department that, if the entire consumer population had viewed the commercial, we would expect their mean support score to increase.
What if our example had different scores? Let's revisit the problem with some new scores and see what happens when we have UNEQUAL VARIANCES.
Experimental Group | Control Group |
---|---|
\(x_{experimental}\) | \(x_{control}\) |
10 | NA |
6 | 10 |
8 | 1 |
7 | 3 |
9 | 6 |
7 | 9 |
Calculate the Variance (from scratch, these may or may not be provided to you in an exam or homework...know this skill even if you don't need it for a particular problem.)
Experimental Group | Control Group |
---|---|
\(x_{experimental}^2\) | \(x_{control}^2\) |
100 | NA |
36 | 100 |
64 | 1 |
49 | 9 |
81 | 36 |
49 | 81 |
* For Experimental Variance
\(\Sigma x_1^2 = 379\)
\(s_1^2 = \frac{\Sigma x_1^2 - \frac{(\Sigma x_1)^2}{n_1}}{n_1} = \frac{379 - \frac{(47)^2}{6}}{6} = \frac{379 - \frac{2209}{6}}{6} = \frac{379 - 368.17}{6} = \frac{10.83}{6} = 1.81\)
* For Control Variance
\(\Sigma x_2^2= 227\)
\(s_2^2 = \frac{\Sigma x_2^2 - \frac{(\Sigma x_2)^2}{n_2}}{n_2} = \frac{227 - \frac{(29)^2}{5}}{5} = \frac{227 - \frac{841}{5}}{5} = \frac{227 - 168.2}{5} = \frac{58.8}{5} = 11.76\)
Experimental | Control |
---|---|
\(n_1=6\) | \(n_2=5\) |
\(\bar{x_1}=7.83\) | \(\bar{x_2}=5.80\) |
\(s_1^2 = 1.81\) | \(s_2^2 = 11.76\) |
\(H_0: \sigma_1^2 = \sigma_2^2\) and \(H_1: \sigma_1^2 \neq \sigma_2^2\)
\(F = \frac{s^2_{larger}}{s^2_{smaller}} = \frac{11.76}{1.81} = 6.497\) with \(df_{numerator}\) and \(df_{denominator}\) ==> This is 2 steps!
\(df_{numerator}=n_2 - 1 = 5-1 = 4\) (the control group has the larger variance) and \(df_{denominator}=n_1 - 1 = 6-1 = 5\)
Result
\(F_{OBTAINED} = 6.497\) and \(F_{CRITICAL} = 5.19\) with \(df=(4,5)\) and our final decision for the equation to use is:
\(F_{OBTAINED} = 6.497 > F_{CRITICAL, 0.05} = 5.19\) (\(df = 4\) and \(5\))
We REJECT THE Null and conclude that we use the
t test for UNEQUAL Population Variances
rarely do we have to change our test statistic because F is calculated from the sample or the population
Again, we calculate the test statistic: t Test BUT THIS TIME WE USE THE UNEQUAL POPULATION VARIANCES
\(t = \frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{s_1^2}{n_1-1} + \frac{s_2^2}{n_2-1}}}\)
\(df_{exact}=\frac{\big(\frac{s_1^2}{n_1 - 1}+\frac{s_2^2}{n_2-1}\big)^2}{\Bigg[\frac{\big(\frac{s_1^2}{n_1-1}\big)^2}{\big(n_1-1\big)}\Bigg]+\Bigg[\frac{\big(\frac{s_2^2}{n_2-1}\big)^2}{\big(n_2-1\big)}\Bigg]}\)
\(\bar{x}_1 - \bar{x}_2 = 7.83 - 5.80 = 2.03\)
\(\frac{s_1^2}{n_1-1}=\frac{1.81}{6-1}=\frac{1.81}{5}=0.36\) \(\frac{s_2^2}{n_2-1}=\frac{11.76}{5-1}=\frac{11.76}{4}=2.94\)
\(\sqrt{\frac{s_1^2}{n_1-1} +\frac{s_2^2}{n_2-1}}=\sqrt{0.36 + 2.94}=\sqrt{3.3}=1.82\)
\(t = \frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{s_1^2}{n_1-1} + \frac{s_2^2}{n_2-1}}}\)
\(t = \frac{2.03}{1.82}=1.115\)
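Here is a minimal Python sketch of the unequal-variance t, using the same \(s^2/(n-1)\) terms as the formula above. The function name `unequal_variance_t` is hypothetical, and the inputs are the rounded summary values from the table.

```python
import math


def unequal_variance_t(mean1, var1, n1, mean2, var2, n2):
    """Unequal-variance t where each variance (computed by dividing by n) is divided by n - 1."""
    standard_error = math.sqrt(var1 / (n1 - 1) + var2 / (n2 - 1))
    return (mean1 - mean2) / standard_error


t_obtained = unequal_variance_t(7.83, 1.81, 6, 5.80, 11.76, 5)
print(f"t = {t_obtained:.3f}")
```

The result lands within a few thousandths of the hand-calculated value; the small gap comes from rounding the intermediate steps by hand.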
Decisions: \(df=n_2=5\) or you can use the exact df equation.
COMPARE the calculated t to the critical t to decide whether to reject the null hypothesis
\(t_{obtained}=1.115\)
\(t_{critical}\) is for a DIRECTIONAL \(H_1\) at \(\alpha=0.05\)
Decisions: remember we are using DIRECTIONAL here
We use the One Tail ONLY FOR THIS LONG EXAMPLE. We conclude the following:
We CANNOT reject \(H_0\) in favor of the alternative \(H_1\) such that \(\mu_1 > \mu_2\). We do NOT have evidence that this particular commercial increases favorability ratings of the product! We tell the marketing department that, even if the entire consumer population had viewed the commercial, we would have no grounds to expect their mean support score to increase.
The equation for this procedure is:
\(df_{exact}=\frac{\big(\frac{s_1^2}{n_1 - 1}+\frac{s_2^2}{n_2-1}\big)^2}{\Bigg[\frac{\big(\frac{s_1^2}{n_1-1}\big)^2}{\big(n_1-1\big)}\Bigg]+\Bigg[\frac{\big(\frac{s_2^2}{n_2-1}\big)^2}{\big(n_2-1\big)}\Bigg]}\)
To solve it, using the previous example, we would find the components piece by piece and then plug them in to solve the equation.
The final df = 4
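To see where \(df = 4\) comes from, here is a piece-by-piece Python sketch of the exact df equation. It assumes both terms in the numerator use \(n - 1\) (matching the denominator terms), and it assumes the final value is rounded down to a whole number. The function name `exact_df` is an illustration only.

```python
import math


def exact_df(var1, n1, var2, n2):
    """Exact df for the unequal-variance t test, built from s^2 / (n - 1) components."""
    a = var1 / (n1 - 1)   # component for sample 1
    b = var2 / (n2 - 1)   # component for sample 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


df = exact_df(1.81, 6, 11.76, 5)
print(f"exact df = {df:.3f}, rounded down = {math.floor(df)}")
```

The exact value comes out just under 5, so rounding down gives 4.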
We still have the same result, but sometimes the choice of df is a slight issue. Go with the computer-generated values most of the time. Rarely do we ever do this by hand again. Well, unless our computers fail!
Result: \(t_{critical}\), one-tailed, \(\alpha = 0.05\) level, with \(df=4\)
\(t_{critical} = 2.132 > t_{obtained} = 1.115\), and we CANNOT REJECT \(H_0\) IN FAVOR OF \(H_1\)
F Test for Homogeneity of Variances
\(F = \frac{s^2_{larger}}{s^2_{smaller}}\)
\(df\) numerator = \(n-1\) for the sample with the larger variance
\(df\) denominator = \(n-1\) for the sample with the smaller variance
Equal Population Variances Assumed
\(t = \frac{\bar{x_1} - \bar{x_2}}{\sqrt{\bigg(\frac{n_1s_1^2 +n_2s_2^2}{n_1+n_2-2}\bigg)\bigg(\frac{1}{n_1}+\frac{1}{n_2}\bigg)}}\)
Unequal Population Variances Assumed
\(t = \frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{s_1^2}{n_1-1} + \frac{s_2^2}{n_2-1}}}\)
\(df_{exact}=\frac{\big(\frac{s_1^2}{n_1 - 1}+\frac{s_2^2}{n_2-1}\big)^2}{\Bigg[\frac{\big(\frac{s_1^2}{n_1-1}\big)^2}{\big(n_1-1\big)}\Bigg]+\Bigg[\frac{\big(\frac{s_2^2}{n_2-1}\big)^2}{\big(n_2-1\big)}\Bigg]}\)
Equal Population Variances Assumed
\(t=\frac{\bar{x_1}-\bar{x_2}}{\sqrt{\Bigg[ \frac{(n_1 - 1)\hat{\sigma_1}^2+(n_2-1)\hat{\sigma_2}^2}{n_1+n_2-2} \Bigg]\Bigg[ \frac{1}{n_1} + \frac{1}{n_2}\Bigg]}}\)
Unequal Population Variances Assumed
\(t = \frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{\hat{\sigma}_1^2}{n_1}+\frac{\hat{\sigma}^2_2}{n_2}}}\)
\(df_{exact}=\frac{\bigg( \frac{\hat{\sigma}^2_1}{n_1} + \frac{\hat{\sigma}^2_2}{n_2} \bigg)^2}{\Bigg[\frac{\bigg( \frac{\hat{\sigma}^2_1}{n_1}\bigg)^2}{n_1-1} \Bigg] + \Bigg[ \frac{\bigg( \frac{\hat{\sigma}^2_2}{n_2}\bigg)^2}{n_2-1}\Bigg]}\)
Dependent Samples
\(t = \frac{\bar{D}}{\frac{S_D}{\sqrt{n_p -1}}}\)
and \(df=n_p -1\)
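A minimal Python sketch of the dependent samples formula follows. The scores are made-up illustration data, not from the textbook, and the function name `paired_t` is hypothetical. The sketch also assumes \(S_D\) is the standard deviation of the differences computed by dividing by \(n_p\), which is what the \(\sqrt{n_p - 1}\) in the formula suggests.

```python
import math


def paired_t(before, after):
    """Paired-difference t: mean difference over S_D / sqrt(n_p - 1)."""
    diffs = [a - b for a, b in zip(after, before)]
    n_p = len(diffs)                     # number of pairs
    mean_d = sum(diffs) / n_p
    # Assumption: S_D divides by n_p, matching the chapter's variance formula
    s_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / n_p)
    return mean_d / (s_d / math.sqrt(n_p - 1)), n_p - 1


# Hypothetical scores for five matched pairs
before = [5, 6, 4, 7, 5]
after = [7, 8, 5, 9, 6]

t_obtained, df = paired_t(before, after)
print(f"t = {t_obtained:.3f}, df = {df}")
```

If your course defines \(S_D\) by dividing by \(n_p - 1\) instead, the denominator of t would use \(\sqrt{n_p}\), and the resulting t would be the same.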