Chapter 10 One Way ANOVA

Sherri Verdugo, M.S.

Instructor, CSUF


Prologue & Introduction

The Analysis of Variance (ANOVA) compares the means of two or more groups at once, so we are not limited to the two-group comparison of the t test. ANOVA is a powerful tool.

ANOVA is the statistical cousin to the t test.

  • A technique for comparing sample means.
  • Can be used for more than two means.
  • Friendly to experimental testing.
  • Several sample means are being compared at once.
  • Overview of the process.
  • Can be used to evaluate statistical tests.

How Analysis of Variance is Used

Design for non-experimental applications with two variables:

 * One variable is measured at the interval level (usually the dependent variable).
 * The other is a grouping variable, which can be at any level of measurement.
 * ANOVA is non-directional, meaning only non-directional hypotheses apply.
 

Example of ANOVA Concept Introduction

Abortion stance by rural and urban settings

Example of a non-experimental setting

Rural Group 1 Urban Group 2
160 100
130 110
150 130
140 120
\(\Sigma X_1=580\) \(\Sigma X_2 =460\)
\(\bar{X}_1=145\) \(\bar{X}_2=115\)
  • Means differ
  • Hypothesis: \(H_0: \mu_1=\mu_2\) and \(H_A: \mu_1 \neq \mu_2\)
  • ANOVA is NON-DIRECTIONAL
  • Calculate the \(F\) ratio (named for Fisher, who originally developed it)
  • Compare the \(F\) to \(F_{critical}\) at \(\alpha=0.05\)

Analysis of Variance in Experimental Settings

Example 1: random groups

Experiments are the backbone of science, and ANOVA is among the most commonly used statistical techniques for analyzing them. Participants, or subjects, are the people being studied in an experiment.

  • Scenario: Using random assignment, 30 subjects are assigned to one of three groups, so each group has 10 subjects. Because of the random assignment, any difference in mean satisfaction scores among the three groups should be the result of the time frame for the course (day, night, or Saturday) rather than any other factor.

  • Hypothesis for variable selection:

\(H_0:\mu_{day}=\mu_{night}=\mu_{saturday}\)

\(H_{A1}:\mu_{day} \neq \mu_{night}\), \(H_{A2}:\mu_{day} \neq \mu_{saturday}\), \(H_{A3}:\mu_{night} \neq \mu_{saturday}\)

  • Question: are the means different enough from each other to conclude that the differences are not due to sampling error?

ANOVA example 1 continued: DATA

Day Night Saturday
5 5 5
5 4 5
5 4 5
5 3 4
4 3 4
4 2 4
4 2 3
4 2 3
3 1 3
2 0 2

F: an intuitive friendly approach

Case 1: Day versus Night

Day Night
5 5
5 5
5 5
3 3
3 3
3 3
\(\Sigma x_D =24\) \(\Sigma x_N =24\)
\(\bar{x}_D=24/6=4\) \(\bar{x}_N=24/6=4\)
  • The Grand Mean = \(\frac{\Sigma X_{total}}{n.total}=\frac{48}{12}=4.00\)
  • F = 0: the two group means are identical, so there is no between-groups variation.

Case 2: Day versus Night New Scores

Day Night
5 5
5 5
5 3
5 3
3 3
3 3
\(\Sigma x_D =26\) \(\Sigma x_N =22\)
\(\bar{x}_D=26/6=4.33\) \(\bar{x}_N=22/6=3.67\)
  • The Grand Mean = \(\frac{\Sigma X_{total}}{n.total}=\frac{48}{12}=4.00\)
  • F = 1.248
  • The class means now differ slightly in satisfaction rating.

Case 3: Day versus Night New Scores

Day Night
5 5
5 3
5 3
5 3
5 3
3 3
\(\Sigma x_D =28\) \(\Sigma x_N =20\)
\(\bar{x}_D=28/6=4.67\) \(\bar{x}_N=20/6=3.33\)
  • The Grand Mean = \(\frac{\Sigma X_{total}}{n.total}=\frac{48}{12}=4.00\)
  • F = 7.994
  • The class means now differ more in satisfaction rating, and F is correspondingly larger.

Case 4: Day versus Night New Scores

Day Night
5 3
5 3
5 3
5 3
5 3
5 3
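\(\Sigma x_D =30\) \(\Sigma x_N =18\)
\(\bar{x}_D=30/6=5.00\) \(\bar{x}_N=18/6=3.00\)
  • The Grand Mean = \(\frac{\Sigma X_{total}}{n.total}=\frac{48}{12}=4.00\)
  • F is mathematically undefined here: there is no variation within either group, so the denominator of the ratio is zero. Across the four cases, F grows without bound as the within-group variation shrinks.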
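
The four cases above can be checked numerically. What follows is a minimal Python sketch, not part of the original slides: the function name `f_ratio` is an illustrative assumption, and it uses the deviation-score definitions of the between-groups and within-groups sums of squares. With unrounded means it gives 1.25 and 8.0 for Cases 2 and 3, close to the 1.248 and 7.994 shown above; the small differences presumably come from rounding in the hand calculation.

```python
def f_ratio(*groups):
    """Return F = MS_between / MS_within, or None when MS_within is zero."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    # Between-groups SS: squared distance of each group mean from the
    # grand mean, weighted by the group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups SS: squared distance of each score from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    if ss_within == 0:
        return None  # no within-group variation: the ratio divides by zero
    return (ss_between / df_between) / (ss_within / df_within)


# (Day scores, Night scores) for each case in the slides.
cases = {
    "Case 1": ([5, 5, 5, 3, 3, 3], [5, 5, 5, 3, 3, 3]),
    "Case 2": ([5, 5, 5, 5, 3, 3], [5, 5, 3, 3, 3, 3]),
    "Case 3": ([5, 5, 5, 5, 5, 3], [5, 3, 3, 3, 3, 3]),
    "Case 4": ([5, 5, 5, 5, 5, 5], [3, 3, 3, 3, 3, 3]),
}

results = {name: f_ratio(day, night) for name, (day, night) in cases.items()}
for name, f in results.items():
    print(name, "undefined" if f is None else round(f, 3))
```

The grand mean stays at 4.00 in every case; only the split between between-group and within-group variation changes, which is what drives F.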

Terminology: Getting to know ANOVA

Some new concepts are needed to deal with the calculations.


Summary Recap for Sum of Squares

  • \(SS_B\) = the portion of \(SS_T\) accounted for by the categories of the independent variable.
  • \(SS_W\) = the portion of \(SS_T\) not accounted for by the categories of the independent variable.
  • Additive: \(SS_T = SS_B + SS_W\)
  • We use \(SS_B\) and \(SS_W\) to form two separate population variance estimates: each is divided by its degrees of freedom, giving the between-groups and within-groups mean squares.

Summary Recap for Mean Squares

  • Between-Groups Mean Square: A population variance estimate based on the between-groups sum of squares. \(MS_{Between}=\frac{SS_B}{df_B}=\frac{SS_B}{num.categories - 1}\)
  • Degrees of Freedom, Between-Groups Mean Square: \(df_{Between} = num.categories - 1\)
  • Within-Groups Mean Square: A population variance estimate based on what the categories of the independent variable do not explain, that is, the variation of scores within the groups. \(MS_{Within} = \frac{SS_{Within}}{df_{Within}} = \frac{SS_{Within}}{(n_{Total}-1)-(num.categories - 1)}\)
  • Degrees of Freedom, Within-Groups Mean Square: That portion of the total degrees of freedom not accounted for by the number of groups studied. \(df_W=(n_{Total}-1) - (num.categories - 1)\)

\(df_W=df_{total}-df_{between}=(n_{Total}-1)-(num.categories - 1)\)

So what is \(MS_W\)?

  • \(MS_{Within} = \frac{SS_{Within}}{df_{Within}} = \frac{SS_{Within}}{(n_{Total}-1)-(num.categories - 1)}\)
  • The degrees of freedom are additive:
    • \(df_{total}=df_B+df_W\)

Ok, so what is \(F\)?

\(F=\frac{MS_B}{MS_W}\)

  • Cool, but what if I get lost in the calculations???
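
One way to avoid getting lost is to let a few lines of code do the bookkeeping. The sketch below is an illustration only: the function names are assumptions, not part of the chapter. It encodes the degrees-of-freedom and mean-square definitions above, and the sample call uses the sums of squares from the rural/urban summary table later in the chapter (\(SS_B = 1{,}800\), \(SS_W = 1{,}000\), 8 scores, 2 groups).

```python
def degrees_of_freedom(n_total, num_categories):
    """Return (df_between, df_within, df_total) for a one-way ANOVA."""
    df_total = n_total - 1
    df_between = num_categories - 1
    df_within = df_total - df_between  # additive: df_total = df_B + df_W
    return df_between, df_within, df_total


def f_from_sums_of_squares(ss_between, ss_within, n_total, num_categories):
    """Turn sums of squares into mean squares, then into F = MS_B / MS_W."""
    df_between, df_within, _ = degrees_of_freedom(n_total, num_categories)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


# Rural/urban example: 8 scores in 2 groups.
print(degrees_of_freedom(8, 2))
# Day/night/Saturday example: 30 scores in 3 groups.
print(degrees_of_freedom(30, 3))

f_rural_urban = f_from_sums_of_squares(1800, 1000, n_total=8, num_categories=2)
print(round(f_rural_urban, 2))
```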

Procedure: Calculating One-Way ANOVA


One Way ANOVA EXAMPLE:

x = Pro-Life Index

\(x_1\)=Rural \(x_2\)=Urban
160 100
130 110
150 130
140 120
\(n_1 = 4\) \(n_2 = 4\)
\(\Sigma x_1=580\) \(\Sigma x_2=460\)
  • Null hypothesis: \(H_0: \mu_1 = \mu_2\)
  • Alt. hypothesis: \(H_A: \mu_1 \neq \mu_2\)
  • \(N.total = 4+4=8\)

  • Step 1: calculate the grand mean components and grand mean
    • \(\bar{X}_1=\frac{\Sigma x_1}{n_1}=\frac{580}{4}=145\)
    • \(\bar{X}_2=\frac{\Sigma x_2}{n_2}=\frac{460}{4}=115\)
    • \(\Sigma X_{total}= \Sigma x_1 +\Sigma x_2 = 580 + 460 = 1040\)
    • n.total = 8 = 4 + 4 (4 entries from rural and 4 entries from urban)
    • Grand Mean: \(\frac{\Sigma X_{total}}{n.total}=\frac{1040}{8}=130\)

  • Step 2: calculate squares totals
\(x_1^2\) \(x_2^2\)
\(160^2=25,600\) \(100^2=10,000\)
\(130^2=16,900\) \(110^2=12,100\)
\(150^2=22,500\) \(130^2=16,900\)
\(140^2=19,600\) \(120^2=14,400\)
\(\Sigma x_1^2=84,600\) \(\Sigma x_2^2=53,400\)
  • Prepare to find the rest of the components by finding
    • \(\Sigma X^2_{Total} = \Sigma x_1^2 + \Sigma x_2^2 = 84,600 + 53,400 = 138,000\)

Let’s calculate F
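
The calculation slides for this step did not survive in this copy, so the following Python sketch reconstructs the arithmetic under an assumption: it uses the common raw-score (computational) formulas, \(SS_T = \Sigma X^2 - \frac{(\Sigma X)^2}{N}\) and \(SS_B = \Sigma \frac{(\Sigma x_k)^2}{n_k} - \frac{(\Sigma X)^2}{N}\), which may or may not be the exact layout the original slides followed. Variable names are illustrative.

```python
rural = [160, 130, 150, 140]
urban = [100, 110, 130, 120]
groups = [rural, urban]

n_total = sum(len(g) for g in groups)              # 8
sum_x = sum(sum(g) for g in groups)                # 580 + 460 = 1040
sum_x_sq = sum(x * x for g in groups for x in g)   # 84,600 + 53,400 = 138,000

# Correction term shared by the total and between-groups formulas.
correction = sum_x ** 2 / n_total

ss_total = sum_x_sq - correction
ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
ss_within = ss_total - ss_between                  # additivity of SS

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(ss_total, ss_between, ss_within)
print(round(ms_within, 2), round(f_ratio, 2))
```

These values match the summary table for this example: \(SS_T = 2{,}800\), \(SS_B = 1{,}800\), \(SS_W = 1{,}000\), \(MS_W = 166.67\), and \(F = 10.80\).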


Wrap up our conclusions in a table

Source     SS       df    MS        F        P
TOTAL      2,800
Between    1,800    1     1,800     10.80    <.05
Within     1,000    6     166.67

Comparing \(t\) with \(F\)
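
The content of this slide is missing from this copy. One standard relationship it most likely covers is that, with exactly two groups, the independent-samples \(t\) test with a pooled variance and the one-way ANOVA give the same answer, because \(t^2 = F\). The Python sketch below checks this on the rural/urban data; the helper names are illustrative assumptions.

```python
import math

rural = [160, 130, 150, 140]
urban = [100, 110, 130, 120]


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations of the scores from their own mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


n1, n2 = len(rural), len(urban)
# Pooled variance: the same quantity as MS_within when there are two groups.
pooled_variance = (sum_sq_dev(rural) + sum_sq_dev(urban)) / (n1 + n2 - 2)
standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

t = (mean(rural) - mean(urban)) / standard_error
t_squared = t ** 2

print(round(t, 3), round(t_squared, 2))
```

Squaring \(t\) reproduces the \(F\) of 10.80 from the rural/urban summary table, and the pooled variance equals \(MS_W = 166.67\).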


Analysis of Variance: Exp. Data

Day Night Saturday
5 5 5
5 4 5
5 4 5
5 3 4
4 3 4
4 2 4
4 2 3
4 2 3
3 1 3
2 0 2
\(n_1=10\) \(n_2=10\) \(n_3=10\)
\(\Sigma X_1=41\) \(\Sigma X_2=26\) \(\Sigma X_3=38\)
\(\bar{x_1}=4.1\) \(\bar{x_2}=2.6\) \(\bar{x_3}=3.8\)

Find the squares

\(Day^2\) \(Night^2\) \(Saturday^2\)
\(5^2=25\) \(5^2=25\) \(5^2=25\)
\(5^2=25\) \(4^2=16\) \(5^2=25\)
\(5^2=25\) \(4^2=16\) \(5^2=25\)
\(5^2=25\) \(3^2=9\) \(4^2=16\)
\(4^2=16\) \(3^2=9\) \(4^2=16\)
\(4^2=16\) \(2^2=4\) \(4^2=16\)
\(4^2=16\) \(2^2=4\) \(3^2=9\)
\(4^2=16\) \(2^2=4\) \(3^2=9\)
\(3^2=9\) \(1^2=1\) \(3^2=9\)
\(2^2=4\) \(0^2=0\) \(2^2=4\)

ANOVA STEPS
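
The step-by-step slides are not preserved in this copy. As a stand-in, here is a hedged Python sketch that runs the whole one-way ANOVA on the day/night/Saturday satisfaction scores. The function `one_way_anova` and its dictionary keys are illustrative assumptions, the raw-score formulas are one common way to do the arithmetic, and the critical value 3.35 is the \(F_{critical}\) for \(\alpha = .05\) with 2 and 27 degrees of freedom used elsewhere in the chapter.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA source table as a dict."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    # Correction term: (sum of all scores)^2 / N.
    correction = sum(sum(g) for g in groups) ** 2 / n_total
    ss_total = sum(x * x for g in groups for x in g) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


day = [5, 5, 5, 5, 4, 4, 4, 4, 3, 2]
night = [5, 4, 4, 3, 3, 2, 2, 2, 1, 0]
saturday = [5, 5, 5, 4, 4, 4, 3, 3, 3, 2]

table = one_way_anova([day, night, saturday])
F_CRITICAL = 3.35  # alpha = .05, df = (2, 27), as used in the chapter
reject_null = table["F"] > F_CRITICAL

for key, value in table.items():
    print(key, round(value, 3))
print("Reject H0:", reject_null)
```

With unrounded mean squares the ratio is about 4.373; the chapter's hand-calculated table reports 4.375, which is what results when \(MS_W\) is first rounded to 1.44. Either way \(F\) exceeds 3.35, so the null hypothesis of equal means is rejected.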


Post Hoc Testing


Post Hoc Testing: Steps

Source     SS       df    MS      F        P
TOTAL      51.50
Between    12.60    2     6.30    4.375    <.05
Within     38.90    27    1.44

\((\bar{X}_i - \bar{X}_j) = \pm \sqrt{(df_B)(F_{critical})(MS_W)(\frac{1}{n_i} + \frac{1}{n_j})}\)


From our example:

  • \((\bar{X}_i - \bar{X}_j) = \pm \sqrt{(2)(3.35)(1.44)(\frac{1}{10}+\frac{1}{10})}\)
  • \(\pm \sqrt{1.9296}=\pm 1.389\)
  • So how does this work:
Null   \(|\bar{x}_i-\bar{x}_j|\)   Scheffé’s Crit.   Conclusions
\(\mu_D=\mu_N\)   \(|4.1-2.6| = 1.5\)   \(> 1.389\)   Reject \(H_0\)
\(\mu_D=\mu_S\)   \(|4.1-3.8| = 0.3\)   \(< 1.389\)   Cannot reject \(H_0\)
\(\mu_N=\mu_S\)   \(|2.6-3.8| = 1.2\)   \(< 1.389\)   Cannot reject \(H_0\)
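
The comparisons in this table can be reproduced with a short Python sketch. It is an illustration only: the function name `scheffe_critical_difference` is an assumption, and the inputs (\(df_B = 2\), \(F_{critical} = 3.35\), \(MS_W = 1.44\), group means, and \(n = 10\) per group) are taken directly from the example above.

```python
import math
from itertools import combinations

means = {"Day": 4.1, "Night": 2.6, "Saturday": 3.8}
sizes = {"Day": 10, "Night": 10, "Saturday": 10}

DF_BETWEEN = 2
F_CRITICAL = 3.35   # alpha = .05, df = (2, 27), from the example
MS_WITHIN = 1.44


def scheffe_critical_difference(n_i, n_j):
    """Smallest absolute mean difference that is significant under Scheffé's test."""
    return math.sqrt(DF_BETWEEN * F_CRITICAL * MS_WITHIN * (1 / n_i + 1 / n_j))


decisions = {}
for a, b in combinations(means, 2):
    difference = abs(means[a] - means[b])
    critical = scheffe_critical_difference(sizes[a], sizes[b])
    decisions[(a, b)] = difference > critical  # True means reject H0
    print(a, "vs", b, round(difference, 1), round(critical, 3), decisions[(a, b)])
```

Because all three groups have the same size, the critical difference is the same for every pair; with unequal group sizes it would have to be recomputed for each comparison.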

SPSS applications


Two-Way ANOVA

Input

Day Night Saturday
5 5 5
5 4 5
5 4 5
5 3 4
4 3 4
4 2 4
4 2 3
4 2 3
3 1 3
2 0 2
\(n_1=10\) \(n_2=10\) \(n_3=10\)

Output Summary

Groups Count Sum Average Variance
Column 1 10 41 4.1 0.988889
Column 2 10 26 2.6 2.266667
Column 3 10 38 3.8 1.066667
  • Column 1 = Day
  • Column 2 = Night
  • Column 3 = Saturday
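
The summary rows above can be reproduced with Python's standard `statistics` module. This is a minimal sketch with illustrative variable names; `statistics.variance` computes the sample variance (dividing by \(n - 1\)), which is the version reported in the table.

```python
from statistics import mean, variance

groups = {
    "Day": [5, 5, 5, 5, 4, 4, 4, 4, 3, 2],
    "Night": [5, 4, 4, 3, 3, 2, 2, 2, 1, 0],
    "Saturday": [5, 5, 5, 4, 4, 4, 3, 3, 3, 2],
}

summary = {
    name: {
        "count": len(scores),
        "sum": sum(scores),
        "average": mean(scores),
        "variance": variance(scores),  # sample variance, n - 1 in the denominator
    }
    for name, scores in groups.items()
}

for name, row in summary.items():
    print(name, row["count"], row["sum"], row["average"], round(row["variance"], 6))
```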

Anova table

  • Summarize the ANOVA source table from the previous problem.
Source     SS      df    MS          F        P           F crit
Between    12.6    2     6.30        4.375    0.022642    3.354131
Within     38.9    27    1.440741
TOTAL      51.5    29

What about interactions??

Looking at the group means can indicate that an interaction might be occurring. We run this analysis in SPSS because the hand calculations can become daunting!

Day Night Saturday
5 5 5
5 4 5
5 4 5
5 3 4
4 3 4
\(\bar{x}=4.8\) \(\bar{x}=3.8\) \(\bar{x}=4.6\)

Day Night Saturday
4 2 4
4 2 3
4 2 3
3 1 3
2 0 2
\(\bar{x}=3.4\) \(\bar{x}=1.4\) \(\bar{x}=3.0\)

Day Night Saturday
5 5 5
4 4 4
4 4 4
3 3 3
2 2 2
\(\bar{x}=3.6\) \(\bar{x}=3.6\) \(\bar{x}=3.6\)

Day Night Saturday
5 3 5
5 2 5
5 2 4
4 1 3
4 0 3
\(\bar{x}=4.6\) \(\bar{x}=1.6\) \(\bar{x}=4.0\)

Two-Way: Interactions and ANOVA


Key Concepts


Equations of Interest

