===================================

CHI-SQUARE GOODNESS OF FIT OVERVIEW

===================================

CHI-SQUARE GOODNESS OF FIT

Compares observed categorical data from one variable to expected proportions.

NOTES

Normality does not apply to Chi-Square tests because the data are categorical, not continuous.

==========

HYPOTHESES

==========

NULL HYPOTHESIS

The observed frequencies match the expected frequencies.

ALTERNATE HYPOTHESIS

The observed frequencies do not match the expected frequencies.

………………………………………………………..

QUESTION

What are the null and alternate hypotheses for your research?

H0:

H1:

………………………………………………………..

======================

IMPORT EXCEL FILE CODE

======================

PURPOSE OF THIS CODE

Imports your Excel dataset automatically into RStudio.

You need to import your dataset every time you want to analyze your data in RStudio.

INSTALL REQUIRED PACKAGE

The package only needs to be installed once.

The code for this task is provided below. Remove the hashtag below to convert the note into code.

# install.packages("readxl")

LOAD THE PACKAGE

You must load the package again every time you start a new R session.

The code for this task is provided below. Remove the hashtag below to convert the note into code.

# library(readxl)

IMPORT THE EXCEL FILE INTO RSTUDIO

Download the Excel file from OneDrive and save it to your desktop.

Right-click the Excel file and click “Copy as path” from the menu.

In RStudio, replace the example path below with your actual path.

Replace backslashes \ with forward slashes / or double them \\:

✘ WRONG "C:\Users\Joseph\Desktop\mydata.xlsx"

✔ CORRECT "C:/Users/Joseph/Desktop/mydata.xlsx"

✔ CORRECT "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"
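
If you would rather let R fix the slashes for you, the sketch below shows one way to do it. The path is the made-up example path from above, not a real file, and the names windows_path and r_path are illustrative only.

```r
# Hypothetical Windows path, written with doubled backslashes so R can read it
windows_path <- "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"

# Replace every backslash with a forward slash (fixed = TRUE means plain text, no regex)
r_path <- gsub("\\", "/", windows_path, fixed = TRUE)

# View the converted path
print(r_path)
```

The converted path can then be passed to read_excel() in place of a hand-typed one.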

Replace “dataset” with the name of your Excel file (without the .xlsx).

An example of the code for this task is provided below.

Edit the code below, then remove the hashtag to use it.

# dataset <- read_excel("C:/Users/Joseph/Desktop/dataset.xlsx")

=========================

VISUALLY DISPLAY THE DATA

=========================

PURPOSE

Visually display the data.

A frequency table can be used instead of a bar graph to visually display the data.

CREATE A FREQUENCY TABLE

Replace “dataset” with the name of your dataset (without the .xlsx).

Replace “Variable” with the R code name of your variable.

The code for this task is provided below. Remove the hashtag below to convert the note into code.

# observed <- table(dataset$Variable)

VIEW YOUR FREQUENCY TABLE

View the observed frequencies.

The code for this task is provided below. Remove the hashtag below to convert the note into code.

# print(observed)

VIEW THE CATEGORY ORDER

The code for this task is provided below. Remove the hashtag below to convert the note into code.

# names(observed)
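
As an illustration, the sketch below builds a small made-up dataset and runs the three steps above on it. The data frame, the CarType variable, and the counts (20 sedans, 45 SUVs, 25 trucks) are invented for this example and are not from any real study. Setting the factor levels by hand is an optional extra step that fixes the category order, so you know exactly which order the expected proportions must follow.

```r
# Made-up example data: 90 participants and their preferred car type
dataset <- data.frame(
  CarType = factor(
    c(rep("Sedan", 20), rep("SUV", 45), rep("Truck", 25)),
    levels = c("Sedan", "SUV", "Truck")  # sets the category order explicitly
  )
)

# Count how many participants chose each category
observed <- table(dataset$CarType)

# View the observed frequencies
print(observed)

# View the category order (matches the factor levels set above)
names(observed)
```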

===============================

CHI-SQUARE GOODNESS OF FIT CODE

===============================

PURPOSE

Determine if the null or alternate hypothesis was supported.

DEFINE EXPECTED PROPORTIONS

First, look at your methods/research design to determine the expected proportions for each category.

Next, turn those proportions into decimals.

The expected proportions MUST be in the same order as the categories in your frequency table.

Proportions must be written as decimals (e.g., 0.30 = 30%) and must add up to 1.

An example of the code for this task is provided below.

Edit the code below, then remove the hashtag to use it.

# expected <- c(0.30, 0.20, 0.50)
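
Before running the test, it can help to confirm that the expected proportions are usable. The sketch below is one possible check, using made-up observed counts in place of your frequency table. The counts and category names are assumptions for illustration only.

```r
# Made-up observed counts standing in for table(dataset$Variable)
observed <- c(Sedan = 20, SUV = 45, Truck = 25)

# Expected proportions, in the same order as names(observed)
expected <- c(0.30, 0.20, 0.50)

# There must be one proportion per category
stopifnot(length(expected) == length(observed))

# The proportions must add up to 1 (all.equal allows for tiny rounding error)
stopifnot(isTRUE(all.equal(sum(expected), 1)))

# Expected counts = each proportion multiplied by the sample size
expected_counts <- expected * sum(observed)
names(expected_counts) <- names(observed)
print(expected_counts)
```

If either stopifnot() line raises an error, recheck the expected proportions before continuing.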

CALCULATE CHI-SQUARED RESULTS

Do NOT edit this code.

Remove the hashtags to use the code below.

# chisq_gfit <- chisq.test(observed, p = expected)

# print(chisq_gfit)

DETERMINE STATISTICAL SIGNIFICANCE

If results were statistically significant (p < .05), continue to the effect size section below.

If results were NOT statistically significant (p ≥ .05), do NOT calculate the effect size.

Instead, skip to the reporting section below.

NOTE: Getting results that are not statistically significant does NOT mean you switch to a different test.
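
To show where the chi-square value comes from, the sketch below runs the test on made-up counts and then repeats the calculation by hand as the sum of (observed - expected)^2 / expected across the categories. The counts and the equal expected proportions are assumptions for illustration only.

```r
# Made-up observed counts (90 participants) and equal expected proportions
observed <- c(Sedan = 20, SUV = 45, Truck = 25)
expected <- c(1/3, 1/3, 1/3)

# Run the goodness-of-fit test
chisq_gfit <- chisq.test(observed, p = expected)
print(chisq_gfit)

# Same statistic by hand: sum of (observed - expected count)^2 / expected count
expected_counts <- expected * sum(observed)
chi_by_hand <- sum((observed - expected_counts)^2 / expected_counts)
print(chi_by_hand)

# Degrees of freedom = number of categories minus 1
df_by_hand <- length(observed) - 1

# p-value by hand: upper tail of the chi-square distribution
p_by_hand <- pchisq(chi_by_hand, df = df_by_hand, lower.tail = FALSE)
print(p_by_hand)

# TRUE means statistically significant at the .05 level
chisq_gfit$p.value < 0.05
```

With these made-up counts the chi-square value is 350 / 30, or about 11.67, with 2 degrees of freedom.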

================

EFFECT SIZE CODE

================

PURPOSE

Determine how large the difference was between what was observed and what was expected.

DIRECTIONS

Remove the hashtags to use the code below.

Do NOT make any other edits to the code.

# W <- sqrt(chisq_gfit$statistic / sum(observed))

# W

DETERMINE THE SIZE OF THE EFFECT

0.00 to 0.09 = ignore

0.10 to 0.29 = small

0.30 to 0.49 = medium

0.50+ = large

Examples:

A Cohen’s W of 0.08 indicates the difference between the observed data and the expected data was negligible. There was no meaningful effect.

A Cohen’s W of 0.61 indicates the difference between the observed data and the expected data was very large. There was a large effect.
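
The sketch below applies the effect size formula to made-up counts and then looks up the size label from the cutoffs listed above. The counts are assumptions for illustration only, and using cut() to assign the label is one possible approach, not a required step.

```r
# Made-up observed counts (90 participants) and equal expected proportions
observed <- c(Sedan = 20, SUV = 45, Truck = 25)
expected <- c(1/3, 1/3, 1/3)
chisq_gfit <- chisq.test(observed, p = expected)

# Cohen's W = square root of (chi-square value / sample size)
W <- unname(sqrt(chisq_gfit$statistic / sum(observed)))
round(W, 2)

# Assign a size label using the cutoffs in this guide
size <- cut(
  W,
  breaks = c(0, 0.10, 0.30, 0.50, Inf),
  labels = c("ignore", "small", "medium", "large"),
  right = FALSE  # each range includes its lower cutoff, e.g. 0.30 is medium
)
as.character(size)
```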

==================

SUMMARY OF RESULTS

==================

………………………………………….

QUESTION

What were the results? Write them in a paragraph below.

YOUR PARAGRAPH:

………………………………………….

DIRECTIONS

Collect the information listed below and turn it into a paragraph.

1. Name of inferential test used (Chi-Square Goodness-of-Fit Test)

2. Name of the categorical variable (e.g., car type preference)

3. Expected proportions (e.g., 20%, 20%, 60%)

4. Sample size (N)

5. State whether there WAS (p < .05) or was NOT (p ≥ .05) a statistically significant difference between the expected and observed proportions.

6. Degrees of freedom (df)

7. Chi-square value (χ²)

8. EXACT p-value to three decimals. NOTE: If p > .05, just report p > .05. If p < .001, just report p < .001.

Put 5, 6, 7, and 8 together:

χ²(df, N = ##) = #.##, p = .###

9. What category was the most (or least) preferred

10. Effect size value to two decimals (Cohen’s W) and its size (small, medium, or large)

EXAMPLE

1. Chi-Square Goodness-of-Fit Test

2. Car type preference

3. Equal distribution (33.33%, 33.33%, 33.33%)

4. 90 participants

5. Statistically significant (p < .05)

6. df = 2

7. χ² = 9.67

8. p = .008

Put 5, 6, 7, and 8 together:

χ²(2, N = 90) = 9.67, p = .008.

9. SUVs were preferred more than sedans or trucks

10. W = 0.33, medium

PARAGRAPH FOR WORD DOCUMENT

A Chi-Square Goodness-of-Fit Test was conducted to determine whether

car type preference (Sedan, SUV, Truck)

was different from an equal distribution (33.33%, 33.33%, 33.33%)

among 90 participants.

There was a statistically significant difference in car type preferences,

χ²(2, N = 90) = 9.67, p = .008.

Participants preferred SUVs more than sedans or trucks.

The effect size was medium (Cohen’s W = 0.33).
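
The statistics line can also be assembled in R so the numbers are copied without typing errors. The sketch below is one way to do this, using made-up counts rather than the example above. The counts and the helper name format_p are assumptions for illustration only.

```r
# Made-up observed counts (90 participants) and equal expected proportions
observed <- c(Sedan = 20, SUV = 45, Truck = 25)
expected <- c(1/3, 1/3, 1/3)
chisq_gfit <- chisq.test(observed, p = expected)

# Hypothetical helper: format the p-value following the rules in the directions
format_p <- function(p) {
  if (p < .001) return("p < .001")
  if (p > .05) return("p > .05")
  # Three decimals with the leading zero removed, e.g. 0.008 becomes .008
  paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
}

# Build the statistics line: chi-square(df, N = sample size) = value, p-value
report <- sprintf(
  "\u03c7\u00b2(%d, N = %d) = %.2f, %s.",
  as.integer(chisq_gfit$parameter),   # degrees of freedom
  as.integer(sum(observed)),          # sample size
  chisq_gfit$statistic,               # chi-square value
  format_p(chisq_gfit$p.value)        # formatted p-value
)
print(report)
```

The printed line can be pasted into the results paragraph after the sentence that states whether the difference was statistically significant.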