Chi-Square Test - used when variables are categorical (not continuous)

TYPE 1) Chi Square Goodness of Fit – when we wish to compare observed frequencies to expected ones.

Suppose we know that the percentage of females in the population is 51% and we want to see if this percentage is also present in our class. In other words we wish to see if ~51% of our class are female as would be expected based on their proportion of the population.

#dataset
data <- c(20, 30)
data
## [1] 20 30
summary(data)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    20.0    22.5    25.0    25.0    27.5    30.0
#TYPE 1 - Chi Sq Goodness of Fit
res <- chisq.test(data, p = c(1/2, 1/2)) #1/2 = 50%, a rounded stand-in for the 51%/49% split described above
res
## 
##  Chi-squared test for given probabilities
## 
## data:  data
## X-squared = 2, df = 1, p-value = 0.1573
#Chi Sq Goodness of Fit Test shows that there is no significant difference between what we expected and what we see in our data (p = 0.1573 > 0.05, so we fail to reject the null hypothesis)
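
To make the reported statistic less of a black box, here is a minimal sketch that recomputes it by hand from the same counts. The second part is an assumption for illustration only: the notes do not say which count is the number of females, so it supposes the first count (20) is, and tests against the 51%/49% split from the text instead of 50/50.

```r
# Observed counts from the example above
observed <- c(20, 30)

# Expected counts under the 50/50 null hypothesis
expected <- sum(observed) * c(1/2, 1/2)

# Chi-square statistic: sum over categories of (O - E)^2 / E
x2 <- sum((observed - expected)^2 / expected)

# Upper-tail p-value; df = number of categories - 1
p_value <- pchisq(x2, df = length(observed) - 1, lower.tail = FALSE)

x2       # matches X-squared = 2 reported by chisq.test
p_value  # matches p-value = 0.1573 (rounded)

# ASSUMPTION (not stated in the notes): the first count is females.
# Same test against the 51% / 49% population split.
res_51 <- chisq.test(observed, p = c(0.51, 0.49))
res_51$statistic
res_51$p.value
```

Under that assumption the conclusion does not change: the p-value stays above 0.05.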

TYPE 2) Chi Square Test of Independence – when we wish to see if two groups differ in their observed frequencies across a categorical dependent variable.

Imagine we have run a marketing study where we are interested in whether one of our two ad conditions (AD1 or AD2) increases the likelihood that a person buys a product.

#dataset2
AD1 <- c(25, 25) # Creating AD 1 Row
AD2 <- c(35, 15) # Creating AD 2 Row 

#Converting dataset into data.frame
data2 <- as.data.frame(rbind(AD1, AD2)) # rbind() stacks the two vectors as rows
# Labeling Columns
names(data2) <- c('Bought', 'No')
data2
##     Bought No
## AD1     25 25
## AD2     35 15
summary(data2)
##      Bought           No      
##  Min.   :25.0   Min.   :15.0  
##  1st Qu.:27.5   1st Qu.:17.5  
##  Median :30.0   Median :20.0  
##  Mean   :30.0   Mean   :20.0  
##  3rd Qu.:32.5   3rd Qu.:22.5  
##  Max.   :35.0   Max.   :25.0
#TYPE 2) Chi Square Test of Independence
chisq.test(data2) #AKA Pearson's Chi Sq Test; Yates' continuity correction is applied by default for 2x2 tables
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  data2
## X-squared = 3.375, df = 1, p-value = 0.06619
#Chi Sq Test of Independence shows that the difference between AD1 and AD2 outcomes on purchase is not significant at the 5% level (p = 0.06619 > 0.05), although it comes close
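
Because the p-value sits so close to 0.05, it may help to see where the statistic comes from and how much the continuity correction matters. The sketch below rebuilds the same 2x2 table as a matrix (the name `ads` is introduced here for illustration), computes the Yates-corrected statistic by hand, and then reruns `chisq.test` with `correct = FALSE`.

```r
# Same counts as data2, stored as a matrix
ads <- matrix(c(25, 25, 35, 15), nrow = 2, byrow = TRUE,
              dimnames = list(c("AD1", "AD2"), c("Bought", "No")))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(ads), colSums(ads)) / sum(ads)

# Yates-corrected statistic: subtract 0.5 from each |O - E| before squaring
x2_yates <- sum((abs(ads - expected) - 0.5)^2 / expected)

# df = (rows - 1) * (columns - 1) = 1 for a 2x2 table
p_yates <- pchisq(x2_yates, df = 1, lower.tail = FALSE)

x2_yates  # matches X-squared = 3.375
p_yates   # matches p-value = 0.06619 (rounded)

# Same test without the continuity correction
uncorrected <- chisq.test(ads, correct = FALSE)
uncorrected$statistic
uncorrected$p.value
```

Without the correction the statistic is larger (about 4.17) and the p-value falls below 0.05, so the conclusion here depends on whether the correction is applied. That is a reason to treat this result as borderline rather than as clear evidence either way.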