The table below summarizes a data set that examines the responses of a random sample of college graduates and non-graduates on the topic of oil drilling. Complete a chi-square test of these data to check whether there is a statistically significant difference between the responses of college graduates and non-graduates.
H0: Support for drilling does not differ by college graduate status.
Ha: Support for drilling differs by college graduate status.
alpha=0.05
Y<-c(154,180,104)
N<-c(132,126,131)
dat<-data.frame(Y,N)
rownames(dat)<-c("Support","Oppose","Do not know")
dat
## Y N
## Support 154 132
## Oppose 180 126
## Do not know 104 131
#Calculate expected values
nc1<-sum(Y)
nc2<-sum(N)
nr1<-sum(dat[1,])
nr2<-sum(dat[2,])
nr3<-sum(dat[3,])
n<-sum(dat)
Ye<-c(nr1*nc1/n,nr2*nc1/n,nr3*nc1/n)
Ne<-c(nr1*nc2/n,nr2*nc2/n,nr3*nc2/n)
exp<-data.frame(Ye,Ne)
exp
## Ye Ne
## 1 151.4728 134.5272
## 2 162.0653 143.9347
## 3 124.4619 110.5381
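The expected counts above follow the rule "row total times column total, divided by the grand total". As a minimal sketch, the same table can be built in one step with `outer()`. The names `obs` and `expected` are illustrative and not part of the original solution; the counts are re-entered so the snippet runs on its own.

```r
# Observed counts, re-entered so this sketch is self-contained
obs <- matrix(c(154, 180, 104, 132, 126, 131), ncol = 2,
              dimnames = list(c("Support", "Oppose", "Do not know"),
                              c("Y", "N")))

# Expected count per cell = row total * column total / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
round(expected, 4)
```

Because the expected counts preserve the margins, `sum(expected)` should equal `sum(obs)`, which is a quick sanity check on the arithmetic.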
#Degrees of freedom: (rows-1)*(columns-1)
df<-(3-1)*(2-1)
#Per-cell contributions to the statistic: (expected-observed)^2/expected
chimatrix<-(exp-dat)^2/exp
chimatrix
## Ye Ne
## 1 0.0421645 0.04747571
## 2 1.9847161 2.23471887
## 3 3.3639993 3.78774216
chi<-sum(chimatrix)
chi
## [1] 11.46082
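For reference, the statistic computed above can be written as a formula. The symbols are notation introduced here rather than taken from the original: O_ij is the observed count in row i and column j, E_ij the matching expected count, R_i and C_j the row and column totals, and n the grand total.

$$
\chi^2 = \sum_{i=1}^{3}\sum_{j=1}^{2}\frac{(O_{ij}-E_{ij})^2}{E_{ij}},
\qquad E_{ij} = \frac{R_i\,C_j}{n}
$$

Each entry of `chimatrix` is one term of this double sum, so adding its six entries gives the value 11.46082 reported above.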
#Significance
pval<- 1-pchisq(chi,df )
pval
## [1] 0.003245752
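An equivalent way to reach the decision is to compare the statistic with the critical value of the chi-square distribution at the chosen alpha. The sketch below assumes the values already computed (statistic 11.46082, 2 degrees of freedom, alpha 0.05); the names `chi_stat`, `df_val` and `crit_value` are illustrative only.

```r
alpha      <- 0.05
df_val     <- 2          # (3-1)*(2-1), as computed earlier
chi_stat   <- 11.46082   # test statistic reported above

# Critical value: the upper 5% point of the chi-square distribution with 2 df
crit_value <- qchisq(1 - alpha, df_val)
crit_value

# Reject H0 when the statistic exceeds the critical value
chi_stat > crit_value

# Same upper-tail p-value as 1 - pchisq(...), requested directly
pchisq(chi_stat, df_val, lower.tail = FALSE)
```

The critical value is about 5.99, so a statistic of 11.46 falls in the rejection region, which agrees with the p-value approach.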
#Check against internal R function
chisq.test(dat)
##
## Pearson's Chi-squared test
##
## data: dat
## X-squared = 11.461, df = 2, p-value = 0.003246
Since the p-value (0.0032) is less than alpha (0.05), we reject H0 and conclude that there is a statistically significant difference between the responses of college graduates and non-graduates.