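Since the iris dataset contains only one categorical variable (Species) and the Chi-square test of independence requires two categorical variables, we add the variable size, which is "small" if the sepal length of the flower is smaller than the median of all flowers, and "big" otherwise.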
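The size variable is used throughout but never built in the listings here, so the following is a minimal sketch of one way to construct it. It assumes that iris_data is a plain copy of R's built-in iris data frame and that the split is made on Sepal.Length at its median, which reproduces the counts in the contingency table that follows.

```r
# Assumption: iris_data is a copy of the built-in iris data frame.
iris_data <- iris

# Assumption: "small" means a sepal length below the median of all 150 flowers,
# "big" means a sepal length at or above it.
iris_data$size <- ifelse(
  iris_data$Sepal.Length < median(iris_data$Sepal.Length),
  "small", "big"
)

# Quick check of how many flowers fall in each group.
size_counts <- table(iris_data$size)
size_counts
```

Because ifelse() returns a character vector, size is stored as text; table() and chisq.test() handle it as a categorical variable without further conversion.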
We now create a contingency table of the two variables Species and size with the table() function:
table(iris_data$Species, iris_data$size)
             big small
  setosa       1    49
  versicolor  29    21
  virginica   47     3
The contingency table gives the observed number of cases in each subgroup.
For instance, there is only one big setosa flower, while there are 49 small
setosa flowers in the dataset.
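Counts can be easier to compare across species when expressed as within-species proportions. The sketch below uses prop.table() for this; the names tab and row_props are illustrative, and the first lines rebuild size under the assumption that it is a median split on Sepal.Length.

```r
# Assumption: size is a median split on Sepal.Length, as sketched for iris_data.
iris_data <- iris
iris_data$size <- ifelse(
  iris_data$Sepal.Length < median(iris_data$Sepal.Length),
  "small", "big"
)

# Contingency table of observed counts (rows: Species, columns: size).
tab <- table(iris_data$Species, iris_data$size)

# margin = 1 divides each cell by its row total, so every row sums to 1.
row_props <- prop.table(tab, margin = 1)
round(row_props, 2)
```

Reading along a row then gives the share of big and small flowers within one species, which makes the contrast between setosa and virginica easy to see.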
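Create a barplot to represent the data:
library(ggplot2)
# Use the same data frame as the rest of the analysis and refer to
# the column by name inside aes() instead of iris_data$size.
ggplot(iris_data) +
  aes(x = Species, fill = size) +
  geom_bar() +
  scale_fill_hue() +
  theme_minimal()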

For this example, we are going to test in R if there is a relationship
between the variables Species and size. For this, the chisq.test() function is used:
test <- chisq.test(table(iris_data$Species, iris_data$size))
test
Pearson's Chi-squared test
data: table(iris_data$Species, iris_data$size)
X-squared = 86.035, df = 2, p-value < 2.2e-16
Everything you need appears in this output: the title of the test, which
variables were used, the test statistic, the degrees of freedom
and the p-value of the test. You can also retrieve the χ² test statistic
and the p-value with:
test$statistic # test statistic
X-squared
86.03451
test$p.value # p-value
[1] 2.078944e-19
If you need the observed and expected frequencies, use test$observed and test$expected:
test$observed
             big small
  setosa       1    49
  versicolor  29    21
  virginica   47     3
test$expected # expected frequencies
                  big    small
  setosa     25.66667 24.33333
  versicolor 25.66667 24.33333
  virginica  25.66667 24.33333
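To see where the expected frequencies and the test statistic come from, they can be recomputed by hand from the observed table. This is a sketch of the standard Pearson formula rather than the internals of chisq.test(); the names observed, expected, statistic, dof and p_value are illustrative, and size is again assumed to be a median split on Sepal.Length.

```r
# Assumption: size is a median split on Sepal.Length.
iris_data <- iris
iris_data$size <- ifelse(
  iris_data$Sepal.Length < median(iris_data$Sepal.Length),
  "small", "big"
)
observed <- table(iris_data$Species, iris_data$size)

# Expected count under independence: (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic: sum over all cells of (observed - expected)^2 / expected.
statistic <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (number of rows - 1) * (number of columns - 1).
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-square distribution gives the p-value.
p_value <- pchisq(statistic, df = dof, lower.tail = FALSE)
```

For a table larger than 2 x 2, chisq.test() applies no continuity correction, so these values should agree with test$expected, test$statistic and test$p.value.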
From the output and from test$p.value we see that the p-value is less
than the significance level of 5%. As with any other statistical test, if
the p-value is less than the significance level, we can reject the null hypothesis.
⇒ In our context, rejecting the null hypothesis for the Chi-square test of
independence means that there is a significant relationship between the
species and the size. Therefore, knowing the value of one variable helps
to predict the value of the other variable.
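The decision rule described above can also be written out explicitly. The following is a small sketch; alpha and decision are illustrative names, the 5% level is the one used in the text, and size is assumed to be a median split on Sepal.Length.

```r
# Assumption: size is a median split on Sepal.Length.
iris_data <- iris
iris_data$size <- ifelse(
  iris_data$Sepal.Length < median(iris_data$Sepal.Length),
  "small", "big"
)
test <- chisq.test(table(iris_data$Species, iris_data$size))

# Significance level of 5%.
alpha <- 0.05

# Reject the null hypothesis of independence when the p-value is below alpha.
decision <- if (test$p.value < alpha) {
  "reject the null hypothesis of independence"
} else {
  "do not reject the null hypothesis of independence"
}
decision
```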