Student Details
Karan Bhatia (S3803784)
Shakya Nandan (s3825833)
Manoj Nepal (S3788584)
Fahim Ashab (S3812295)
Problem Statement
In this assignment, we test whether there is a statistically significant association between age group and gender (male and female) among individuals involved in accidents, using a hypothesis test of association. There are many variables in the dataset. There are a total of 51643 individuals, of whom 36871 are male and 14663 are female. We have chosen the variables Gender and Age Group to check whether there is any association between them.
Load Packages
The following packages are loaded for the analysis in this assignment.
library(readr)
library(ggplot2)
library(dplyr)
library(magrittr)
library(lattice)
library(readxl)
library(RColorBrewer)
Data analysis
We compared the proportions of males and females who were involved in accidents across the age groups.
table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender)%>% prop.table(margin = 2)
                Female       Male
0_to_16     0.10543545 0.06633940
17_to_25    0.21428084 0.28217298
26_to_39    0.18024961 0.25583792
40_to_64    0.25485917 0.25233924
65_to_74    0.10441247 0.06427816
75_or_older 0.14076246 0.07903230
table <- table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender) %>% prop.table(margin = 2)
When we compare the age-group proportions within each gender, the 40_to_64 group accounts for the largest share of females involved in accidents (0.25485917). Among males, the 17_to_25 group has the largest share, with a proportion of 0.28217298.
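The column proportions can be reproduced without the original CSV. The following is a minimal sketch that assumes the counts are re-typed by hand from the `ch2$observed` output in this report; the names `observed`, `col_prop` and `largest_group` are illustrative and not part of the original analysis.

```r
# Observed counts copied by hand from the ch2$observed table in this report
# (assumption: re-typed here so the sketch runs without the original CSV).
observed <- cbind(
  Female = c(1546, 3142, 2643, 3737, 1531, 2064),
  Male   = c(2446, 10404, 9433, 9304, 2370, 2914)
)
rownames(observed) <- c("0_to_16", "17_to_25", "26_to_39",
                        "40_to_64", "65_to_74", "75_or_older")

# margin = 2 divides each cell by its column total, so every gender
# column sums to 1 and the rows can be compared between genders.
col_prop <- prop.table(observed, margin = 2)
colSums(col_prop)

# Age group with the largest share within each gender.
largest_group <- apply(col_prop, 2, function(p) names(which.max(p)))
largest_group
```

Because `margin = 2` normalises within each gender, the proportions describe how the age groups are distributed among females and among males, not how gender is distributed within an age group.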
Descriptive Statistics and Visualisation
The descriptive statistics of the data were obtained with the summary function in R, separately for males and females.
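One way to visualise the within-gender proportions is a grouped bar plot. The sketch below uses base R graphics and assumes the counts are re-typed from the `ch2$observed` output; the grey palette and the axis labels are illustrative choices, and `RColorBrewer::brewer.pal(6, "RdBu")` could be substituted for the colours.

```r
# Observed counts copied by hand from the ch2$observed table in this report
# (assumption: re-typed here so the sketch runs without the original CSV).
observed <- cbind(
  Female = c(1546, 3142, 2643, 3737, 1531, 2064),
  Male   = c(2446, 10404, 9433, 9304, 2370, 2914)
)
rownames(observed) <- c("0_to_16", "17_to_25", "26_to_39",
                        "40_to_64", "65_to_74", "75_or_older")
col_prop <- prop.table(observed, margin = 2)

# beside = TRUE draws one cluster of bars per gender, one bar per age group.
bar_mid <- barplot(
  col_prop,
  beside = TRUE,
  ylim = c(0, 0.4),
  xlab = "Gender",
  ylab = "Proportion within gender",
  col = gray.colors(nrow(col_prop)),
  legend.text = rownames(col_prop),
  args.legend = list(x = "top", ncol = 3, title = "Age group")
)
```

If age group and gender were unrelated, the two clusters of bars would have roughly the same shape.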

Hypothesis Testing
The statistical hypotheses for this Chi-square test of association can be written as follows:
H0: There is no association in the population between the gender and age_group (independence)
HA: There is an association in the population between the gender and age_group (dependence)
ch2<-chisq.test(table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender))
ch2
Pearson's Chi-squared test
data: table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender)
X-squared = 1284.1, df = 5, p-value < 2.2e-16
As this p-value was less than the 0.05 level of significance, H0 was rejected. There was a statistically significant association between age group and gender among those involved in accidents.
Calculation
Here are the tables of observed and expected values from which the chi-square value is calculated:
ch2$observed
            Female  Male
0_to_16       1546  2446
17_to_25      3142 10404
26_to_39      2643  9433
40_to_64      3737  9304
65_to_74      1531  2370
75_or_older   2064  2914
ch2$expected
              Female     Male
0_to_16     1135.846 2856.154
17_to_25    3854.252 9691.748
26_to_39    3435.992 8640.008
40_to_64    3710.564 9330.436
65_to_74    1109.954 2791.046
75_or_older 1416.393 3561.607
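The expected values follow from the row and column totals: under independence, each expected count is (row total × column total) / grand total. A minimal sketch, assuming the observed counts are re-typed by hand from the table above (the object names are illustrative):

```r
# Observed counts copied by hand from the ch2$observed table in this report
# (assumption: re-typed here so the sketch runs without the original CSV).
observed <- cbind(
  Female = c(1546, 3142, 2643, 3737, 1531, 2064),
  Male   = c(2446, 10404, 9433, 9304, 2370, 2914)
)
rownames(observed) <- c("0_to_16", "17_to_25", "26_to_39",
                        "40_to_64", "65_to_74", "75_or_older")

# outer() forms every row-total x column-total product; dividing by the
# grand total gives the count expected in each cell if H0 were true.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
round(expected, 3)

# All expected counts are well above 5, the usual rule of thumb for
# the chi-square approximation.
all(expected >= 5)
```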
qchisq(p = .95,df = 5)
[1] 11.0705
The critical value was found to be 11.07. We reject H0 when χ2 > χ2crit. Here χ2 = 1284.1 > 11.07, so H0 is rejected.
pchisq(q = 1284.1,df = 5,lower.tail = FALSE)
[1] 1.778112e-275
ch2$p.value
[1] 1.759893e-275
As this p-value was less than the 0.05 level of significance, H0 was rejected. There was a statistically significant association between age group and gender for involvement in accidents.
Hand Calculation
The chi-square value is the sum of (observed - expected)^2 / expected over all twelve cells of the table:
Chi-square value = ((1546-1135.846)^2)/1135.846 + ((2446-2856.154)^2)/2856.154
 + ((3142-3854.252)^2)/3854.252 + ((10404-9691.748)^2)/9691.748
 + ((2643-3435.992)^2)/3435.992 + ((9433-8640.008)^2)/8640.008
 + ((3737-3710.564)^2)/3710.564 + ((9304-9330.436)^2)/9330.436
 + ((1531-1109.954)^2)/1109.954 + ((2370-2791.046)^2)/2791.046
 + ((2064-1416.393)^2)/1416.393 + ((2914-3561.607)^2)/3561.607
 ≈ 1284.1
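The hand calculation can be cross-checked in R by summing (observed - expected)^2 / expected over every cell. This is a sketch under the assumption that the counts are re-typed from the `ch2$observed` output; `chi_sq`, `dof`, `critical` and `p_value` are illustrative names.

```r
# Observed counts copied by hand from the ch2$observed table in this report
# (assumption: re-typed here so the sketch runs without the original CSV).
observed <- cbind(
  Female = c(1546, 3142, 2643, 3737, 1531, 2064),
  Male   = c(2446, 10404, 9433, 9304, 2370, 2914)
)
rownames(observed) <- c("0_to_16", "17_to_25", "26_to_39",
                        "40_to_64", "65_to_74", "75_or_older")

# Expected counts under independence.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic: sum over all 12 cells (6 age groups x 2 genders).
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1) = 5.
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)

critical <- qchisq(0.95, df = dof)                       # about 11.07
p_value  <- pchisq(chi_sq, df = dof, lower.tail = FALSE)

chi_sq              # about 1284.1, matching chisq.test
chi_sq > critical   # TRUE, so H0 is rejected
```

Summing over all six age groups for both genders is what makes the result agree with the X-squared value reported by `chisq.test`.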
Interpretation
From the R results, we observe that the chi-square critical value is 11.0705. The chi-square statistic reported by chisq.test is 1284.1, which is far greater than the critical value, so we reject the null hypothesis and conclude that there is an association between age group and gender. The p-value is less than 0.05, which supports the same conclusion. The bar plot is consistent with this: if there were no association, the age-group proportions would be about the same for males and females, but the bar heights differ between the two genders. The data therefore provide strong statistical evidence of an association between gender and age group.