Student Details

Karan Bhatia (S3803784)

Shakya Nandan (s3825833)

Manoj Nepal (S3788584)

Fahim Ashab (S3812295)

Problem Statement

In this assignment, we test whether there is a statistically significant association between gender (male and female) and age group among people involved in road accidents, using a chi-square test of association. The dataset contains many variables. There are a total of 51643 individuals, of whom 36871 are male and 14663 are female. We have chosen the variables Gender and Age Group to check whether there is any association between them.

Load Packages

The following packages are used in this assignment.

library(readr)
library(ggplot2)
library(dplyr)
library(magrittr)
library(lattice)
library(readxl)
library(RColorBrewer)

Data

The data for this assignment were obtained from https://data.gov.au/data/dataset/australian-road-deaths.

Below is the code for importing the data.

# Import the data; the object is named ardd_fatalities to match the code that follows
ardd_fatalities <- read_csv("ardd_fatalities .csv")
View(ardd_fatalities)

Data Analysis

We compared the proportions of males and females involved in accidents across age groups.

table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender)%>% prop.table(margin = 2)
             
                  Female       Male
  0_to_16     0.10543545 0.06633940
  17_to_25    0.21428084 0.28217298
  26_to_39    0.18024961 0.25583792
  40_to_64    0.25485917 0.25233924
  65_to_74    0.10441247 0.06427816
  75_or_older 0.14076246 0.07903230
library(RColorBrewer)
table <- table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender) %>% prop.table(margin = 2)

When we compare the proportions within each gender, the largest share of females involved in accidents is in the 40_to_64 age group (0.25485917). Among males, the largest share is in the 17_to_25 age group (0.28217298).
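The comparison above can be reproduced without the CSV file. The following is a minimal sketch that assumes the counts printed by ch2$observed in this report; the names observed, col_props and modal_group are illustrative and do not appear in the original code.

```r
# Observed counts copied from the ch2$observed output in this report
observed <- matrix(
  c(1546,  2446,
    3142, 10404,
    2643,  9433,
    3737,  9304,
    1531,  2370,
    2064,  2914),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("0_to_16", "17_to_25", "26_to_39", "40_to_64", "65_to_74", "75_or_older"),
    c("Female", "Male")
  )
)

# Column proportions: each count divided by the total for that gender
col_props <- prop.table(observed, margin = 2)

# Age group with the largest share within each gender
modal_group <- rownames(col_props)[apply(col_props, 2, which.max)]
names(modal_group) <- colnames(col_props)

print(round(col_props, 4))
print(modal_group)
```

Using margin = 2 makes each column sum to 1, so the values are shares within a gender rather than shares of the whole table.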

Descriptive Statistics and Visualisation

Descriptive statistics for the data were obtained with R's summary function, separately for males and females.

Hypothesis Testing

The statistical hypotheses for this chi-square test of association can be written as follows:

H0: There is no association in the population between the gender and age_group (independence)

HA: There is an association in the population between the gender and age_group (dependence)

ch2<-chisq.test(table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender))
ch2

    Pearson's Chi-squared test

data:  table(ardd_fatalities$`Age Group`, ardd_fatalities$Gender)
X-squared = 1284.1, df = 5, p-value < 2.2e-16

As this p-value was less than the 0.05 level of significance, H0 was rejected. There was a statistically significant association between age group and gender among those involved in accidents.

Calculation

Here are the tables of observed and expected values from which chi-square value is calculated:

ch2$observed
             
              Female  Male
  0_to_16       1546  2446
  17_to_25      3142 10404
  26_to_39      2643  9433
  40_to_64      3737  9304
  65_to_74      1531  2370
  75_or_older   2064  2914
ch2$expected
             
                Female     Male
  0_to_16     1135.846 2856.154
  17_to_25    3854.252 9691.748
  26_to_39    3435.992 8640.008
  40_to_64    3710.564 9330.436
  65_to_74    1109.954 2791.046
  75_or_older 1416.393 3561.607
qchisq(p = .95,df = 5)
[1] 11.0705

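The critical value at the 0.05 level of significance with 5 degrees of freedom was found to be 11.07. We reject H0 when χ2 > χ2crit. Here the test statistic of 1284.1 is far greater than 11.07, so H0 is rejected.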
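For reference, the statistic behind this decision rule is the usual Pearson chi-square statistic. The symbols O_ij (observed count), E_ij (expected count), R_i (row total), C_j (column total) and N (grand total) are notation introduced here and are not used elsewhere in the report.

```latex
\[
\chi^2 = \sum_{i=1}^{6}\sum_{j=1}^{2} \frac{(O_{ij}-E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i\,C_j}{N},
\qquad
df = (6-1)(2-1) = 5
\]
```

The table has 6 age groups and 2 genders, which gives the 5 degrees of freedom reported by chisq.test.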

pchisq(q = 1284.1,df = 5,lower.tail = FALSE)
[1] 1.778112e-275
ch2$p.value
[1] 1.759893e-275

As this p-value was also less than the 0.05 level of significance, H0 was rejected. There was a statistically significant association between age group and gender among those involved in accidents.

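Hand Calculation

Chi-square value = ((1546-1135.846)^2)/1135.846 + ((2446-2856.154)^2)/2856.154 + ((3142-3854.252)^2)/3854.252 + ((10404-9691.748)^2)/9691.748 + ((2643-3435.992)^2)/3435.992 + ((9433-8640.008)^2)/8640.008 + ((3737-3710.564)^2)/3710.564 + ((9304-9330.436)^2)/9330.436 + ((1531-1109.954)^2)/1109.954 + ((2370-2791.046)^2)/2791.046 + ((2064-1416.393)^2)/1416.393 + ((2914-3561.607)^2)/3561.607 ≈ 1284.1

The sum runs over all 12 cells of the table, including the two cells of the 75_or_older row, and each term divides by the expected value for that cell. The result agrees with the X-squared value reported by chisq.test.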
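The hand calculation can be cross-checked in R. The sketch below assumes the counts printed by ch2$observed and rebuilds the expected counts, the statistic, the critical value and the p-value from them; the names observed, expected, chi_sq, df, crit and p_value are illustrative and not part of the original code.

```r
# Observed counts copied from the ch2$observed output in this report
observed <- matrix(
  c(1546,  2446,
    3142, 10404,
    2643,  9433,
    3737,  9304,
    1531,  2370,
    2064,  2914),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("0_to_16", "17_to_25", "26_to_39", "40_to_64", "65_to_74", "75_or_older"),
    c("Female", "Male")
  )
)

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic summed over all 12 cells
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Critical value at the 0.05 level and the upper-tail p-value
crit <- qchisq(p = 0.95, df = df)
p_value <- pchisq(q = chi_sq, df = df, lower.tail = FALSE)

print(round(expected, 3))
print(c(chi_sq = chi_sq, df = df, critical = crit, p_value = p_value))

# Cross-check against the built-in test on the same counts
print(chisq.test(observed)$statistic)
```

If the counts are entered correctly, chi_sq should match the X-squared value from chisq.test, since no continuity correction is applied to tables larger than 2 x 2.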

Interpretation

From the R results, the chi-square critical value is 11.0705. The chi-square statistic reported by chisq.test is 1284.1, which is far greater than the critical value, so we reject the null hypothesis and conclude that there is an association between age group and gender. The p-value is also less than 0.05, which leads to the same conclusion. The bar plot is consistent with this result: the bars for each age group differ in height between males and females, whereas with no association the proportions would be similar for both genders.

References

  1. https://www.rdocumentation.org/packages/MASS/versions/7.3-52/topics/fitdistr
  2. https://data.gov.au/data/dataset/australian-road-deaths-database
  3. https://www.r-tutor.com/elementary-statistics/hypothesis-testing