DATA1001 Project 2 EDA

Author

SID 560644890

Visualisations

Discussion

This analysis uses the NYPD Complaint Data Historic dataset to explore the association between crime type and suspect gender. The proportional bar chart shows a substantial overrepresentation of male suspects in all categories, particularly in white collar crimes and sex offenses. A chi-squared test rejects the null hypothesis of independence, confirming a statistically significant association between crime type and suspect gender (χ² = 810.38, p < 2.2e-16). However, the Cramér’s V of 0.119 indicates that the overall strength of this relationship is weak. The standardised residuals identify the primary contributors to this association: relative to the counts expected under independence, male suspects are strongly overrepresented in property and sex-related offenses, and female suspects in public order offenses. The joint distribution of suspect and victim sex is asymmetric, with prevalent patterns (e.g. male suspect/female victim in sex-related offenses), indicating that gender dynamics depend on crime category rather than being uniform across categories.
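
The three quantities discussed above (the chi-squared statistic, Cramér's V and the standardised residuals) can be sketched in base R as follows. This is a minimal illustration, not the code behind the reported results: the counts in `tab` are hypothetical placeholders rather than NYPD figures, and the object names (`tab`, `test`, `cramers_v`, `std_res`) are assumptions made for the example. Cramér's V is computed by hand as sqrt(χ² / (n × (min(rows, columns) − 1))).

```r
# Hypothetical counts for illustration only; these are NOT the NYPD figures.
tab <- matrix(
  c(120,  80,
    300,  60,
     90, 110),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    crime_type  = c("Property", "Sex-related", "Public order"),
    suspect_sex = c("Male", "Female")
  )
)

# Chi-squared test of independence between crime type and suspect sex.
test <- chisq.test(tab)

# Cramer's V: sqrt(chi^2 / (n * (min(rows, cols) - 1))), bounded in [0, 1].
n <- sum(tab)
cramers_v <- sqrt(unname(test$statistic) / (n * (min(dim(tab)) - 1)))

# Standardised residuals: positive cells have more observations than
# expected under independence, negative cells have fewer.
std_res <- test$stdres

print(test)
print(round(cramers_v, 3))
print(round(std_res, 2))
```

Reading the output side by side shows why a very small p-value and a small Cramér's V can coexist: the p-value reflects how much evidence there is against independence, which grows with sample size, whereas Cramér's V rescales the statistic by n to describe the strength of the association.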

Acknowledgements

AI Usage:

ChatGPT version 5.3 (https://chatgpt.com/), developed by OpenAI, was used to assist in coding visualisations, finding appropriate packages to use in RStudio, and debugging code.