DATA1001 Project 2 EDA
Visualisations
Discussion
This analysis utilises the NYPD Complaint Data Historic dataset to explore the association between crime type and suspect gender. The proportional bar chart demonstrates a substantial overrepresentation of male suspects in all categories, particularly in white collar crimes and sex offenses. A chi-squared test rejects the null hypothesis of independence, confirming a statistically significant association between crime type and suspect gender (χ² = 810.38, p < 2.2e-16). However, the Cramér's V of 0.119 indicates that the overall strength of this relationship is weak. The standardised residuals identify the primary contributors to this association: male suspects are strongly overrepresented in property and sex-related offenses, and female suspects in public order offenses. The joint distribution of suspect and victim sex is asymmetric, with prevalent patterns (e.g. male suspect/female victim in sex-related offenses), indicating that gender dynamics depend on the crime category rather than being uniform across categories.
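The three quantities discussed above (the chi-squared statistic, Cramér's V and the standardised residuals) can all be computed from a contingency table of counts. The following is a minimal sketch in Python, not the R code used for this project. The function name `association_summary` and the counts in `toy_counts` are hypothetical and chosen purely for illustration; they are not the NYPD figures. The residuals use the adjusted form, (observed - expected) / sqrt(expected × (1 - row share) × (1 - column share)), which is assumed to match what was reported here.

```python
import math


def association_summary(table):
    """Return (chi2, cramers_v, stdres) for a contingency table of counts.

    `table` is a list of rows, e.g. one row per crime category and one
    column per suspect sex. All expected counts must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    stdres = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Adjusted standardised residual for this cell.
            scale = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            res_row.append((observed - expected) / scale)
        stdres.append(res_row)

    # Cramér's V rescales chi2 by sample size and table dimensions to 0..1.
    k = min(len(table), len(table[0])) - 1
    cramers_v = math.sqrt(chi2 / (n * k))
    return chi2, cramers_v, stdres


# Hypothetical counts (NOT the NYPD data): rows are crime categories,
# columns are suspect sex (male, female).
toy_counts = [
    [120, 40],
    [80, 90],
    [60, 10],
]

chi2, cramers_v, stdres = association_summary(toy_counts)
print(f"chi-squared = {chi2:.2f}, Cramér's V = {cramers_v:.3f}")
for row in stdres:
    print([round(r, 2) for r in row])
```

Because the chi-squared statistic grows with sample size while Cramér's V divides it out, a very large dataset can give a tiny p-value alongside a small V, which is the pattern reported above. Cells whose standardised residuals are large in magnitude (commonly beyond about ±2) are the ones contributing most to the association.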
Acknowledgements
AI Usage:
ChatGPT version 5.3 (https://chatgpt.com/), developed by OpenAI, was used to assist in coding visualisations, finding appropriate packages to use in RStudio, and debugging code.