Amanjeet Singh Randhawa s3869950
25th October 2020
An online presentation of the findings can be found here:
Crime is a multifaceted occurrence and remains a constant topic of discussion.
Exploring the factors influencing the crime rate at any given time could provide valuable information for important policy decisions.
Here we explore the relationship between average weekly earnings and the crime rate in Australia for the period 2010-2019.
Can we predict the crime rate using earnings data?
Using simple linear regression and correlation, we will examine the strength of the relationship between crime rate and earnings.
Crime [2], wage [3] and population [4] data was collected from the Australian Bureau of Statistics website on October 24th 2020. After importing into R, the data was tidied and manipulated in order to produce a format appropriate for statistical analysis. This included:
The final dataset contained 6 variables:
# Colour palettes: translucent fills (c3) and darker median lines (c2)
c3 <- rainbow(10, alpha = 0.2)
c2 <- rainbow(10, v = 0.7)
boxplot(Combined_mutated$Crime_percapita ~ Combined_mutated$State,
        names = c("ACT", "NSW", "NT", "QLD", "SA", "TAS", "VIC", "WA"),
        main = "Australian Crime rate per 100 persons by State", ylab = "Crime rate per 100 persons",
        xlab = "State", col = c3, outcol = c3, medcol = c2)
From the plot we observe an outlier for ACT. Upon further investigation, this corresponds to the observation for 2010. As this could indicate an important feature of the data, it was retained in the final dataset.
Northern Territory and Western Australia have the highest median crime rate while Tasmania has the lowest.
Earnings_plot <- Combined_mutated %>% ggplot(aes(x = Year, y = Earnings, color = `State`)) +
  geom_point() + geom_line(aes(group = `State`)) +
  labs(title = "Australian Average Weekly Earnings by Year")
Earnings_plot1 <- Earnings_plot + labs(y = "Earnings (AUD)", x = "Year")
Earnings_plot1
All states had an increase in average weekly earnings with the rate of increase being fairly constant.
Australian Capital Territory and Northern Territory had the greatest average weekly earnings while Tasmania consistently had the lowest.
Combined_mutated %>% group_by(State) %>% summarise(Min = min(Crime_percapita, na.rm = TRUE),
    Median = median(Crime_percapita, na.rm = TRUE),
    Max = max(Crime_percapita, na.rm = TRUE),
    Mean = mean(Crime_percapita, na.rm = TRUE),
    SD = sd(Crime_percapita, na.rm = TRUE))
plot(Combined_mutated$Earnings, Combined_mutated$Crime_percapita,
main = "Crime rate by Average weekly earnings",
xlab = "Average Weekly Earnings(AUD)", ylab = "Crime rate per 100 persons")
abline(lm(Combined_mutated$Crime_percapita ~ Combined_mutated$Earnings), col = "red")
r <- cor(Combined_mutated$Earnings, Combined_mutated$Crime_percapita)
# CIr() is provided by the psychometric package
CIr(r = r, n = 80, level = .95)
## [1] 0.2379259 0.5975268
##                  Earnings Crime_percapita
## Earnings        1.0000000       0.4349074
## Crime_percapita 0.4349074       1.0000000
From the plot we observe a positive linear relationship between crime rate and earnings. The positive correlation was moderate and statistically significant, \(r = .43\), \(p<.001\), 95% CI [.24, .60].
We can therefore continue with linear regression.
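The reported interval can be checked by hand. The following is a minimal sketch in base R, assuming the interval is based on the Fisher z-transformation (an assumption, although it does reproduce the bounds printed above). The values of r and n are copied from the output; nothing here depends on the original dataset.

```r
# Values copied from the output above
r <- 0.4349074   # Pearson correlation between Earnings and Crime_percapita
n <- 80          # number of observations passed to CIr()

# Fisher z-transformation: atanh(r) is approximately normal
# with standard error 1 / sqrt(n - 3)
z    <- atanh(r)
se   <- 1 / sqrt(n - 3)
crit <- qnorm(0.975)   # two-sided 95% critical value

# Back-transform the limits to the correlation scale
ci <- tanh(z + c(-1, 1) * crit * se)
round(ci, 2)
```

Rounded to two decimals, the limits are .24 and .60, matching the interval reported in the text.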
Earningscrimemodel <- lm(Crime_percapita ~ Earnings, data = Combined_mutated)
plot(Earningscrimemodel)
From these plots we can confirm:
It is therefore safe to continue with linear regression.
\[H_0: \text{Crime rate and earnings data do not fit the linear regression model}\]
\[H_A: \text{Crime rate and earnings data fit the linear regression model}\]
##
## Call:
## lm(formula = Crime_percapita ~ Earnings, data = Combined_mutated)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.16349 -0.94573 -0.03454  0.86428  2.98666
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.2830818  1.0123460   0.280    0.781
## Earnings    0.0035971  0.0008433   4.266 5.55e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.17 on 78 degrees of freedom
## Multiple R-squared: 0.1891, Adjusted R-squared: 0.1787
## F-statistic: 18.19 on 1 and 78 DF, p-value: 5.547e-05
From the summary, we can confirm \(p<.001\); therefore, we reject \(H_0\). Thus, there was statistically significant evidence that the data fit a linear regression model.
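To show what the fitted line implies for the question of prediction, the sketch below plugs the coefficients from the summary into the regression equation. The earnings value of 1200 AUD is a hypothetical input chosen only for illustration, and `predict_crime` is a helper name introduced here, not part of the original analysis.

```r
# Coefficients copied from the regression summary above
intercept <- 0.2830818
slope     <- 0.0035971   # change in crimes per 100 persons per extra AUD of weekly earnings
slope_se  <- 0.0008433   # standard error of the slope
df_resid  <- 78          # residual degrees of freedom

# Point prediction from the fitted line (hypothetical helper)
predict_crime <- function(earnings) intercept + slope * earnings

# Hypothetical weekly earnings value, for illustration only
predict_crime(1200)

# 95% confidence interval for the slope, from the t distribution
slope_ci <- slope + c(-1, 1) * qt(0.975, df_resid) * slope_se
slope_ci
```

Under these coefficients, an extra 100 AUD in average weekly earnings corresponds to roughly 0.36 more recorded crimes per 100 persons. With a residual standard error of 1.17, any single prediction remains imprecise.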
After calculating a Pearson’s correlation coefficient, we found a positive linear relationship between crime rate and earnings. Furthermore, the positive correlation was statistically significant, \(r = .43\), \(p<.001\), 95% CI [.24, .60].
After assessing the bivariate relationship between crime rate and average weekly earnings, there was evidence of a positive linear relationship. The regression model was statistically significant, \(F(1, 78) = 18.19\), \(p<.001\), and explained 18.9% of the variability in crime rates, \(R^2 = 0.189\).
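With a single predictor, the correlation and regression results are two views of the same quantity: \(R^2\) equals \(r^2\), and the F statistic equals the squared t value of the slope. The sketch below is a quick consistency check using only the figures reported above; the variable names are illustrative.

```r
# Figures copied from the correlation and regression output
r       <- 0.4349074   # Pearson correlation
n       <- 80          # number of observations
t_slope <- 4.266       # t value for the Earnings coefficient

# In simple linear regression, R-squared is the squared correlation
r_squared <- r^2

# F can be recovered either from R-squared or from the slope's t value
f_from_r2 <- r_squared / (1 - r_squared) * (n - 2)
f_from_t  <- t_slope^2

round(c(R2 = r_squared, F_from_R2 = f_from_r2, F_from_t = f_from_t), 3)
```

All three values agree with the reported \(R^2 = 0.189\) and \(F(1, 78) = 18.19\) up to rounding.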
As average weekly earnings increased, so did crime rates. This is contrary to the expectation that crime rate would decrease as wages went up.
Strengths of the study include the recording methods used, which provided large sample sizes for each measure. Furthermore, as the data were collected by government agencies, they likely provide an accurate picture of the population.
Limitations include the possibility that states define crimes differently, which would affect the number of crimes recorded. Furthermore, assault cases were omitted; if assault rates differ by state, this omission may have affected the results.
Using average weekly earnings also introduces limitations. Earnings may be skewed by the composition of the workforce in any given recording period. Additionally, the presence of a wealth gap would likely skew the findings.
In future analysis, controlling for the distribution of earnings as well as crime type would help elucidate the quality and strength of the relationship. Furthermore, exploring the impact of additional predictor variables would provide a more complete picture of the crime rate.
[1] https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
[2] https://www.abs.gov.au/statistics/people/crime-and-justice/recorded-crime-victims-australia/latest-release
[3] https://www.abs.gov.au/methodologies/average-weekly-earnings-australia-methodology/may-2020#glossary
[4] http://stat.data.abs.gov.au/Index.aspx?DataSetCode=ERP_QUARTERLY#