James Twigg s3874566
Last updated: 25 October, 2020
According to the Crime Statistics Agency (CSA), the rate of criminal incidents in Victoria has been gradually increasing since 2011, as can be seen in figure 1 below (2020c). Their website notes “that movements in recorded crime data may be impacted by changes in legislation and operational police practice,” which could account for some of the increase (2020c). It would be important to check, however, whether a particular region in the state accounts for an unusually high or low proportion of the data. The purpose of this investigation was to see if, in 2020, there was a substantial difference in the rate of criminal incidents between the Eastern and Western sides of regional Victoria, Australia.
Victoria is divided into four separate policing regions, two representing metropolitan councils and two representing regional councils. As can be seen in figure 2, the regional councils account for a substantially larger portion of land than the metropolitan areas (Victoria Police 2020). The Eastern and Western police regions are similar in a number of geographical and sociological ways, so a divergence between the two on rates of criminal incidents would be concerning. After proceeding with a two-sample \(t\)-test on the two populations, however, this analysis found no statistically significant difference between their means. The following report shows the process that led to this conclusion and ends by suggesting how to extend this investigation.
Figure 2: Geographical borders of all four Victorian police regions. Source: https://www.police.vic.gov.au/about-victoria-police
The aim of this analysis was to determine if there is a statistically significant difference between the means of crime rates in the Eastern and Western police regions of Victoria, Australia in 2020. This was determined using a two-sample \(t\)-test.
The police.region variable was converted to an unordered factor.

library(dplyr)  # provides %>%, rename(), filter(), group_by() and summarise()

# Read the LGA table from the second sheet of the CSA workbook
crime <- readxl::read_excel("crime.xlsx", sheet = 2)

# Give the columns syntactic names
crime <- crime %>%
  dplyr::rename(lga.name = `Local Government Area`,
                year = Year,
                year.ending = `Year ending`,
                police.region = `Police Region`,
                criminal.incidents = `Incidents Recorded`,
                incident.rate = `Rate per 100,000 population`)

# Drop the numeric prefixes and store the police regions as an unordered factor
crime$police.region <- crime$police.region %>%
  factor(levels = c("1 North West Metro", "2 Eastern", "3 Southern Metro", "4 Western"),
         labels = c("North West Metro", "Eastern", "Southern Metro", "Western"),
         ordered = FALSE)

# Keep the 2020 observations for individual councils only
crime <- crime %>%
  filter(year == 2020,
         lga.name != "Total" &
           lga.name != "Justice Institutions and Immigration Facilities" &
           lga.name != "Unincorporated Vic")

Provided by the Crime Statistics Agency (CSA) (2020a), this data gives the number of criminal incidents recorded and the rate of occurrence per 100,000 people in Victoria, grouped by area and ranging over the years 2010 to 2020. As for how it was obtained,
"The crime statistics produced by the CSA are derived from administrative information recorded by Victoria Police and extracted from the LEAP database" (CSA 2020b)
This data was obtained from the CSA website, the link for which can be found in the references. The variables are:
Year (Interval): Year values ranging from 2011 to 2020.
Year ending (Interval): Paired with the year column to indicate the reference period for each observation. The reporting period runs from 1 July of the previous year to 30 June of the year in the same row. The column thus exclusively contains the string value ‘June’.
Police Region (Nominal): Broad, regional policing boundaries in Victoria, including ‘North West Metro’, ‘Southern Metro’, ‘Eastern’ and ‘Western’.
Local Government Area (Nominal): The 79 distinct local government areas (LGA) in Victoria. Each council is located in one, and only one, police region.
Incidents Recorded (Ratio): Number of criminal incidents recorded in each council. Incidents are described by the CSA (2020b) as, “a criminal event that may include multiple offences, alleged offenders and/or victims that is recorded on the LEAP database on a single date and at one location.”
Incident Rate (Ratio): Number of criminal incidents per 100,000 people in a council’s population. This is calculated by the CSA (2020b) as,
"Offence rate = (Offence count/ERP count) *100,000"
where ERP stands for “Estimated Resident Population.”
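As a rough illustration of the CSA formula, the sketch below computes a rate per 100,000 people in R. The function name `incident_rate` and the counts used are hypothetical and are not taken from the CSA data.

```r
# Hypothetical helper mirroring the CSA formula:
# rate = (incident count / estimated resident population) * 100,000
incident_rate <- function(incident.count, erp) {
  (incident.count / erp) * 100000
}

# Made-up example: 500 incidents recorded in a council of 10,000 residents
example.rate <- incident_rate(incident.count = 500, erp = 10000)
print(example.rate)
```

Expressing incidents per 100,000 people allows councils with very different populations to be compared on one scale.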
The CSA website can be referred to for more detailed information: https://www.crimestatistics.vic.gov.au/about-the-data/explanatory-notes#Reference%20periods
The mean incident rates, provided below by this code for descriptive statistics, will be given particular attention in this investigation. It was important to check for normality, particularly in the Eastern group of data, given the smaller sample size, \(n = 25\). Histograms are provided on the next slide for an initial glimpse at the distributions.
knitr::kable(crime %>%
  group_by(police.region) %>%
  filter(police.region == "Eastern" | police.region == "Western") %>%
  summarise(Min = min(incident.rate, na.rm = TRUE),
            Q1 = quantile(incident.rate, probs = .25, na.rm = TRUE),
            Median = median(incident.rate, na.rm = TRUE),
            Q3 = quantile(incident.rate, probs = .75, na.rm = TRUE),
            Max = max(incident.rate, na.rm = TRUE),
            Mean = mean(incident.rate, na.rm = TRUE),
            SD = sd(incident.rate, na.rm = TRUE),
            n = n(),
            Missing = sum(is.na(incident.rate))))

| police.region | Min | Q1 | Median | Q3 | Max | Mean | SD | n | Missing |
|---|---|---|---|---|---|---|---|---|---|
| Eastern | 2578.353 | 3847.147 | 4909.159 | 6599.376 | 13290.646 | 5428.292 | 2397.440 | 25 | 0 |
| Western | 1939.070 | 4284.694 | 5513.043 | 6849.551 | 9866.524 | 5523.377 | 1885.638 | 30 | 0 |
lattice::histogram(~incident.rate | police.region,
                   data = (crime %>%
                             filter(police.region == "Eastern" | police.region == "Western")))

The difference between the means of the Eastern (5428.292) and Western (5523.377) police regions does not seem large. Yet the box plots for each group show a potential outlier in the Eastern region. This is Latrobe, which had a crime rate that was the “highest of any region outside of Melbourne” (Withers 2020), placing it second only to the City of Melbourne.
crime %>%
  filter(police.region == "Eastern" | police.region == "Western") %>%
  boxplot(incident.rate ~ police.region, data = ., ylab = "Incidents per 100,000 people")

This is a legitimate and noteworthy data point, but because Latrobe stands so far from the rest of the data, it could be masking a tendency for the remaining councils in the Eastern police region to have lower crime rates than those in the West.
crime_east_west <- crime %>%
  filter(police.region == "Eastern" | police.region == "Western",
         incident.rate < 12000)
boxplot(crime_east_west$incident.rate ~ crime_east_west$police.region, ylab = "Incidents per 100,000 people", xlab = "Police Regions")

By excluding that data point, the new mean for the region (5100.693) dropped by over 300 incidents per 100,000 people, widening the East-West gap. Given a noticeable difference between the mean crime rates for the Eastern and Western police regions, the two-sample \(t\)-test was used to determine if the difference is statistically significant. Before this, both groups were tested for normality and homogeneity of variance.
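The new Eastern mean can be checked by hand from the earlier summary table. The sketch below assumes the excluded observation is the Eastern maximum (13290.646), the only summary value above the 12000 cut-off, and uses the rounded figures as reported, so small rounding differences are expected.

```r
# Summary values copied from the descriptive statistics table (Eastern region)
n.east    <- 25
mean.east <- 5428.292
max.east  <- 13290.646  # assumed to be the excluded data point

# Remove the outlier from the regional total, then average over the remaining councils
mean.east.trimmed <- (n.east * mean.east - max.east) / (n.east - 1)

# Size of the drop in the Eastern mean caused by the exclusion
drop.in.mean <- mean.east - mean.east.trimmed

print(round(mean.east.trimmed, 2))
print(round(drop.in.mean, 2))
```

The result should agree with the reported mean of 5100.693 to within rounding of the inputs.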
crime_east_west %>%
  group_by(police.region) %>%
  summarise(Min = min(incident.rate, na.rm = TRUE),
            Q1 = quantile(incident.rate, probs = .25, na.rm = TRUE),
            Median = median(incident.rate, na.rm = TRUE),
            Q3 = quantile(incident.rate, probs = .75, na.rm = TRUE),
            Max = max(incident.rate, na.rm = TRUE),
            Mean = mean(incident.rate, na.rm = TRUE),
            SD = sd(incident.rate, na.rm = TRUE),
            n = n(),
            Missing = sum(is.na(incident.rate)))

The data for both regions followed the diagonal of their Q-Q plots well, allowing normality to be assumed.
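crime_east <- crime_east_west %>% filter(police.region == "Eastern")
# qqPlot() comes from the car package; it returns the indices of the two most extreme points
crime_east$incident.rate %>%
  car::qqPlot(dist = "norm", main = "Eastern Regions", ylab = "Incidents per 100,000 people")

## [1] 7 6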
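crime_west <- crime_east_west %>% filter(police.region == "Western")
# qqPlot() comes from the car package; it returns the indices of the two most extreme points
crime_west$incident.rate %>%
  car::qqPlot(dist = "norm", main = "Western Regions", ylab = "Incidents per 100,000 people")

## [1] 18 10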
Homogeneity of variance was tested using Levene’s test. The statistical hypotheses were,
\[H_0: \sigma_1^2 = \sigma_2^2 \]
\[H_A: \sigma_1^2 \ne \sigma_2^2\]
The \(p\)-value for the Levene’s test of equal variance for crime rates between Eastern and Western police regions was \(p = 0.7629\). Given \(p > .05\), the conclusion was to fail to reject \(H_0\). Equal variance could safely be assumed.
| | Df | F value | Pr(>F) |
|---|---|---|---|
| group | 1 | 0.0919478 | 0.7629252 |
| | 52 | NA | NA |
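The call that produced this table is not shown in the report. One common choice is `car::leveneTest()`, though that is an assumption. As a minimal sketch of the median-centred form of Levene's test in base R, the example below runs a one-way ANOVA on each observation's absolute deviation from its group median. The data frame `sim` is simulated purely for illustration; only the group sizes (24 and 30) mirror the analysis.

```r
set.seed(1)

# Simulated stand-in for crime_east_west; means and SDs are invented
sim <- data.frame(
  police.region = factor(rep(c("Eastern", "Western"), times = c(24, 30))),
  incident.rate = c(rnorm(24, mean = 5100, sd = 1800),
                    rnorm(30, mean = 5500, sd = 1900))
)

# Absolute deviation of each observation from its own group's median
group.median <- ave(sim$incident.rate, sim$police.region, FUN = median)
sim$abs.dev <- abs(sim$incident.rate - group.median)

# Levene's test is a one-way ANOVA on those absolute deviations
levene <- anova(lm(abs.dev ~ police.region, data = sim))
print(levene)
```

A large p-value in the `Pr(>F)` column gives no evidence against equal variances, which is the condition needed to run the \(t\)-test with `var.equal = TRUE`.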
The hypotheses for the two-sample \(t\)-test were:
\[H_0: \mu_1 - \mu_2 = 0 \]
\[H_A: \mu_1 - \mu_2 \ne 0\]
The results of the two-sample \(t\)-test assuming equal variance did not find a statistically significant difference between the mean criminal incident rates of the Eastern and Western police regions in Victoria: \(t(52) = -0.837\), \(p = 0.406\), 95% CI \([-1435.61, 590.24]\). Since \(p > .05\) and the confidence interval contains 0, the conclusion was to fail to reject \(H_0\).
t.test(
incident.rate ~ police.region,
data = crime_east_west,
var.equal = TRUE,
alternative = "two.sided"
)

##
## Two Sample t-test
##
## data: incident.rate by police.region
## t = -0.83735, df = 52, p-value = 0.4062
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1435.6068 590.2402
## sample estimates:
## mean in group Eastern mean in group Western
## 5100.693 5523.377
## [1] -2.004879
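To make the mechanics of the equal-variance test concrete, the sketch below recomputes the pooled \(t\) statistic, its critical value and the two-sided p-value by hand, then compares them with `t.test()`. The vectors `east` and `west` are simulated stand-ins, not the CSA data; only the group sizes (24 and 30) mirror the analysis. The degrees of freedom passed to `qt()` should match the df reported by the test, which is 52 in the output above.

```r
set.seed(42)

# Simulated incident rates; means and SDs are invented for illustration
east <- rnorm(24, mean = 5100, sd = 1800)
west <- rnorm(30, mean = 5500, sd = 1900)

n1 <- length(east)
n2 <- length(west)
df <- n1 + n2 - 2

# Pooled variance and the standard error of the difference in means
pooled.var <- ((n1 - 1) * var(east) + (n2 - 1) * var(west)) / df
std.error  <- sqrt(pooled.var * (1 / n1 + 1 / n2))

# Test statistic, two-sided critical value and p-value
t.manual <- (mean(east) - mean(west)) / std.error
t.crit   <- qt(0.975, df = df)
p.manual <- 2 * pt(-abs(t.manual), df = df)

# Reject H0 only if |t| exceeds the critical value
reject.h0 <- abs(t.manual) > t.crit

# The built-in test should reproduce the hand calculation
fit <- t.test(east, west, var.equal = TRUE, alternative = "two.sided")
print(c(manual = t.manual, builtin = unname(fit$statistic)))
```

Comparing `abs(t.manual)` with `t.crit` and comparing `p.manual` with 0.05 are equivalent decision rules, so they always lead to the same conclusion.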
Despite a noticeable difference between the mean criminal incident rates, with the Eastern region’s mean approximately 8% lower than the Western region’s once Latrobe was excluded, the two-sample \(t\)-test did not find a statistically significant difference between them. This was a useful finding, as it suggests a lack of abnormality in the two regions, apart from the Latrobe council.
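The approximate 8% figure follows from the two group means reported in the \(t\)-test output. As a worked check, writing \(\bar{x}_E\) and \(\bar{x}_W\) for the Eastern and Western sample means (notation introduced here for convenience):

\[\frac{\bar{x}_W - \bar{x}_E}{\bar{x}_W} = \frac{5523.377 - 5100.693}{5523.377} = \frac{422.684}{5523.377} \approx 0.077\]

That is roughly 7.7%, which rounds to the 8% quoted.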
It would be beneficial to look at crime statistics for the Latrobe council on its own. Russell Northe, a local government representative for the area, suggests that “Much of the crime rate is in and around family and domestic violence,” but further analysis could highlight why this is so pronounced compared to adjacent areas (Withers 2020).
While the focus of this investigation was on regional Victoria, the same could be done for metropolitan Melbourne. It could also be expanded into a one-way ANOVA, including all four police regions in one test. While higher crime rates would be anticipated in the metropolitan regions, the investigation should focus on searching for abnormalities. If metropolitan regions turned out to have comparable crime rates, for instance, assumptions about rural and urban differences could be challenged.
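A one-way ANOVA across all four police regions might be sketched as follows. The data are simulated: the group sizes for the two metropolitan regions, along with the mean and SD, are invented for illustration and are not taken from the CSA data.

```r
set.seed(2020)

# Hypothetical number of councils per region (the metropolitan sizes are made up)
region.sizes <- c("North West Metro" = 12, "Eastern" = 25,
                  "Southern Metro" = 12, "Western" = 30)

# Simulated incident rates drawn from a single distribution for every region
sim <- data.frame(
  police.region = factor(rep(names(region.sizes), times = region.sizes)),
  incident.rate = rnorm(sum(region.sizes), mean = 5500, sd = 2000)
)

# One-way ANOVA: does the mean incident rate differ between any of the regions?
fit <- aov(incident.rate ~ police.region, data = sim)
anova.table <- summary(fit)[[1]]
print(anova.table)
```

If the overall F-test were significant on real data, a post-hoc comparison such as `TukeyHSD(fit)` could indicate which pairs of regions differ.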
Figure 3. Source: https://communitysafety.vic.gov.au
Crime Statistics Agency 2020a, Data Tables - LGA Criminal Incidents Visualisation - year ending June 2020, data file, Australian Government, Melbourne, viewed 19 October 2020, https://www.crimestatistics.vic.gov.au/crime-statistics/latest-victorian-crime-data/download-data
Crime Statistics Agency 2020b, Explanatory Notes, Crime Statistics Agency, viewed 25 October 2020, https://www.crimestatistics.vic.gov.au/about-the-data/explanatory-notes#Reference%20periods
Crime Statistics Agency 2020c, Recorded Criminal Incidents, Crime Statistics Agency, viewed 25 October 2020, https://www.crimestatistics.vic.gov.au/crime-statistics/latest-victorian-crime-data/recorded-criminal-incidents
The Victorian Government 2020, Making Victoria safer, The Victorian Government, viewed 25 October 2020, https://communitysafety.vic.gov.au
Victoria Police 2020, About Victoria Police, Victoria Police, viewed 25 October 2020, https://www.police.vic.gov.au/about-victoria-police
Withers, K 2020, ‘Latrobe Valley crime rate highest of any region outside Melbourne’, Latrobe Valley Express, 25 June, viewed 25 October 2020, https://www.latrobevalleyexpress.com.au/story/6805838/crime-on-the-rise/#