For the data analysis project, my group examined the relationship between crime rates in Local Government Areas (LGAs) and a range of independent variables, including population density, socio-economic levels and unemployment. For my individual project exploration, I wanted to examine the main theme of crime further but narrow my focus to two specific LGAs: “Sydney CBD” and “Inner West”. In February 2014, the New South Wales Coalition government introduced 1.30am lockout and 3am last drink laws for the Sydney CBD. A public perception that violence was increasing led to the introduction of the laws, but trends in violent crime reported to police since the early 1990s reveal a mixed story (Bricknell 2008). The subsequent controversies about the ‘lockout laws’ in Sydney have provoked a series of debates spanning criminal, moral, social and cultural discourses (Homan, 2018).
The public’s perception of crime can have an important influence on policy decisions relating to operational activity in front line law enforcement and in judicial sentencing. However, there can be a discrepancy between the public’s perception of the likelihood of crime victimisation and the actual risk of victimisation. This discrepancy is apparent in the public’s concern about a perceived increase in crime amidst declining crime rates (Davis & Dossetor 2010). According to the Australian Bureau of Statistics (2008), the number of crimes reported to police was lower in 2007 than in 1998 in almost all categories. In this report I examine trends in crime over time within the two LGAs mentioned above and assess the impact of the introduction of the lockout laws. The analysis has the following limitations:
Lack of specificity in the data available (the crime data is not subcategorised to identify alcohol-related offences). This report will therefore focus on overall incidences of crime within the LGAs.
The analysis conducted is observational, which makes it difficult to determine causal relationships due to confounding variables such as population growth.
Specifically, this report aims to answer the following research questions:
The crime data used was obtained from the Australian Bureau of Statistics using the R package readabs. The data records monthly incidences of crime within LGAs across NSW, from January 1995 to March 2019.
To analyse trends in crime over time within the Sydney and Inner West LGAs, I will use time series analysis to extract meaningful statistics and other characteristics of the data. Initially I will plot the data to visualise the time series. Visualising is an essential first step as it allows us to identify unusual observations and existing patterns. It will also help to show whether the trend changes after the introduction of the lockout laws in February 2014. Following this, I will use an ARIMA model to forecast incidences of crime based on previously observed values.
The report can be broken down into the following steps:
Visualising the time series is an essential first step in any time series analysis, as it allows us to identify unusual observations and understand any existing patterns. Figure 1 below displays the monthly incidences of crime within the Sydney LGA over the past 23 years. Incidences of crime exhibit a decreasing trend from the year 2000, with no discernible impact from the lockout laws implemented in February 2014: crime increased during 2015 and then trended downwards again through to 2019.
Figure 1
Immediately, it is clear that there is a pattern of peaks and troughs, which indicates that seasonal changes affect the incidences of crime. Weather patterns modify people’s routines and, in turn, influence when crime is committed (Linning, 2017). In the Sydney time series, crime clearly peaks in the summer months of December through February and decreases in winter, with the lowest incidences of crime generally occurring from June through August (see Appendix A).
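The plotting step and the seasonal pattern described above can be sketched in base R as follows. This is a minimal illustration rather than the code used for this report: `monthly_counts` is simulated stand-in data, and all object names are hypothetical.

```r
# Simulated stand-in for monthly crime counts, Jan 1995 to Mar 2019 (291 months).
# NOTE: illustrative data only; these are not the report's figures.
set.seed(1)
n_months <- 291
month_index <- seq_len(n_months)
monthly_counts <- 3000 - 4 * month_index +        # slow downward trend
  250 * cos(2 * pi * (month_index - 1) / 12) +    # seasonal peak around January
  rnorm(n_months, sd = 100)                       # irregular noise

# Store the counts as a monthly time series starting in January 1995.
crime_ts <- ts(monthly_counts, start = c(1995, 1), frequency = 12)

# Time plot with a dashed marker at the introduction of the lockout laws (Feb 2014).
plot(crime_ts, xlab = "Year", ylab = "Monthly incidents")
abline(v = 2014 + 1 / 12, lty = 2)

# Average count for each calendar month; a summer peak would show up here.
monthly_means <- tapply(crime_ts, cycle(crime_ts), mean)
names(monthly_means) <- month.abb

# Observations from the introduction of the lockout laws onwards.
post_lockout <- window(crime_ts, start = c(2014, 2))
```

Comparing `monthly_means` across months gives a quick numerical check of the seasonal pattern that the plot shows visually.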
To better understand whether there is a trend or seasonal pattern, the time series can be decomposed. Decomposing a time series reveals the following components within the data:
Trend - Whether there is a long-term increase or decrease in the data.
Seasonal - When a time series is affected by seasonal factors such as the time of the year.
Noise - The random variation that remains once the trend and seasonal components have been removed.
Figure 2 shows the decomposition of the Sydney LGA time series. The decomposition shows that there is a decreasing trend over time from the year 2000 and there is also a clear seasonal pattern occurring.
Figure 2
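As a hedged sketch of how such a decomposition can be produced, base R's decompose() splits a monthly series into the three components listed above using a classical additive model. The built-in UKDriverDeaths dataset is used purely as a stand-in for the Sydney LGA series, and the choice of decompose() over alternatives such as stl() is an assumption made for this example.

```r
# Stand-in monthly series shipped with R (not the report's crime data).
series <- UKDriverDeaths

# Classical additive decomposition: series = trend + seasonal + random.
parts <- decompose(series, type = "additive")

# Plot the observed series and its three components in stacked panels.
plot(parts)

# The twelve seasonal effects; this series starts in January,
# so the first effect belongs to January.
seasonal_effects <- setNames(parts$figure, month.abb)

# The components add back up to the original series wherever the
# trend is defined (the moving average leaves NAs at both ends).
rebuilt <- parts$trend + parts$seasonal + parts$random
```

The trend panel corresponds to the long-term movement discussed above, while `seasonal_effects` shows how far each calendar month typically sits above or below that trend.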
In Figure 3 below the Sydney time series is plotted alongside the time series of the Inner West. From 2007 to 2017 the populations of the Sydney and Inner West LGAs increased by 27.5% and 12.62% respectively (Cavanough, 2019). All else being equal, a larger population would be expected to produce more recorded incidences of crime. Despite this, we still see an overall decreasing trend in crime over time.
Figure 3
Geographically targeted crime control, which aims to alleviate crime by targeting “hot spots”, risks displacing crime into bordering areas. The 2014 Sydney lockout laws have severely reduced the nightlife economy in the once bustling entertainment district of the CBD, and there have been reports of increased violence in displacement areas (Perks, 2016). I wanted to examine the relationship in crime between the two LGAs after the introduction of the lockout laws to see if this was the case. It appears that some displacement of crime has occurred, with incidences of crime in the Inner West peaking in January 2015 at 1,580 (see Appendix B), a level of crime not seen since 2010. However, crime seems to stabilise following this peak, with a continuing downward trend through to 2019.
Further to the analysis in Figure 3, I computed the cross-correlation of the two univariate time series using the ccf function (see Appendix C). Figure 4 below illustrates that a positive correlation exists between the two time series wherever the lag is greater than -1.5, with only weak negative correlations where the lag is less than -1.5. This suggests that there is no statistically significant spillover of crime into the Inner West LGA.
Figure 4
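The cross-correlation step can be sketched with base R's ccf() as below. The two series are simulated stand-ins with hypothetical names, not the report's data. Two documented behaviours of ccf() are worth noting when reading a plot like Figure 4: for monthly ts objects the lag axis is expressed in years (so a lag of -1.5 is 18 months), and the value at lag k estimates the correlation between x[t+k] and y[t].

```r
# Simulated stand-ins for the two LGA series (illustrative data only).
set.seed(2)
n_months <- 291                                    # Jan 1995 to Mar 2019
season <- 200 * cos(2 * pi * (seq_len(n_months) - 1) / 12)
sydney <- ts(3000 - 4 * seq_len(n_months) + season + rnorm(n_months, sd = 100),
             start = c(1995, 1), frequency = 12)
inner_west <- ts(1500 - seq_len(n_months) + 0.4 * season + rnorm(n_months, sd = 60),
                 start = c(1995, 1), frequency = 12)

# Cross-correlation for lags of up to 24 months either side of zero.
cc <- ccf(sydney, inner_west, lag.max = 24, plot = FALSE)

# Convert the lag axis from years back to whole months.
lags_in_months <- round(as.numeric(cc$lag) * frequency(sydney))

# Approximate 95% bounds, as drawn on the default ccf plot.
bound <- qnorm(0.975) / sqrt(cc$n.used)
significant <- abs(as.numeric(cc$acf)) > bound

# The lag-0 value is the ordinary Pearson correlation of the two series.
lag0 <- as.numeric(cc$acf)[lags_in_months == 0]
```

Because both simulated series share a downward trend, most lags will exceed `bound` here; a common trend alone can produce large cross-correlations, which is one reason to interpret such a plot cautiously.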
In addition to the relationship between the two time series, I wanted to forecast the crime rate of the Sydney LGA using an ARIMA model. The foundation for statistical inference in time series analysis is the concept of stationarity, and fitting an ARIMA model requires the time series data to be stationary. To test for stationarity, we will use the Augmented Dickey-Fuller Test from the tseries package. Based on the Augmented Dickey-Fuller Test, the Sydney time series fulfils the requirement of stationarity (see Appendix D).
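A minimal sketch of the stationarity check, assuming the tseries package is installed, is given below; the built-in UKDriverDeaths series stands in for the Sydney LGA data. The null hypothesis of the test is that the series has a unit root, so a small p-value supports stationarity. Note that adf.test() includes a constant and a linear trend in its regression, so its result is best read as stationarity around a trend.

```r
# Stand-in monthly series shipped with R (not the report's crime data).
series <- UKDriverDeaths

adf_result <- NULL
if (requireNamespace("tseries", quietly = TRUE)) {
  # Null hypothesis: the series has a unit root (is non-stationary).
  adf_result <- tseries::adf.test(series, alternative = "stationary")
  print(adf_result)

  # A p-value below 0.05 rejects the null in favour of stationarity.
  is_stationary_5pct <- adf_result$p.value < 0.05
}
```

The guard around the call only keeps the sketch runnable where tseries is missing; in an analysis script the package would simply be loaded with library(tseries).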
With the requirement of stationarity fulfilled we can proceed with modelling the data using the ARIMA model. In order to build a reasonable ARIMA model, we can first use the auto.arima() function in the forecast package. This function optimises the parameters of the model to estimate the following:
p: The number of lag observations included in the model
d: The number of times that the raw observations are differenced
q: The size of the moving average window
P: The seasonal number of lag observations included in the model
D: The seasonal number of times that the raw observations are differenced
Q: The seasonal size of the moving average window
Appendix E gives a summary of the model returned by auto.arima(). The algorithm has chosen an ARIMA model with three lag observations (p=3), one degree of differencing (d=1) and two seasonal lag observations (P=2).
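The automated selection step might look like the sketch below, assuming the forecast package is installed. The built-in UKDriverDeaths series is a stand-in, so the orders selected for it will generally differ from those reported in Appendix E.

```r
# Stand-in monthly series shipped with R (not the report's crime data).
series <- UKDriverDeaths

auto_fit <- NULL
if (requireNamespace("forecast", quietly = TRUE)) {
  # Search over candidate seasonal ARIMA models; the default criterion is AICc.
  auto_fit <- forecast::auto.arima(series)
  print(auto_fit)

  # Selected orders; for a seasonal fit these are p, d, q, P, D, Q
  # followed by the seasonal period.
  selected_orders <- forecast::arimaorder(auto_fit)
  print(selected_orders)
}
```

By default auto.arima() uses a stepwise search with approximations for speed, so the model it returns is a reasonable starting point rather than a guaranteed optimum.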
The residuals in a time series model are what is left over after fitting a model. Residuals are useful in checking whether a model has adequately captured the information in the data (Hyndman & Athanasopoulos, 2018). The checkresiduals() function was used to check the residuals of the automated ARIMA model. It produced a time plot, an ACF plot and a histogram of the residuals, as seen in Figure 5.
Figure 5
The time plot and histogram suggest that the model captures most of the information in the data. The time plot of the residuals shows that the variation of the residuals is consistent across the historical data. This can also be seen in the histogram of the residuals, which appear approximately normally distributed. However, the ACF plot shows large autocorrelation in the residuals, particularly between lags 22 and 24, indicating that the residuals are not purely white noise and that a reduction in the seasonal number of lags may be required in the model. On this basis, the new model reduces the seasonal number of lags (P) to one, with the seasonal moving average order (Q) set to one. Appendix F provides a summary of the new model, which has p=3, d=1, q=1, P=1, D=0 and Q=1, as seen in Figure 6 below:
Figure 6
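A model with the orders stated above, together with a basic residual check, can be sketched in base R as follows. This is an illustration on the built-in UKDriverDeaths stand-in series, not a reproduction of the fit in Appendix F; the use of arima() from the stats package and of method = "ML" are assumptions made for this example.

```r
# Stand-in monthly series shipped with R (not the report's crime data).
series <- UKDriverDeaths

# ARIMA(3,1,1)(1,0,1)[12]: the orders stated for the updated model.
# method = "ML" is a choice made for this sketch, not taken from the report.
fit <- arima(series,
             order = c(3, 1, 1),
             seasonal = list(order = c(1, 0, 1), period = 12),
             method = "ML")
res <- residuals(fit)

# Residual ACF: spikes outside the dashed bounds suggest leftover structure.
acf(res, lag.max = 36, main = "Residual ACF")

# Ljung-Box test over the first 24 lags; fitdf is the number of
# estimated ARMA coefficients (3 + 1 + 1 + 1 = 6 for this model).
lb <- Box.test(res, lag = 24, type = "Ljung-Box", fitdf = length(coef(fit)))
print(lb)
```

A large Ljung-Box p-value is consistent with residuals that behave like white noise, which is the property the residual plots are used to judge.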
With the residual analysis demonstrating that the updated ARIMA model has passed the required checks, we can now perform forecasting using the fitted model. We specify a forecast horizon of h periods ahead and use the fitted model to generate predictions for those periods. Figure 7 shows the incidences of crime predicted by the updated ARIMA model for the 12 months from March 2019. The forecasts are shown as a blue line, with the 80% prediction intervals as a dark shaded area and the 95% prediction intervals as a light shaded area. As demonstrated in Figure 7 below, a continuing downward trend in crime is forecast for the next 12 months, following a seasonal pattern. For the forecast values for the 12 months, refer to Appendix G.
Figure 7
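The forecasting step, including the 80% and 95% prediction intervals, can be sketched in base R as below. As before, UKDriverDeaths is a stand-in series and the base arima() and predict() functions are assumptions for this example; the intervals are computed from the forecast standard errors under a normality assumption.

```r
# Stand-in monthly series shipped with R (not the report's crime data).
series <- UKDriverDeaths

# Same orders as the updated model; method = "ML" is a choice for this sketch.
fit <- arima(series,
             order = c(3, 1, 1),
             seasonal = list(order = c(1, 0, 1), period = 12),
             method = "ML")

# Point forecasts and standard errors for a horizon of h = 12 months.
fc <- predict(fit, n.ahead = 12)

# Normal-theory prediction intervals at the 80% and 95% levels.
z80 <- qnorm(0.90)
z95 <- qnorm(0.975)
intervals <- cbind(forecast = fc$pred,
                   lo80 = fc$pred - z80 * fc$se, hi80 = fc$pred + z80 * fc$se,
                   lo95 = fc$pred - z95 * fc$se, hi95 = fc$pred + z95 * fc$se)

# Plot the history, the forecast and the 95% interval.
ts.plot(series, intervals[, "forecast"], intervals[, "lo95"], intervals[, "hi95"],
        lty = c(1, 1, 2, 2), col = c("black", "blue", "grey40", "grey40"))
```

The 95% interval is always wider than the 80% interval, and both typically widen as the horizon grows because forecast uncertainty accumulates.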
The objective of this research report was to understand the trends in crime over the past two decades within the Sydney and Inner West LGAs. A further objective was to analyse the relationship between the two LGAs and see what effect the introduction of the lockout laws had. Based on the findings, we can conclude that there has been a general decreasing trend in incidences of crime over the past two decades and that there is no statistically significant evidence of crime being displaced from the Sydney LGA to the Inner West LGA. Based on the forecast from the ARIMA model, incidences of crime are expected to fall in the following 12 months, in line with general seasonal trends.
The introduction of the lockout laws by the NSW Coalition government appears to have been a reaction to isolated incidents of violent crime rather than the result of an analysis of crime data. As debates continue on the merit of the laws, the NSW government should look to make more informed legislative decisions through the analysis of data.
The analysis that has been conducted is observational, which makes it difficult to determine causal relationships due to confounding variables such as increasing populations within the LGAs.
The data available are limited in their specificity, i.e. the crime data are aggregated at a monthly level for each LGA and are not subdivided by age or sex.
Bricknell, S. 2008. Trends in violent crime. Trends & issues in crime and criminal justice No. 359. Canberra: Australian Institute of Criminology. https://aic.gov.au/publications/tandi/tandi359
Cavanough, E. 2019. Even growth planning for a growing Sydney. The McKell Institute. Commissioned by Unions NSW.
Davis, B. & Dossetor, K. 2010. (Mis)perceptions of crime in Australia. Trends & issues in crime and criminal justice No. 396. Canberra: Australian Institute of Criminology. https://aic.gov.au/publications/tandi/tandi396
Homan, S. 2018. ‘Lockout’ laws or ‘rock out’ laws? Governing Sydney’s night-time economy and implications for the ‘music city’. The International Journal of Cultural Policy. https://doi.org/10.1080/10286632.2017.1317760
Hyndman, R. J. & Athanasopoulos, G. 2018. Forecasting: Principles and Practice, 2nd edn. Monash University, Australia.
Linning, S. J. 2017. Crime Seasonality across Multiple Jurisdictions in British Columbia, Canada. Volume 59, Issue 2, April 2017, pp. 251-280.
Perks, G. 2016. The “Flock” Phenomenon of the Sydney Lockout Laws: Dual Effects on Rental Prices. University of Technology Sydney.
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G