Introduction

Following the World Health Organisation’s declaration of a COVID-19 pandemic on March 11th 2020, Western Australia implemented its toughest restrictions from March 31st (McNeill 2020).
These restrictions included closing the border, closing schools, banning travel between regions, encouraging people to stay at home and closing all but essential services and businesses.
These restrictions were gradually eased from April 27th.
Many people continued to work from home even as restrictions were removed.

Problem Statement

Did these restrictions, and the fact people were spending more time at home, have a statistically significant impact on burglary rates?
Based on burglary rates prior to the COVID lockdown, how likely was the number of burglaries recorded during the period of restrictions?
A two-tailed, one-sample t-test was used to determine if the mean number of burglaries per month during April, May and June 2020 was significantly different to the mean monthly number of burglaries between January 2007 and March 2020.

Data - Burglaries in Western Australia

Number of burglaries recorded in WA from January 2007 to June 2020.
Data published by the Western Australia Police Force.
https://www.police.wa.gov.au/Crime/CrimeStatistics#/

Data - Scan

Scan the dataset for any missing data, special values such as NaN or infinite values, obvious errors and inconsistencies.

sum(is.na(crime))

## [1] 0

sum(sapply(crime, is.infinite))

## [1] 0

sum(sapply(crime, is.nan))

## [1] 0

sum(crime$`Burglary Total`<0)

## [1] 0

Data - How is the data distributed?

The histogram of the burglary data shows an approximately normal distribution, with some possible outliers at the lower end.

# Plot histogram of burglaries with a normal distribution overlay
hist(crime$`Burglary Total`, main="Distribution of Burglaries", xlab="Burglaries", freq=FALSE)
curve(dnorm(x, mean=mean(crime$`Burglary Total`), sd=sd(crime$`Burglary Total`)),
      col="red", lwd=2, n=100, add=TRUE)

Data - Variables

The total number of burglaries, along with the month and year variable, was extracted from the complete dataset.
New variables were created to store the month and year separately.
The ‘Burglary Total’ and ‘Year’ variables are numeric, while the ‘Month’ variable is an ordered factor variable with 12 levels.

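# Load dplyr for mutate() and lubridate for year() and month()
library(dplyr)
library(lubridate)
# Keep columns 1 and 33 (the date and 'Burglary Total'), then create new variables for year and month
crime <- crime[,c(1,33)]
crime <- mutate(crime, "Year"=year(`Month and Year`))
crime <- mutate(crime, "Month"=month(`Month and Year`, label=TRUE, abbr=FALSE))
# Drop the original date column now that year and month are stored separately
crime <- crime[,-1]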

Data - Does it fit a normal distribution?

The Q-Q plot below suggests that the data closely follow a normal distribution, and supports the presence of three outliers at the lower end (by default qqPlot() labels only the two most extreme points, here rows 162 and 161).

library(car)  # qqPlot() comes from the car package; %>% is available via dplyr
crime$`Burglary Total` %>% qqPlot(dist="norm")

## [1] 162 161

Data - When did these outliers occur?

Given that the burglary data approximate a normal distribution, the normal scores (z-score) method was used to identify the three outliers, flagging any observation with an absolute z-score greater than 3.
These outliers were found to be from April, May and June 2020 and were separated from the remaining dataset.

# Identify outliers using z-scores (scores() comes from the outliers package)
library(outliers)
z.scores <- crime$`Burglary Total` %>% scores(type="z")
z.scores %>% summary()

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -4.30369 -0.59174  0.09832  0.00000  0.61522  2.00945

crime[which(abs(z.scores)>3),]

# Rows with |z| > 3 are the outliers; save them as "covid"
outlier.rows <- which(abs(z.scores)>3)
covid <- crime[outlier.rows,]
# Save the remainder as "preCovid" (logical indexing also works if no rows are flagged)
preCovid <- crime[abs(z.scores)<=3,]
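The normal scores method can be sketched in base R as follows. This is a minimal illustration on made-up monthly counts (the vector `burglaries` is hypothetical, not the WA Police data), and it assumes that scores(type="z") standardises each value by the sample mean and sample standard deviation.

```r
# Hypothetical monthly counts: 30 typical months plus one unusually low month
burglaries <- c(rep(c(2800, 2900, 3000, 3100, 3200), times = 6), 1200)

# z-score: distance from the sample mean in units of the sample standard deviation
z <- (burglaries - mean(burglaries)) / sd(burglaries)

# Flag observations more than three standard deviations from the mean
outlier_idx <- which(abs(z) > 3)
burglaries[outlier_idx]
```

Only the single low month is flagged here. Note that extreme values inflate the standard deviation used in their own z-scores, so the threshold of 3 is a convention rather than a guarantee.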

Descriptive Statistics and Visualisation

The mean number of burglaries in the data prior to April 2020 was found to be 2924, with a standard deviation of 324.
The mean number of burglaries from April to June 2020 inclusive was found to be 1289.

# Mean and standard deviation of the number of burglaries prior to COVID-19 lockdown
mean(preCovid$`Burglary Total`)

## [1] 2923.943

sd(preCovid$`Burglary Total`)

## [1] 323.863

# Mean number of burglaries from April to June 2020
mean(covid$`Burglary Total`)

## [1] 1289.333

Descriptive Statistics and Visualisation Cont.

The histogram below shows the burglary data with the outliers removed.

# Plot histogram of pre-COVID burglaries with a normal overlay fitted to the same data
hist(preCovid$`Burglary Total`, main="Distribution of Burglaries", xlab="Burglaries", freq=FALSE)
curve(dnorm(x, mean=mean(preCovid$`Burglary Total`), sd=sd(preCovid$`Burglary Total`)),
      col="red", lwd=2, n=100, add=TRUE)

Hypothesis Testing

It could be said that the odds of recording a monthly number of burglaries as low as that recorded during the COVID-19 lockdown were “a million to one”.
To test this, \(\mu_1\) was defined as the mean monthly number of burglaries between April and June 2020 and \(\mu_2\) was defined as the mean monthly number of burglaries between January 2007 and March 2020.
The null hypothesis was that the mean number of burglaries recorded during the April to June lockdown period was equal to the mean of monthly burglaries from January 2007 to March 2020. \[H_0: \mu_2 = \mu_1\]
The alternate hypothesis was that the mean number of burglaries recorded during the April to June lockdown period was different from the mean of monthly burglaries from January 2007 to March 2020. \[H_A: \mu_2 \ne \mu_1\]
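As a rough check on the “million to one” remark, the sketch below treats monthly burglaries as normally distributed with the pre-COVID mean and standard deviation reported earlier, and asks how likely a single month at or below the April to June 2020 mean would be. The normal model, the variable names and the use of rounded summary figures are assumptions made for illustration; this is not part of the original analysis.

```r
# Summary figures reported in the descriptive statistics
pre_mean   <- 2923.943   # mean monthly burglaries, Jan 2007 to Mar 2020
pre_sd     <- 323.863    # standard deviation over the same period
covid_mean <- 1289.333   # mean monthly burglaries, Apr to Jun 2020

# z-score of the lockdown mean relative to the pre-COVID distribution
z_covid <- (covid_mean - pre_mean) / pre_sd

# Probability of one month at or below that level under the assumed normal model
p_low <- pnorm(z_covid)

# Express the probability as "one in N"
one_in <- 1 / p_low
c(z = z_covid, p = p_low, one_in = one_in)
```

Under these assumptions the lockdown mean sits roughly five standard deviations below the pre-COVID mean, which puts the odds beyond one in a million.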

Hypothesis Testing Cont.

Although the lockdown sample was small, being only three months, even the 99.9999% confidence interval for the pre-COVID mean does not capture the mean number of burglaries of these three months, or any of the three months individually. This suggests there is a statistically significant difference between the number of burglaries recorded during the April to June 2020 period and the January 2007 to March 2020 period.

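# Apply a one-sample t-test to the pre-COVID data to obtain a 99.9999% confidence interval
# for its mean (the t and p-value printed below test against the default mu = 0)
t.test(preCovid$`Burglary Total`, conf.level = 0.999999)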

## 
##  One Sample t-test
## 
## data:  preCovid$`Burglary Total`
## t = 113.84, df = 158, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 99.9999 percent confidence interval:
##  2793.176 3054.711
## sample estimates:
## mean of x 
##  2923.943
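The reported output tests the pre-COVID mean against the default value of zero, so only its confidence interval bears on the hypotheses. A minimal sketch of the comparison against the lockdown mean, computed from the rounded summary statistics rather than the raw data, might look like the following. The variable names are illustrative, and treating the lockdown mean of 1289.333 as a fixed reference value is an assumption.

```r
# Rounded summary statistics taken from the output in this document
pre_mean   <- 2923.943   # pre-COVID mean monthly burglaries
pre_sd     <- 323.863    # pre-COVID standard deviation
n          <- 159        # pre-COVID months (df = 158 in the t-test output)
covid_mean <- 1289.333   # mean of April to June 2020, used as the reference value
alpha      <- 1e-6       # significance level matching the 99.9999% interval

# Standard error of the pre-COVID mean
se <- pre_sd / sqrt(n)

# One-sample t statistic and two-tailed p-value against the lockdown mean
t_stat  <- (pre_mean - covid_mean) / se
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Confidence interval for the pre-COVID mean at the chosen level
ci <- pre_mean + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * se

c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2])
```

With the raw data available, the same comparison could be requested directly with t.test(preCovid$`Burglary Total`, mu = mean(covid$`Burglary Total`), conf.level = 0.999999).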

Discussion

Using a two-tailed, one-sample t-test, it was shown that the mean number of burglaries per month during the April to June 2020 period was significantly lower than the mean during the period of January 2007 to March 2020. A significance level of \(1 \times 10^{-6}\) was used. The data for the pre-COVID period had a mean of 2924 burglaries per month, with a standard deviation of 324. The mean number of monthly burglaries during the COVID lockdown (n = 3, mean = 1289) was found to be statistically significantly lower than the pre-COVID mean, p<0.001, and fell well below the 99.9999% CI [2793, 3055] for that mean.
Although the lockdown sample is small, its values differ so markedly from the pre-lockdown data that they could have been mistakenly discarded as outliers had the COVID-19 lockdown not been known about.
The pre-COVID data remained within a range of approximately 2300 to 3500 burglaries per month, with very little sign of any longer-term trend.

Discussion Cont.

As time passes and a larger post-COVID dataset becomes available, this phenomenon could be investigated further. Even with almost all COVID-related restrictions lifted in Western Australia, many businesses continue to have employees working from home for at least some of the time. The existing data suggest that this could lead to lower numbers of burglaries continuing to be recorded.
Other crimes could also be investigated for a similar impact. With more people at home for longer, domestic violence, for example, may have increased during this time.

References

McNeill H, 2020, ‘A timeline of WA’s COVID-19 response: Was our success luck, good management, or a bit of both?’, WAToday, Perth WA, 28 August, https://www.watoday.com.au/national/western-australia/a-timeline-of-wa-s-covid-19-response-was-our-success-luck-good-management-or-a-bit-of-both-20200827-p55q03.html
State Government of Western Australia, 2020, Crime Statistics, Western Australia Police Force, Perth WA, viewed 2 October 2020, https://www.police.wa.gov.au/Crime/CrimeStatistics#/

COVID-19 and Crime in Western Australia

Did the lockdown associated with COVID-19 have a statistically significant impact on burglaries in Western Australia?
