In this document we look at some statistics on rodent baiting requests made to the city of Chicago.

The data come from the Chicago data portal, https://data.cityofchicago.org/. They were downloaded on March 5, 2016 from the rodent baiting page, https://data.cityofchicago.org/Service-Requests/311-Service-Requests-Rodent-Baiting/97t6-zrhs. This page contains all rodent baiting service requests made to the 311 service line since January 1, 2011. Once a request is made, the Department of Streets & Sanitation investigates the report. From there, rodenticide can be applied to burrows to eradicate nests, among other services.

The data were downloaded as a .csv file after being filtered on the portal. Only requests with a creation date (the date when the request was made) between January 1 and December 31, 2013 are included. All records have a status of “completed”; records marked as “Completed - Dup” are excluded. All records have a current activity of “Dispatch Crew” and a most recent action of “Inspected and Baited”.

Once the data have been filtered and downloaded, we can load the data set into memory. We will also load a second .csv file, which I will call areas2.csv. This file contains the number of each community area, along with the name of the area and the “side” of the city it belongs to. These data were taken from Wikipedia. The two data sets are then merged into a new one that combines the information from both. Finally, we load the ggplot2 library for plotting.

mice <- read.csv('311rodent.csv',header=TRUE,sep=',')
areas <- read.csv('areas2.csv',header=TRUE,sep=',')
mice2 <- merge(areas,mice,by='Community.Area') # attach area name and side to each request
mice2 <- mice2[order(mice2$Creation.Date),] # reorder data by the day the request was made
library(ggplot2)
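
As a quick check on the merge step, the sketch below uses two tiny made-up data frames (the names areas_toy and mice_toy and all of their values are illustrative, not taken from the Chicago data). It shows how merge() attaches the area name and side to each request, and that a request whose community area has no match is dropped by default.

```r
# Toy stand-ins for areas2.csv and 311rodent.csv (all values are made up)
areas_toy <- data.frame(
  Community.Area = c(6, 7, 22),
  Area.Name      = c("Lake View", "Lincoln Park", "Logan Square"),
  Sides          = c("North Side", "North Side", "North Side")
)
mice_toy <- data.frame(
  Community.Area = c(7, 6, 6, 99),  # 99 has no match in areas_toy
  Creation.Date  = c("01/02/2013", "01/03/2013", "02/14/2013", "03/01/2013")
)

# Default merge keeps only matching keys, so the request for area 99 is dropped
merged_toy <- merge(areas_toy, mice_toy, by = "Community.Area")
nrow(merged_toy)

# all.y = TRUE keeps unmatched requests, with NA in the area columns
merged_all <- merge(areas_toy, mice_toy, by = "Community.Area", all.y = TRUE)
sum(is.na(merged_all$Area.Name))
```

Comparing nrow() before and after the merge is a cheap way to notice requests that were silently lost.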

Next we will convert the Creation.Date and Completion.Date columns to actual date values. The data start out as factors, so we first convert them to characters and then to dates.

mice2$Creation.Date <- as.character(mice2$Creation.Date)
mice2$Completion.Date <- as.character(mice2$Completion.Date)
# parse the columns of mice2 itself; the rows of mice are in a different order after the merge
mice2$Creation.Date <- as.Date(mice2$Creation.Date,"%m/%d/%Y")
mice2$Completion.Date <- as.Date(mice2$Completion.Date,"%m/%d/%Y")
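
To make the conversion concrete, here is a minimal sketch on two made-up date strings stored as a factor. It assumes the month/day/year layout implied by the "%m/%d/%Y" format string used above; the variable names raw and parsed are only for illustration.

```r
# Made-up date strings, stored as a factor as in the description above
raw <- factor(c("03/05/2013", "12/31/2013"))

# Convert factor -> character -> Date, mirroring the two steps in the text
parsed <- as.Date(as.character(raw), format = "%m/%d/%Y")
parsed
class(parsed)

# A string that does not match the format becomes NA rather than an error,
# so it is worth checking for NA values after parsing
bad <- as.Date("2013-03-05", format = "%m/%d/%Y")
is.na(bad)
```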

We will also create two new columns from the data we already have, Creation.Day and Completion.Day. These variables will represent the actual day of the week of the Creation.Date and Completion.Date variables. Both of the new columns will be of the factor data type. A similar variable, Creation.Month, will also be made to show the actual month in which the request was made.

mice2$Creation.Day <- weekdays(mice2$Creation.Date)
mice2$Completion.Day <- weekdays(mice2$Completion.Date)
mice2$Creation.Day <- as.factor(mice2$Creation.Day)
mice2$Completion.Day <- as.factor(mice2$Completion.Day)
mice2$Creation.Month <- months(mice2$Creation.Date)
mice2$Creation.Month <- as.factor(mice2$Creation.Month)
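
One possible refinement, sketched on made-up dates: as.factor() orders the levels alphabetically, so supplying the levels in calendar order when the factor is created keeps tables and plots in the expected order. The sketch assumes an English locale, since weekdays() returns names in the language of the session.

```r
# Made-up dates; 2013-01-01 fell on a Tuesday
d <- as.Date(c("2013-01-01", "2013-01-05", "2013-01-07"))

# Calendar order for the levels (English day names are an assumption)
day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")
day_f <- factor(weekdays(d), levels = day_levels)

# table() now lists the days in calendar order, including days with no dates
table(day_f)
```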

Finally, we will create a column called Elapsed.Time which will show the amount of time elapsed, in terms of days, between the date the request was made and the date the job was completed.

mice2$Elapsed.Time <- as.numeric(difftime(mice2$Completion.Date,mice2$Creation.Date,units='days'))
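
A small worked example of the same calculation, using made-up creation and completion dates rather than the real records:

```r
# Made-up request and completion dates
created   <- as.Date(c("2013-03-01", "2013-03-01"))
completed <- as.Date(c("2013-03-08", "2013-04-01"))

# Same pattern as above: difftime in days, then drop the difftime class
elapsed <- as.numeric(difftime(completed, created, units = "days"))
elapsed

# Subtracting two Date vectors also yields a difference in days
as.numeric(completed - created)
```

The first request takes 7 days and the second 31 days (March 1 to April 1).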

Now, we’ll take a look at some plots. First, we will look at service totals for each of the community areas. A single plot of all 77 community areas would be difficult to read, so we will group the areas by region of the city. To carry this out, we will use the ggplot2 library.

# one panel per side of the city, one bar per community area
ggplot(mice2,aes(Area.Name))+
  geom_bar()+
  facet_wrap(~Sides,ncol=3,scales='free')+
  theme(axis.text.x=element_text(angle=90,color='black'))+
  xlab('')+ylab('')
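
The totals behind a plot like this can also be tabulated directly. The following is a minimal base R sketch on a made-up data frame (toy and its four rows are illustrative only); it counts requests per community area within each side.

```r
# Made-up requests: one row per request, with area name and side
toy <- data.frame(
  Area.Name = c("Lake View", "Lake View", "Lincoln Park", "Hyde Park"),
  Sides     = c("North Side", "North Side", "North Side", "South Side")
)

# Give each request a weight of 1 and sum within side and area
toy$Requests <- 1
counts <- aggregate(Requests ~ Sides + Area.Name, data = toy, FUN = sum)

# Sort by side, then by descending number of requests
counts[order(counts$Sides, -counts$Requests), ]
```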

One thing to note is that, among the different regions, the North Side has three community areas, Lake View, Lincoln Park, and Logan Square, with over a thousand service requests each. Meanwhile, the South and Far Southeast Sides have only one community area each with over two hundred requests.

Let’s look at the days of the week on which requests are logged.

barplot(table(factor(mice2$Creation.Day,levels=c('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday'))),main="Days when Service Requests are made")

Tuesdays seem to be slightly more popular than the rest of the days of the work week. Oddly enough, Saturday and Sunday are by far the least popular days to make requests.

Next, we will look at the days of the week on which the work was reported as completed.

barplot(table(factor(mice2$Completion.Day,levels=c('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday'))),main="Days in which Service Requests are completed")

There is a little variation, but more than three thousand service requests are reported as completed on each of the five days of the work week. City residents should also note that requests are not being fulfilled during the weekend.

We will also look at the months in which service requests are made.

barplot(table(factor(mice2$Creation.Month,levels=c('January','February','March','April','May','June','July','August','September','October','November','December'))),main="Monthly totals of created Service Requests")
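
As a shorter alternative to typing out the twelve month names, base R provides the built-in constant month.name. The sketch below uses a made-up vector of month names and, like the call above, assumes English month names.

```r
# Made-up month names standing in for the Creation.Month column
m <- c("March", "July", "July", "January")

# month.name is a built-in vector: "January", "February", ..., "December"
month_f <- factor(m, levels = month.name)

# Counts come out in calendar order, with zeros for months that do not occur
table(month_f)
```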

The distribution of requests is interesting. One could make the argument that, as temperatures moderate during the spring and summer, people increase their outdoor activities. More outdoor activity could, in turn, lead to more opportunities to see rodents outside, and so to more requests.

Finally, we can take a look at how fast the service requests are completed. I have not been able to locate any sort of benchmark for how quickly this type of work should be completed. For the sake of argument, however, I will propose that the work should be completed within seven days. We can quickly look at the proportion of service requests that were completed within seven days.

length(mice2$Elapsed.Time[mice2$Elapsed.Time <= 7])/length(mice2$Elapsed.Time)
## [1] 0.3742302
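
The same kind of proportion can be written more compactly as the mean of a logical vector. This is a sketch on made-up elapsed times (elapsed_toy is hypothetical), with na.rm = TRUE so that a missing value does not affect the result.

```r
# Made-up elapsed times in days, including one missing value
elapsed_toy <- c(2, 5, 7, 10, 30, 61, NA)

# TRUE counts as 1 and FALSE as 0, so the mean is the share within 7 days
share <- mean(elapsed_toy <= 7, na.rm = TRUE)
share
```

Here three of the six non-missing values are at most 7, so the share is 0.5.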

So around 37% of the requests are completed within a week. Somehow, this does not sound great. Let’s look at the distribution of elapsed time between service request creation and completion.

hist(mice2$Elapsed.Time,breaks=140,main="Elapsed time between Service Request creation and completion",xlab="Elapsed time between logged request and job completion (days)")

We can see that a large portion of the requests are being fulfilled within ten days. However, it’s hard to ignore that a sizable portion of the jobs are being completed as late as two months after the original request.
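
To put numbers on a reading like this, the elapsed times can be grouped into bins. The sketch below uses made-up whole-day values, and the bin edges are an arbitrary choice for illustration rather than any official service standard.

```r
# Made-up elapsed times in whole days
elapsed_toy <- c(1, 3, 6, 7, 9, 14, 20, 45, 58, 61)

# Bins are closed on the right: (0,7], (7,14], (14,30], (30,60], (60,Inf)
bins <- cut(elapsed_toy,
            breaks = c(0, 7, 14, 30, 60, Inf),
            labels = c("0-7", "8-14", "15-30", "31-60", "61+"),
            include.lowest = TRUE)

# Share of requests falling in each bin
prop.table(table(bins))

# Median elapsed time as a one-number summary
median(elapsed_toy)
```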

We will end the discussion here, but it is informative to see when these types of requests are made and the time it takes to actually have them result in rodent baiting.