For this project, I really want to work with a dataset I am familiar with. Full disclosure: for my IS606 class I am using a particular dataset:
“https://github.com/fivethirtyeight/data/tree/master/police-killings”
This data set combines two sources. The first is the Guardian’s project known as The Counted:
http://www.theguardian.com/us-news/ng-interactive/2015/jun/01/the-counted-map-us-police-killings
The second is Census data from the American Community Survey, which was overlaid onto the Counted data set using Google Maps (much more problematic). The dataset only covers the first six months of 2015, and I will be using only that GitHub dataset for my IS606 class to analyse the correlation between police shootings and factors such as income level and college education.
For this project (and this is a bit fluid and flexible at the moment), I want to compare the Guardian’s data to other fields, not just census data as that dataset did. So I am going to take, as best I can, all the up-to-date Guardian data (I think I’ll probably work with an 11/30 cut-off) to answer the following question:
Based on location/date, what factors correlate to Police Related Deaths?
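A rough sketch of how the 11/30 cut-off and the grouping by location might work. The field names (`date`, `state`), the assumption that the cut-off year is 2015, and the sample rows are all made up for illustration; the real Counted export may be laid out differently.

```python
from collections import Counter
from datetime import date

# Assumed cut-off: 11/30, with the year 2015 inferred from the dataset's coverage.
CUTOFF = date(2015, 11, 30)


def count_by_state(records, cutoff=CUTOFF):
    """Count records per state, keeping only those dated on or before the cut-off."""
    counts = Counter()
    for rec in records:
        if rec["date"] <= cutoff:
            counts[rec["state"]] += 1
    return counts


# Placeholder rows with made-up state codes; these are not real data.
sample = [
    {"date": date(2015, 1, 15), "state": "AA"},
    {"date": date(2015, 11, 30), "state": "AA"},  # on the cut-off day, so kept
    {"date": date(2015, 12, 1), "state": "AA"},   # after the cut-off, so dropped
    {"date": date(2015, 6, 2), "state": "BB"},
]

counts = count_by_state(sample)
```

Counting per state (rather than per city) is only one possible grain; it is chosen here because state-level figures are the easiest to match against other sources.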
It is a heavy question, one that I doubt I’ll be able to handle fully, but it is a prevalent issue for us as a society. The abundance of crowd-sourced data (especially in this turbulent time in our society) has made data more readily available. My goal is to see if I can link this crowd-sourced data into one database, possibly create a Shiny app (ambitious maybe?), and see if there is any justifiable link between police-related deaths and some (if not all) of the following: gun data, police force size, population data, police spending, etc.
And there is a lot of it! At the moment, I am still working out which data sources I want to pull from, but I have at least one focus: “The Counted” data, which will be my primary source. My goal with the Shiny app (again, MAYBE? It depends on time) would be to link directly to the Counted data so that I have up-to-date linkage.
Other sources that I have found:
Police Force Size Data: http://www.governing.com/gov-data/safety-justice/police-officers-per-capita-rates-employment-for-city-departments.html
Gun Sales Data: http://www.theguardian.com/news/datablog/2012/dec/17/how-many-guns-us#state
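To make the idea of linking these sources concrete, here is a minimal sketch that joins per-state death counts to one other factor and computes a Pearson correlation. The function names, the `officers_per_10k` measure, and every number below are hypothetical placeholders, not values taken from the sources above.

```python
import math


def join_by_state(deaths, factor):
    """Inner join: keep only the states that appear in both mappings."""
    shared = sorted(set(deaths) & set(factor))
    return [(state, deaths[state], factor[state]) for state in shared]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values keyed by made-up state codes.
deaths = {"AA": 2, "BB": 4, "CC": 6, "DD": 1}
officers_per_10k = {"AA": 10.0, "BB": 20.0, "CC": 30.0, "EE": 5.0}

rows = join_by_state(deaths, officers_per_10k)  # "DD" and "EE" drop out
r = pearson([d for _, d, _ in rows], [f for _, _, f in rows])
```

The inner join silently drops any state missing from either source, so it is worth checking how many rows survive before reading anything into `r`.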
Another possible piece to look at would be real-time data. Comparing real-time data would be much more difficult, as its availability is not universal. At this point I have yet to come up with anything concrete, but I am still looking for a source that I can match day for day to the Counted data. More on this in the coming weeks (any suggestions would be helpful).
The most difficult part of this comparison is finding comparable data. Unfortunately (and I know this from working a city job), data is DELAYED. Most government entities release data months, if not YEARS, after the fact, and no constantly updated source is available. This means I will most likely be performing analysis on “antiquated data.” I believe this will be an argument against the validity of any findings I have, but as this is more for educational purposes, I think it will be reasonable for this project.