DDP-Shiny Application and Reproducible Pitch

Suleman Wadur

06 February, 2018

Introduction

This presentation is part of the “Developing Data Products” class of the Data Science Specialization.

This presentation uses a Shiny app to create a reactive plot of arrest rates across the 50 US states in 1973. The data contains arrest rates for Murder, Assault and Rape per 100,000 residents for each state.

The app lets the user select a statistic for a particular state or crime, or view all of them. The selection is then presented in a simple scatter plot.

Link to Shiny App: https://swadur.shinyapps.io/1973-US-Crime-Rates/
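
The select-then-plot behaviour described above could be wired up roughly as follows. This is only a minimal sketch, not the code of the deployed app: the helper `filter_arrests()`, the builder `make_app()`, the input ids `state` and `crime`, and the "All" option are all assumptions made for illustration.

```r
# Hypothetical helper (not from the deployed app): subset the long-format
# data by state abbreviation and crime; "All" keeps every row.
filter_arrests <- function(df, state = "All", crime = "All") {
  if (state != "All") df <- df[df$Abbv == state, , drop = FALSE]
  if (crime != "All") df <- df[df$variable == crime, , drop = FALSE]
  df
}

# Long-format data: one row per state and crime.
crimes <- c("Murder", "Assault", "Rape")
arrests_long <- data.frame(
  Abbv = rep(state.abb, times = length(crimes)),
  variable = rep(crimes, each = length(state.abb)),
  value = unlist(USArrests[crimes], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical app builder; calling it requires the shiny and plotly packages.
make_app <- function() {
  ui <- shiny::fluidPage(
    shiny::selectInput("state", "State", choices = c("All", state.abb)),
    shiny::selectInput("crime", "Crime", choices = c("All", crimes)),
    plotly::plotlyOutput("plot")
  )
  server <- function(input, output, session) {
    # The plot re-renders whenever either input changes.
    output$plot <- plotly::renderPlotly({
      d <- filter_arrests(arrests_long, input$state, input$crime)
      plotly::plot_ly(d, x = ~Abbv, y = ~value, color = ~variable,
                      type = "scatter", mode = "markers")
    })
  }
  shiny::shinyApp(ui, server)
}
```

In an interactive session, `shiny::runApp(make_app())` would launch this sketch locally. Keeping the filtering in a plain function makes it easy to check without starting the app.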

How to run app:

About the data

The raw data is the USArrests dataset, one of the default R datasets. It contains the number of arrests per 100,000 residents for each crime in each of the 50 US states in 1973.

str(USArrests)
## 'data.frame':    50 obs. of  4 variables:
##  $ Murder  : num  13.2 10 8.1 8.8 9 7.9 3.3 5.9 15.4 17.4 ...
##  $ Assault : int  236 263 294 190 276 204 110 238 335 211 ...
##  $ UrbanPop: int  58 48 80 50 91 78 77 72 80 60 ...
##  $ Rape    : num  21.2 44.5 31 19.5 40.6 38.7 11.1 15.8 31.9 25.8 ...
summary(USArrests)
##      Murder          Assault         UrbanPop          Rape      
##  Min.   : 0.800   Min.   : 45.0   Min.   :32.00   Min.   : 7.30  
##  1st Qu.: 4.075   1st Qu.:109.0   1st Qu.:54.50   1st Qu.:15.07  
##  Median : 7.250   Median :159.0   Median :66.00   Median :20.10  
##  Mean   : 7.788   Mean   :170.8   Mean   :65.54   Mean   :21.23  
##  3rd Qu.:11.250   3rd Qu.:249.0   3rd Qu.:77.75   3rd Qu.:26.18  
##  Max.   :17.400   Max.   :337.0   Max.   :91.00   Max.   :46.00

Data Prep

To prepare the raw data, I use the built-in “state” dataset in R to assign state abbreviations. I also exclude the “UrbanPop” variable, which is the percentage of the population living in urban areas.

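# keep the three crime columns (drop UrbanPop) and add state abbreviations;
# USArrests rows and state.abb are both ordered alphabetically by state name
ArrestData <- USArrests[, c("Murder", "Assault", "Rape")]
ArrestData$Abbv <- state.abb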

str(ArrestData)
## 'data.frame':    50 obs. of  4 variables:
##  $ Murder : num  13.2 10 8.1 8.8 9 7.9 3.3 5.9 15.4 17.4 ...
##  $ Assault: int  236 263 294 190 276 204 110 238 335 211 ...
##  $ Rape   : num  21.2 44.5 31 19.5 40.6 38.7 11.1 15.8 31.9 25.8 ...
##  $ Abbv   : chr  "AL" "AK" "AZ" "AR" ...
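
The assignment above works because the rows of USArrests and the entries of state.abb are in the same order. A more defensive variant, sketched here as an assumption rather than as the method used in the app, looks each abbreviation up by state name. The names `abbr_lookup` and `arrests` are illustrative only.

```r
# Named vector mapping full state names to their abbreviations.
abbr_lookup <- setNames(state.abb, state.name)

# Keep the three crime columns and attach abbreviations by row name,
# so the result does not depend on row order.
arrests <- USArrests[, c("Murder", "Assault", "Rape")]
arrests$Abbv <- unname(abbr_lookup[rownames(arrests)])

# Fail loudly if any state name had no match.
stopifnot(!anyNA(arrests$Abbv))
```

The lookup gives the same result as the positional assignment here, but it would still be correct if the rows were sorted or filtered first.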

Sample records from the prepared dataset

head(ArrestData)
##            Murder Assault Rape Abbv
## Alabama      13.2     236 21.2   AL
## Alaska       10.0     263 44.5   AK
## Arizona       8.1     294 31.0   AZ
## Arkansas      8.8     190 19.5   AR
## California    9.0     276 40.6   CA
## Colorado      7.9     204 38.7   CO

Sample Plot

To plot the data, I first had to reshape it into long format so that multiple crimes can be plotted for each state. Here’s the code snippet and the resulting plot.
Use the Chrome browser if the plot doesn’t appear in Firefox.

library(plotly)

# reshape to long format: one row per state abbreviation and crime
df <- reshape2::melt(ArrestData, id.vars = 'Abbv')

#define axis labels
xlabel <- "States"
ylabel <- "Number of arrests <br />per 100,000 people"

#define legend style
l <- list(
  font = list(
    family = "sans-serif",
    size = 12,
    color = "#000"),
  bgcolor = "#D3D3D3",
  orientation = 'h'
  )

#render plot and assign to variable (showlegend is a layout option, not a legend option)
p <- plot_ly(data = df, x = ~Abbv, y = ~value, color = ~variable, type = "scatter", mode = "markers") %>%
  layout(title = 'US Arrests by Crimes for 50 states in 1973',
         xaxis = list(title = xlabel),
         yaxis = list(title = ylabel),
         legend = l,
         showlegend = TRUE
  )
p
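
To make the reshaping step in the snippet above more concrete, here is a minimal base R sketch that builds the same long layout (columns Abbv, variable, value) without reshape2. It is an illustration under that assumption, not the code used in the app, and the names `crimes` and `long` are hypothetical.

```r
crimes <- c("Murder", "Assault", "Rape")

# Stack the three crime columns on top of each other: 50 states x 3 crimes
# gives 150 rows, each holding one state abbreviation, one crime, one rate.
long <- data.frame(
  Abbv = rep(state.abb, times = length(crimes)),
  variable = factor(rep(crimes, each = nrow(USArrests)), levels = crimes),
  value = unlist(USArrests[crimes], use.names = FALSE),
  stringsAsFactors = FALSE
)

head(long, 3)
```

With one row per state and crime, the `color = ~variable` mapping in the plot call can draw one marker series per crime.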