Tutorial 1: Dynamic Data Visualization
Data in Use
This tutorial explores the interactive visualization capabilities of animated graphics in R using the plotly library. Before any visualization, a basic exploratory data analysis (EDA) is conducted to assess the quality of the data and to get familiar with the dataset.
The dataset is taken from the open-source Kaggle community and covers US violent crime rates from 1975 to 2015. Historical data is particularly well suited to interactive visualization, because an animated timeline can show how trends changed over the period in a way that is effective and easy for the end user to understand. The dataset was put together by The Marshall Project and contains more than 40 years of data on the four major crimes the FBI classifies as violent (homicide, rape, robbery and assault) in 68 police jurisdictions with populations of 250,000 or greater.
Dataset insight
Libraries and the Dataset in use
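# Loading the libraries (the list is inferred from the functions called later in the tutorial)
library(dplyr)         # %>%, filter(), select(), distinct()
library(ggplot2)       # static plots
library(ggpubr)        # ggarrange()
library(modelsummary)  # datasummary_skim()
library(plotly)        # animated and interactive plots
library(crosstalk)     # SharedData, filter_slider(), filter_checkbox(), bscols()
library(RColorBrewer)  # brewer.pal()
# Disabling scientific notation
options(scipen=999)
# Loading the dataset (NA values are kept for the EDA)
data1 <- read.csv("report.csv")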
# Quick look at the data inside the dataframe provided using different basic R commands
str(data1) # Returns the column names, variable types and the first few observations
## 'data.frame': 2829 obs. of 15 variables:
## $ report_year : int 1975 1975 1975 1975 1975 1975 1975 1975 1975 1975 ...
## $ agency_code : chr "NM00101" "TX22001" "GAAPD00" "CO00101" ...
## $ agency_jurisdiction: chr "Albuquerque, NM" "Arlington, TX" "Atlanta, GA" "Aurora, CO" ...
## $ population : int 286238 112478 490584 116656 300400 642154 864100 616120 422276 262103 ...
## $ violent_crimes : int 2383 278 8033 611 1215 1259 16086 11386 3350 1937 ...
## $ homicides : int 30 5 185 7 33 25 259 119 63 68 ...
## $ rapes : int 181 28 443 44 190 137 463 453 192 71 ...
## $ assaults : int 1353 132 3518 389 463 347 6309 3036 755 976 ...
## $ robberies : int 819 113 3887 171 529 750 9055 7778 2340 822 ...
## $ months_reported : int 12 12 12 12 12 12 12 12 12 12 ...
## $ crimes_percapita : num 833 247 1637 524 404 ...
## $ homicides_percapita: num 10.48 4.45 37.71 6 10.99 ...
## $ rapes_percapita : num 63.2 24.9 90.3 37.7 63.2 ...
## $ assaults_percapita : num 473 117 717 333 154 ...
## $ robberies_percapita: num 286 100 792 147 176 ...
colnames(data1) # Returns all column names of the dataset
## [1] "report_year" "agency_code" "agency_jurisdiction"
## [4] "population" "violent_crimes" "homicides"
## [7] "rapes" "assaults" "robberies"
## [10] "months_reported" "crimes_percapita" "homicides_percapita"
## [13] "rapes_percapita" "assaults_percapita" "robberies_percapita"
summary(data1) # Returns a more detailed summary for each variable in the dataset
## report_year agency_code agency_jurisdiction population
## Min. :1975 Length:2829 Length:2829 Min. : 100763
## 1st Qu.:1985 Class :character Class :character 1st Qu.: 377931
## Median :1995 Mode :character Mode :character Median : 536614
## Mean :1995 Mean : 795698
## 3rd Qu.:2005 3rd Qu.: 816856
## Max. :2015 Max. :8550861
## NA's :69
## violent_crimes homicides rapes assaults
## Min. : 154 Min. : 1.0 Min. : 15.0 Min. : 15
## 1st Qu.: 3015 1st Qu.: 32.0 1st Qu.: 176.2 1st Qu.: 1467
## Median : 5136 Median : 64.0 Median : 291.0 Median : 2597
## Mean : 29632 Mean : 398.4 Mean : 416.3 Mean : 4405
## 3rd Qu.: 9058 3rd Qu.: 131.0 3rd Qu.: 465.0 3rd Qu.: 4556
## Max. :1932274 Max. :24703.0 Max. :3899.0 Max. :71030
## NA's :35 NA's :34 NA's :75 NA's :76
## robberies months_reported crimes_percapita homicides_percapita
## Min. : 83 Min. : 0.00 Min. : 16.49 Min. : 0.210
## 1st Qu.: 1032 1st Qu.:12.00 1st Qu.: 625.08 1st Qu.: 6.955
## Median : 1940 Median :12.00 Median : 949.68 Median :11.980
## Mean : 4000 Mean :11.87 Mean :1093.05 Mean :15.373
## 3rd Qu.: 3610 3rd Qu.:12.00 3rd Qu.:1409.51 3rd Qu.:20.230
## Max. :107475 Max. :12.00 Max. :4352.83 Max. :94.740
## NA's :75 NA's :137 NA's :35 NA's :34
## rapes_percapita assaults_percapita robberies_percapita
## Min. : 1.64 Min. : 1.61 Min. : 11.46
## 1st Qu.: 35.77 1st Qu.: 319.09 1st Qu.: 210.24
## Median : 55.90 Median : 487.48 Median : 374.40
## Mean : 59.31 Mean : 566.60 Mean : 459.97
## 3rd Qu.: 77.80 3rd Qu.: 728.24 3rd Qu.: 612.00
## Max. :199.30 Max. :2368.22 Max. :2337.52
## NA's :75 NA's :76 NA's :75
nlevels(as.factor(data1$agency_jurisdiction)) # Check the number of jurisdiction areas
## [1] 69
nlevels(as.factor(data1$report_year)) # Check the number of years reported
## [1] 41
- The dataset contains a total of 2829 observations
- Data is available on 15 variables, 2 of which are categorical (character) and the rest numeric
- 41 years of observations from 69 jurisdiction areas (a count that includes the ‘United States’ entry alongside the police jurisdictions) give 41 × 69 = 2829 observations in total
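The arithmetic in the last bullet only holds if every jurisdiction area has exactly one row per year. A minimal sketch of how such a check could look in base R is given below; the `toy` data frame and its values are hypothetical stand-ins, not the tutorial's data.

```r
# Toy stand-in for the real data: 3 areas observed in 4 years (hypothetical values)
toy <- expand.grid(
  report_year = 1975:1978,
  agency_jurisdiction = c("Area A", "Area B", "Area C"),
  stringsAsFactors = FALSE
)

n_years <- length(unique(toy$report_year))
n_areas <- length(unique(toy$agency_jurisdiction))

# Rows per area: a balanced panel has exactly one row per year for every area
rows_per_area <- table(toy$agency_jurisdiction)

# No (year, area) combination may appear twice
no_duplicates <- anyDuplicated(toy[c("report_year", "agency_jurisdiction")]) == 0

is_balanced <- all(rows_per_area == n_years) &&
  no_duplicates &&
  nrow(toy) == n_years * n_areas
print(is_balanced)
```

Applied to the real data, the same idea would use `data1` in place of `toy`.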
Dataset in focus
We proceed with the ‘modelsummary’ library to showcase its capabilities for summarizing data (descriptive statistics) and, later on, statistical models.
As the summary below shows, ‘report_year’ is the only numeric variable without missing values, which is reasonable given the dataset structure. Missing values in the ‘population’ variable come from only two jurisdiction areas: ‘United States’ and ‘Louisville, KY’. For ‘Louisville, KY’, population data is missing up to and including 2002 (data present for 2003 - 2015), while for ‘United States’ it is missing entirely. Without population data the ‘per capita’-type variables cannot be calculated, which results in NA values for all of them (demonstrated using the ‘homicides_percapita’ variable; note that for Louisville the raw counts are missing in the same years). For the ‘United States’ jurisdiction area, however, the ‘percapita’-type variables are still present.
As stated in the dataset description provided by the data source regarding this calculation: “[The authors] calculated the rate of crime in each category and for all violent crime, per 100,000 residents in the jurisdiction, based on the FBI’s estimated population for that year. [The authors] used the 2014 estimated population to calculate 2015 crime rates per capita”.
A more in-depth analysis of the above-mentioned jurisdiction areas could be conducted because of their unusual number of missing values. The need for it is noted, but it is not carried out here, since the main focus of the tutorial is visualization.
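To make the "per 100,000 residents" calculation concrete, the sketch below reproduces the first three ‘homicides_percapita’ values from the `str()` output shown earlier, using the populations and homicide counts printed there. The helper name `rate_per_100k` is hypothetical and only serves the illustration.

```r
# 1975 values as printed in the str() output of the tutorial
# (Albuquerque, NM; Arlington, TX; Atlanta, GA)
population <- c(286238, 112478, 490584)
homicides  <- c(30, 5, 185)

# Hypothetical helper: rate per 100,000 residents
rate_per_100k <- function(count, population) {
  count / population * 100000
}

homicide_rate <- round(rate_per_100k(homicides, population), 2)
print(homicide_rate)
# 10.48, 4.45 and 37.71, matching homicides_percapita in the str() output

# A missing population propagates to the rate
missing_rate <- rate_per_100k(30, NA)
print(missing_rate)  # NA
```

The last line shows why a missing ‘population’ value normally leads to a missing per capita value as well.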
# Quick summary with default parameters
datasummary_skim(data1)
| Variable | Unique (#) | Missing (%) | Mean | SD | Min | Median | Max |
|---|---|---|---|---|---|---|---|
| report_year | 41 | 0 | 1995.0 | 11.8 | 1975.0 | 1995.0 | 2015.0 |
| population | 2741 | 2 | 795698.1 | 1012450.6 | 100763.0 | 536614.5 | 8550861.0 |
| violent_crimes | 2527 | 1 | 29632.5 | 172863.0 | 154.0 | 5135.5 | 1932274.0 |
| homicides | 522 | 1 | 398.4 | 2281.3 | 1.0 | 64.0 | 24703.0 |
| rapes | 879 | 3 | 416.3 | 479.8 | 15.0 | 291.0 | 3899.0 |
| assaults | 2281 | 3 | 4405.1 | 6977.3 | 15.0 | 2597.0 | 71030.0 |
| robberies | 2149 | 3 | 4000.2 | 8653.9 | 83.0 | 1940.0 | 107475.0 |
| months_reported | 13 | 5 | 11.9 | 1.1 | 0.0 | 12.0 | 12.0 |
| crimes_percapita | 2782 | 1 | 1093.0 | 676.9 | 16.5 | 949.7 | 4352.8 |
| homicides_percapita | 1873 | 1 | 15.4 | 12.4 | 0.2 | 12.0 | 94.7 |
| rapes_percapita | 2431 | 3 | 59.3 | 32.0 | 1.6 | 55.9 | 199.3 |
| assaults_percapita | 2724 | 3 | 566.6 | 369.4 | 1.6 | 487.5 | 2368.2 |
| robberies_percapita | 2707 | 3 | 460.0 | 340.9 | 11.5 | 374.4 | 2337.5 |
data1[!complete.cases(data1$population), ] %>%
  distinct(agency_jurisdiction)
## agency_jurisdiction
## 1 Louisville, KY
## 2 United States
# Louisville and United States jurisdiction areas, missing values by year (report years are limited to those after 2001 for brevity; earlier years were checked beforehand)
data1 %>%
filter(agency_jurisdiction == 'Louisville, KY' & report_year > 2001) %>%
  select(population, report_year)
## population report_year
## 1 NA 2002
## 2 623771 2003
## 3 624697 2004
## 4 623735 2005
## 5 626018 2006
## 6 624030 2007
## 7 629679 2008
## 8 631260 2009
## 9 660582 2010
## 10 665152 2011
## 11 666200 2012
## 12 671120 2013
## 13 677710 2014
## 14 680550 2015
data1 %>%
filter(agency_jurisdiction == 'United States' & report_year > 2001) %>%
  select(population, report_year)
## population report_year
## 1 NA 2002
## 2 NA 2003
## 3 NA 2004
## 4 NA 2005
## 5 NA 2006
## 6 NA 2007
## 7 NA 2008
## 8 NA 2009
## 9 NA 2010
## 10 NA 2011
## 11 NA 2012
## 12 NA 2013
## 13 NA 2014
## 14 NA 2015
# Presence/absence of the 'percapita'-type variables due to missing values
data1 %>%
  filter(agency_jurisdiction == 'Louisville, KY' & report_year > 2001) %>%
  select(report_year, homicides, homicides_percapita)
## report_year homicides homicides_percapita
## 1 2002 NA NA
## 2 2003 50 8.02
## 3 2004 66 10.57
## 4 2005 55 8.82
## 5 2006 50 7.99
## 6 2007 71 11.38
## 7 2008 71 11.28
## 8 2009 62 9.82
## 9 2010 52 7.87
## 10 2011 48 7.22
## 11 2012 62 9.31
## 12 2013 48 7.15
## 13 2014 56 8.26
## 14 2015 81 11.90
data1 %>%
filter(agency_jurisdiction == 'United States' & report_year > 2001) %>%
  select(report_year, homicides, homicides_percapita)
## report_year homicides homicides_percapita
## 1 2002 16229 5.6
## 2 2003 16528 5.7
## 3 2004 16148 5.5
## 4 2005 16740 5.6
## 5 2006 17309 5.8
## 6 2007 17128 5.7
## 7 2008 16465 5.4
## 8 2009 15399 5.0
## 9 2010 14722 4.8
## 10 2011 14661 4.7
## 11 2012 14827 4.7
## 12 2013 14319 4.5
## 13 2014 14164 4.4
## 14 2015 15696 4.9
Brief look at NAs
Although the data has only a limited number of NA cases (no more than 5% of the total observations per variable), which are unlikely to affect the further analysis, a brief additional EDA of the NA cases is conducted below for the purposes of the present tutorial.
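The per-variable share of missing values mentioned above can also be computed directly in base R, without a summary package. The following is a minimal sketch on a hypothetical toy data frame; the column names mimic the dataset, the values do not come from it.

```r
# Toy data frame with hypothetical values (not the tutorial's data)
toy <- data.frame(
  report_year = c(1975, 1976, 1977, 1978),
  population  = c(NA, 120000, 125000, NA),
  homicides   = c(10, NA, 12, 14)
)

# Share of missing values per variable, in percent
na_share <- round(colMeans(is.na(toy)) * 100, 1)
print(na_share)

# Number of rows with at least one missing value
n_incomplete <- sum(!complete.cases(toy))
print(n_incomplete)
```

Replacing `toy` with `data1` would give the shares for the real data, which can then be compared against the 5% threshold.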
dataNA <- data1[!complete.cases(data1), ]
str(dataNA) # Shows the basic structure, including the number of rows from the original dataset that contain NAs and were filtered into the new one
## 'data.frame': 141 obs. of 15 variables:
## $ report_year : int 1975 1975 1976 1976 1977 1977 1978 1978 1979 1979 ...
## $ agency_code : chr "KY05680" "" "KY05680" "" ...
## $ agency_jurisdiction: chr "Louisville, KY" "United States" "Louisville, KY" "United States" ...
## $ population : int NA NA NA NA NA NA NA NA NA NA ...
## $ violent_crimes : int NA 1039710 NA 1004210 NA 1029580 NA 1085550 NA 1208030 ...
## $ homicides : int NA 20510 NA 18780 NA 19120 NA 19560 NA 21460 ...
## $ rapes : int NA NA NA NA NA NA NA NA NA NA ...
## $ assaults : int NA NA NA NA NA NA NA NA NA NA ...
## $ robberies : int NA NA NA NA NA NA NA NA NA NA ...
## $ months_reported : int NA NA NA NA NA NA NA NA NA NA ...
## $ crimes_percapita : num NA 488 NA 468 NA ...
## $ homicides_percapita: num NA 9.6 NA 8.7 NA 8.8 NA 9 NA 9.8 ...
## $ rapes_percapita : num NA NA NA NA NA NA NA NA NA NA ...
## $ assaults_percapita : num NA NA NA NA NA NA NA NA NA NA ...
## $ robberies_percapita: num NA NA NA NA NA NA NA NA NA NA ...
datasummary_skim(dataNA) # Basic stats using the same package
| Variable | Unique (#) | Missing (%) | Mean | SD | Min | Median | Max |
|---|---|---|---|---|---|---|---|
| report_year | 41 | 0 | 2003.3 | 13.7 | 1975.0 | 2013.0 | 2015.0 |
| population | 73 | 49 | 891732.2 | 1092370.1 | 191992.0 | 651480.0 | 8550861.0 |
| violent_crimes | 107 | 25 | 552730.0 | 707929.3 | 626.0 | 9149.5 | 1932274.0 |
| homicides | 100 | 24 | 7250.0 | 9318.6 | 8.0 | 145.0 | 24703.0 |
| rapes | 66 | 53 | 454.5 | 433.6 | 68.0 | 346.5 | 2244.0 |
| assaults | 66 | 54 | 3691.8 | 4365.8 | 234.0 | 2469.0 | 30546.0 |
| robberies | 67 | 53 | 2337.8 | 2784.0 | 270.0 | 1447.5 | 16946.0 |
| months_reported | 2 | 97 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| crimes_percapita | 107 | 25 | 683.7 | 346.0 | 88.4 | 595.2 | 1817.1 |
| homicides_percapita | 97 | 24 | 10.7 | 10.0 | 1.2 | 8.1 | 59.3 |
| rapes_percapita | 67 | 53 | 55.2 | 28.3 | 5.9 | 53.1 | 151.6 |
| assaults_percapita | 66 | 54 | 437.2 | 247.4 | 35.1 | 382.6 | 1163.2 |
| robberies_percapita | 67 | 53 | 275.3 | 177.1 | 40.2 | 222.9 | 784.3 |
As the basic statistics show, 141 rows of the original dataset contain at least one NA value; some of them have several variables missing and some only one. Count variables and their per capita counterparts are matched in pairs, which naturally results in repeated NA shares: e.g. the share of NA cases for ‘assaults’ in the new data split equals 54 %, and so does the share for ‘assaults_percapita’. In the summary of the full dataset, the minimum of the ‘assaults’ variable is 15 (the other count variables likewise have minimums above zero), so no zero counts are recorded anywhere. It is therefore unclear whether an NA stands for no crimes registered in the period in question or for a complete lack of information on the metric.
# Alternative way to check the number of missing cases by jurisdiction area to identify the areas that require additional analysis
# NB: Only the two areas identified earlier are selected
table(dataNA[dataNA$agency_jurisdiction == 'Louisville, KY' | dataNA$agency_jurisdiction == 'United States', ]$agency_jurisdiction)
##
## Louisville, KY United States
## 29 41
As mentioned above, the ‘Louisville, KY’ and ‘United States’ jurisdiction areas are by far the largest contributors of missing data (29 and 41 of the 141 rows with at least one NA value, respectively), while each of the remaining jurisdiction areas has no more than 3 such rows. Since these two areas account for so many of the NA cases, further analysis of the missing data could be limited to them.
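The same count can be obtained for every jurisdiction area at once, which avoids having to know the problematic areas in advance. Below is a minimal base R sketch on hypothetical toy data; the area names and values are illustrative assumptions.

```r
# Toy data with hypothetical values (not the tutorial's data)
toy <- data.frame(
  agency_jurisdiction = c("Area A", "Area A", "Area B", "Area B", "Area C", "Area C"),
  population = c(NA, NA, 300000, 310000, 500000, NA),
  homicides  = c(10, NA, 20, 25, 40, 45),
  stringsAsFactors = FALSE
)

# TRUE for rows with at least one missing value
incomplete <- !complete.cases(toy)

# Number of incomplete rows per area, largest first
# (areas without any incomplete row do not appear in the table)
na_rows_by_area <- sort(table(toy$agency_jurisdiction[incomplete]), decreasing = TRUE)
print(na_rows_by_area)
```

With the real data, `toy` would be replaced by `data1`, and the areas at the top of the table are the candidates for a closer look.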
Static Visualizations
# Further visualizations are built on the dataset with NAs excluded
data2 <- na.omit(data1)
# A quick glance at what kind of a distribution it is
a1 <- ggplot(data2, aes(x = population)) +
geom_histogram(fill="darkred") +
labs(y="Frequency", x="Population") +
theme_bw()
# Taking a closer look at the most 'populated' area of the graph
a2 <- ggplot(data2, aes(x = population)) +
geom_histogram(fill="red") +
labs(y="Frequency", x="Population") +
theme_bw() +
xlim(0, 1000000)
ggarrange(a1, a2,
          ncol = 2, nrow = 1)
Descriptive statistics of the ‘population’ variable (rows with NAs excluded):
| vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2688 | 793125.7 | 1010316 | 534129 | 596294.3 | 266661.9 | 100763 | 8473938 | 8373175 | 5.097839 | 30.26807 | 19486.9 |
A quick look at the distribution shows that the values range from 100,763 to 8,473,938, with a mean of 793,125.7. The median (534,129) is smaller than the mean, which indicates a clear right skew: most observations come from less populated areas, while a few very large ones stretch the right tail. This is also reflected in the positive skewness (5.097839). Since the kurtosis is large and positive (30.26807), the distribution is leptokurtic, i.e. heavy-tailed. Apart from that, the part of the distribution concentrated on the left side of the histogram looks close to normal and resembles a bell curve.
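For readers who want to see where such skewness and kurtosis figures come from, the sketch below computes simple moment-based versions in base R on a hypothetical right-skewed sample. This is a generic textbook estimator and an assumption of this example; statistical packages apply slightly different corrections, so their numbers may differ a little.

```r
# Hypothetical right-skewed sample (not the tutorial's data)
x <- c(1, 2, 2, 3, 3, 3, 4, 20)

n  <- length(x)
m2 <- sum((x - mean(x))^2) / n   # second central moment
m3 <- sum((x - mean(x))^3) / n   # third central moment
m4 <- sum((x - mean(x))^4) / n   # fourth central moment

skewness        <- m3 / m2^1.5     # > 0 for a longer right tail
excess_kurtosis <- m4 / m2^2 - 3   # > 0 for heavier tails than the normal distribution

print(c(mean = mean(x), median = median(x)))
print(c(skewness = skewness, excess_kurtosis = excess_kurtosis))
```

In this sample the single large value pulls the mean above the median and makes both measures positive, the same pattern described for the population variable.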
# Using the knowledge of the population distribution,
# the data can be divided into three approximately equal groups for the further purposes of data visualization
data2$population <- as.numeric(data2$population)
data2$populationCH <- NA_character_
data2$populationCH[data2$population < 400000] <- "Less than 400k"
# >= keeps values exactly equal to a cut-off (400000 or 650000) from staying NA
data2$populationCH[data2$population >= 400000 & data2$population < 650000] <- "400k - 650k"
data2$populationCH[data2$population >= 650000] <- "More than 650k"
# Explicit levels keep the groups in ascending order on the plots
data2$populationCH <- factor(data2$populationCH, levels = c("Less than 400k", "400k - 650k", "More than 650k"))
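With interval-based grouping it matters which group a value exactly at a cut-off falls into. As an alternative sketch (not the approach used in the tutorial), base R's `cut()` builds the three groups in one call and makes the boundary handling explicit; the population values below are hypothetical and chosen to include both cut-off points.

```r
# Hypothetical population values, including both cut-off points
population <- c(150000, 399999, 400000, 649999, 650000, 2000000)

population_group <- cut(
  population,
  breaks = c(-Inf, 400000, 650000, Inf),
  labels = c("Less than 400k", "400k - 650k", "More than 650k"),
  right  = FALSE   # intervals are closed on the left, e.g. [400000, 650000)
)

# Number of observations per group
print(table(population_group))

# Share of each group in percent
print(round(prop.table(table(population_group)) * 100, 1))
```

Because `cut()` returns an ordered set of factor levels, the groups also appear in ascending order on a bar chart without further work.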
# Visualization by separate population groups
b1 <- ggplot(data2, aes(x = populationCH, fill = populationCH)) +
geom_bar() +
scale_fill_brewer(palette = "YlOrRd") +
labs(x = "Population group", y = "N", title = "Distribution of population groups") +
theme_bw() +
theme(legend.position = "none")
b2 <- ggplot(data2, aes(x = populationCH, y = after_stat(count / sum(count) * 100), fill = populationCH)) +
  geom_bar() +
  scale_fill_brewer(palette = "YlOrRd") +
  labs(x = "Population group", y = "%", title = "Distribution of population groups") +
  theme_bw() +
  theme(legend.position = "none") +
  # Percentage labels above the bars; after_stat() replaces the deprecated ..count.. notation
  stat_count(aes(label = after_stat(round(count / sum(count) * 100, digits = 1))), vjust = 0,
             geom = "text", position = "identity")
ggarrange(b1, b2,
          ncol = 2, nrow = 1)
Ideally, in a full version of the EDA, similar static visualizations would be built for all the studied variables. They are omitted in the present tutorial.
Dynamic Visualizations
The following section of the tutorial contains various dynamic visualization plots built on the dataset in use, with the sole purpose of showcasing how helpful dynamic plots can be compared to their static counterparts when it comes to historical data such as the data studied in the present tutorial. Interactive plots:
- Can be updated based on actions performed by the user
- Allow the user to hover over an object, zoom in, etc.
- Increase one’s ability to explore the story the data are telling
Dynamic (animated) plots included in data analysis reports, in contrast, require no user input once the code that generates the graphics is written: they change automatically.
Note that in all the graphs below the markers carry information on the jurisdiction area a data point belongs to (shown on hover). The remaining variables used to generate each graph are indicated on the axes.
The visualizations showcased below could be used to report on how various crime-related trends fluctuated over the studied years, presenting them to stakeholders in an easy-to-understand visual way.
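Before turning to the crime data, it may help to see the animation mechanism in isolation. The sketch below uses a hypothetical toy data frame (the names `toy`, `area`, `x` and `y` are assumptions of this example) and shows the two arguments the animated plots rely on: `frame`, which splits the data into animation steps, and `ids`, which tells plotly which points are the same object across steps. The timing values passed to `animation_opts()` are arbitrary.

```r
library(plotly)

# Hypothetical toy data: three areas observed in two years
toy <- data.frame(
  year = rep(c(2000, 2001), each = 3),
  area = rep(c("Area A", "Area B", "Area C"), times = 2),
  x    = c(1, 2, 3, 2, 3, 4),
  y    = c(3, 1, 2, 4, 2, 3)
)

p <- toy %>%
  plot_ly(x = ~x, y = ~y, hoverinfo = "text", text = ~area) %>%
  # frame: one animation frame per year
  # ids: keeps each area as the same object between frames, so markers move smoothly
  add_markers(frame = ~year, ids = ~area) %>%
  # Frame duration and transition time, both in milliseconds
  animation_opts(frame = 1000, transition = 500, easing = "linear")

# Printing `p` in an interactive session or a report renders the plot
# together with a play button and a slider
```

Slower frames with a shorter transition tend to make individual years easier to read, while faster settings emphasize the overall trend.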
Simple dynamic visualization with no parameters changed
#The original full dataset is used
data1 %>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_markers(frame = ~report_year, ids = ~agency_code)%>%
layout(title = "Relationship between crime types in police jurisdiction areas: robbery and homicide",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita"))%>%
  hide_legend()
Updated slider: color and text size
a1 <- data1%>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_markers(frame = ~report_year, ids = ~agency_code, marker = list(color = "darkred")) %>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita"))%>%
hide_legend()
a1 %>%
  animation_slider(currentvalue = list(prefix = NULL, font = list(color = "darkred", size = 30)))
Coloring according to population size as a way to add another variable to the plot
a2 <- data1%>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction)) %>%
  add_markers(frame = ~report_year, ids = ~agency_code, color = ~population, colors = brewer.pal(9, "YlOrRd")) %>% # the YlOrRd palette provides at most 9 colors
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita"))%>%
hide_legend()
a2 %>%
  animation_slider(currentvalue = list(prefix = NULL, font = list(color = "darkred", size = 30)))
New layer of markers
data1975 <- data1 %>%
filter(report_year == 1975)
data1 %>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_text(x = 1250, y = 50, text = ~report_year, frame = ~report_year,
textfont = list(color = toRGB("gray95"), size = 200)) %>%
add_markers(data = data1975, marker = list(color = toRGB("gray20"), opacity = 0.5)) %>%
add_markers(frame = ~report_year, ids = ~agency_code, data = data1, marker = list(color = "darkred"))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita"))%>%
animation_slider(hide = TRUE)%>%
  hide_legend()
Linking plots and Filtering
Clean base plot
shared_crime <- SharedData$new(data1975)
cols <- toRGB(RColorBrewer::brewer.pal(3, "PRGn"))
p1 <- shared_crime%>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_markers(marker = list(color = "red",size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide, 1975",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita"))%>%
hide_legend()
p2 <- shared_crime%>%
plot_ly(x = ~robberies_percapita, y = ~rapes_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_markers(marker = list(color = "darkred",size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and rape, 1975",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Rapes per capita"))%>%
hide_legend()
subplot(p1, p2, titleY = TRUE, titleX = TRUE, shareX = TRUE)%>%
  hide_legend()
Manual selection of a group
data1975N <- data2 %>%
filter(report_year == 1975)
shared_crime2 <- SharedData$new(data1975N, key =~ populationCH)
p3 <- shared_crime2 %>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
group_by(populationCH)%>%
add_markers(marker = list(color = "red",size = 15)) %>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide, 1975",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita"))%>%
hide_legend()
p4 <- shared_crime2%>%
plot_ly(x = ~robberies_percapita, y = ~rapes_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_markers(marker = list(color = "darkred", size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and rape, 1975",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Rapes per capita"))%>%
hide_legend()
subplot(p3, p4, titleY = TRUE, titleX = TRUE, shareX = TRUE) %>%
  hide_legend()
Filtering using a slider
p5 <- shared_crime %>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction))%>%
add_markers(marker = list(color = "red", size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide, 1975",
xaxis = list(title = "Robberies per capita", range = c(0, 1500)),
         yaxis = list(title = "Homicides per capita", range = c(0, 50))) %>% # range belongs inside the yaxis list
hide_legend()
p6 <- shared_crime%>%
plot_ly(x = ~robberies_percapita, y = ~rapes_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction)) %>%
add_markers(marker = list(color = "darkred", size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and rape, 1975",
xaxis = list(title = "Robberies per capita", range = c(0, 1500)),
         yaxis = list(title = "Rapes per capita", range = c(0, 100))) %>% # range belongs inside the yaxis list
hide_legend()
bscols(list(p5, p6, filter_slider(id = "robberies_percapita", label = "Robberies", sharedData = shared_crime, column = ~robberies_percapita)))
Filtering using a checkbox
shared_crime2 <- SharedData$new(data1975N, key =~ populationCH)
p7 <- shared_crime2 %>%
plot_ly(x = ~robberies_percapita, y = ~homicides_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction), color =~ populationCH)%>%
add_markers(marker = list(color = "red", size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and homicide, 1975",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Homicides per capita")) %>%
hide_legend()
p8 <- shared_crime2 %>%
plot_ly(x = ~robberies_percapita, y = ~rapes_percapita, hoverinfo = "text",
text = ~paste("Agency jurisdiction:", agency_jurisdiction), color =~ populationCH)%>%
add_markers(marker = list(color = "darkred", size = 15))%>%
layout(title = "Relationship between crimes in police jurisdiction areas: robbery and rape, 1975",
xaxis = list(title = "Robberies per capita"),
yaxis = list(title = "Rapes per capita"))%>%
hide_legend()
bscols(list(p7, p8, filter_checkbox(id = "populationCH", label = "Area population", sharedData = shared_crime2, group =~ populationCH)))
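The linked examples above create shared data objects in two ways: with the default key and with `key = ~populationCH`. The sketch below illustrates the difference on a hypothetical toy data frame (the names `toy`, `area`, `group` and `rate` are assumptions of this example). It is meant as an illustration of the idea, not a full account of how crosstalk resolves selections.

```r
library(crosstalk)

# Hypothetical toy data (not the tutorial's data)
toy <- data.frame(
  area  = c("Area A", "Area B", "Area C", "Area D"),
  group = c("small", "small", "large", "large"),
  rate  = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Default key: one key per row (taken from the row names),
# so linked views treat every row as a separate unit
shared_rows <- SharedData$new(toy)

# Group key: rows with the same value share a key,
# so linked views treat the whole group as one unit
shared_groups <- SharedData$new(toy, key = ~group)

row_keys   <- shared_rows$key()
group_keys <- as.character(shared_groups$key())

print(row_keys)
print(group_keys)
```

With the row-level key a click in one plot highlights a single jurisdiction in the other, whereas the group-level key highlights every jurisdiction in the same population group.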