1 Map 1: Randomly Selected Gas Stations

1.1 Overview

Our first visualization will be an interactive map showing the locations of gas stations across the continental United States. Each gas station is a point on the map, and hovering over a point shows its state, county, address, and zip code.

1.2 Randomly Selecting Data

Since our source data contains nearly 73,000 observations, we will randomly select 500 observations.

gas<-read.csv("https://pengdsci.github.io/datasets/POC/POC.csv")

rand.df <- gas[sample(nrow(gas), size=500), ]

1.3 Mapping the Data

We will use the plotly package to map the data onto an interactive map.

g <- list(      scope = 'usa',   # 'p' is not a recognized plotly geo scope
                projection = list(type = 'albers usa'),
                showland = TRUE,
                landcolor = toRGB("gray95"),
                subunitcolor = toRGB("gray85"),
                countrycolor = toRGB("gray85"),
                countrywidth = 0.5,
                subunitwidth = 0.5
)


fig <- plot_geo(rand.df, lat = ~ycoord, lon = ~xcoord) %>% 
  add_markers( text = ~paste(STATE, county, ADDRESS, ZIPnew,
                             
                             sep = "<br>"),
               #color = , 
               symbol = "circle", 
               #size = , 
               hoverinfo = "text")   %>% 
  layout( title = 'Randomly Selected US Gas Stations', 
          geo = g )

fig

The above is a simple yet effective way to visualize data on a map. Our next map will be a little more complex.

2 Map 2: Crime in Philadelphia, 2023

2.1 Overview

For this visualization, we will begin with a dataset of crimes committed in the city of Philadelphia between 2015 and early March of 2024.

2.2 Preparing the Data

We will need to subset the 2023 data before we can overlay it on a map. We will use the stringr library to extract the year from each date.

crime<-read.csv("https://pengdsci.github.io/STA553VIZ/w08/PhillyCrimeSince2015.csv")

df<-crime

df$year <- str_extract(df$date, "\\d{4}")
df$year <- as.numeric(df$year)

crime23<-subset(df, year==2023)

write.csv(crime23, "C:\\Users\\Alex\\Documents\\R\\Grad\\553\\datasets\\wk7.csv")

A copy of the 2023 data can be found at https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw7/wk7.csv

#remove observations with missing values - at least one has a missing value for coordinates
crime23.nona<-na.omit(crime23)

2.3 Mapping the Data

Finally, we will map the incident data using the leaflet package.

color2 <- rep("red", nrow(crime23.nona))  # one color per incident; length() would count columns
color2[which(crime23.nona$fatal=="Nonfatal")] <- "blue"
color2[which(crime23.nona$fatal=="Fatal")] <- "red"



label.msg <- paste("Street:", crime23.nona$street_name,    
                   "<br>Block Number:",crime23.nona$block_number,
                   "<br>Neighborhood:", crime23.nona$neighborhood,
                   "<br>Incident Type:", crime23.nona$fatal)


leaflet(crime23.nona) %>%
  addTiles() %>% 
  setView(lng=mean(crime23.nona$lng), lat=mean(crime23.nona$lat), zoom = 11) %>%
  addProviderTiles(providers$Esri.WorldGrayCanvas) %>%
  addCircleMarkers(
            ~lng, 
            ~lat,
            color = color2,
            stroke = FALSE, 
            fillOpacity = 0.5,
            popup= ~label.msg)  %>%
  addLegend(position = "bottomright", 
            colors = c("red", "blue"),
            labels= c("Fatal", "Nonfatal"),
            title= "Type of Incident",
            opacity = 0.4)

Our resulting map is fully interactive: clicking a dot shows the details of the incident.

Please note that some spots on the map appear purple. This is due to overlapping semi-transparent red and blue dots, indicating both fatal and nonfatal incidents at the same address. For example, 1000 E Bristol Street in Juniata saw an incident with two victims, one of whom died. For clarification on any confusing point, please refer to the dataset linked at the end of Preparing the Data.

3 Map 3: Philadelphia Shootings, 2015-2024

3.1 Overview

Our next visualization focuses specifically on the demographics of shooting victims. The data is from OpenDataPhilly and can be accessed here: https://opendataphilly.org/datasets/shooting-victims (see the code below for the raw, direct link). We will again use leaflet.

philly.data<-read.csv("https://phl.carto.com/api/v2/sql?q=SELECT+*,+ST_Y(the_geom)+AS+lat,+ST_X(the_geom)+AS+lng+FROM+shootings&filename=shootings&format=csv&skipfields=cartodb_id")
phillyNeighborShooting  <- na.omit(st_read("https://pengdsci.github.io/STA553VIZ/w08/PhillyShootings.geojson"))
Reading layer `PhillyShootings' from data source 
  `https://pengdsci.github.io/STA553VIZ/w08/PhillyShootings.geojson' 
  using driver `GeoJSON'
replacing null geometries with empty geometries
Simple feature collection with 15555 features and 21 fields (with 29 geometries empty)
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -75.27362 ymin: 39.87799 xmax: -74.95936 ymax: 40.13117
Geodetic CRS:  WGS 84
phillyNeighbor  <- st_read("https://pengdsci.github.io/STA553VIZ/w08/Neighborhoods_Philadelphia.geojson")
Reading layer `Neighborhoods_Philadelphia' from data source 
  `https://pengdsci.github.io/STA553VIZ/w08/Neighborhoods_Philadelphia.geojson' 
  using driver `GeoJSON'
Simple feature collection with 158 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -75.28027 ymin: 39.867 xmax: -74.95576 ymax: 40.13799
Geodetic CRS:  WGS 84
philly  <- st_read("https://pengdsci.github.io/STA553VIZ/w08/PhillyNeighborhood-blocks.geojson")
Reading layer `PhillyNeighborhood-blocks' from data source 
  `https://pengdsci.github.io/STA553VIZ/w08/PhillyNeighborhood-blocks.geojson' 
  using driver `GeoJSON'
Simple feature collection with 17555 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -75.28027 ymin: 39.867 xmax: -74.95576 ymax: 40.13799
Geodetic CRS:  WGS 84

3.2 Preparing the Data

While the data is usable, we intend to display much more information with each point, so we will remove unnecessary or redundant data, while fixing a formatting quirk with the ‘date_’ variable (which will also be renamed to ‘date’) using the stringr package. Without this, the ‘date_’ variable ends each entry with ‘00:00:00+00’, which is clunky and likely unintended.

philly.data2 <- philly.data %>%
  select(-c(1, 2, 3, 5, 6, 17, 18))

names(philly.data2)[which(names(philly.data2) == "date_")] <- "date"

philly.data2$date <- str_replace(philly.data2$date, " 00:00:00\\+00", "")

# Convert 'date' variable to Date format for possible future use
philly.data2$date <- as.Date(philly.data2$date, format = "%Y-%m-%d")

3.3 Preparing Visualization for Aggregate Data

With our map, we would like to feature individual incidents and allow someone to click to see victim demographic information, but we would also like to have ‘big picture’ aggregate plots available. We will prepare three simple plots here using ggplot for illustrative purposes.

#Distribution of Age across Race

ar.plot<-ggplot(philly.data2, aes(x = age, fill = race)) +
  geom_density(alpha = 0.5) +
  labs(title = "Distribution of Age by Race",
       x = "Age",
       y = "Density") +
  scale_fill_discrete(name = "Race") +
  theme_minimal()

#Probably unnecessary step, but made an extra dataset while testing something out
philly.data3<-subset(philly.data2)
philly.data3$fatal <- factor(philly.data3$fatal, levels = c(0, 1), labels = c("Nonfatal", "Fatal"))

# Dodged bar chart of fatal and nonfatal incident counts by year
year.plot <- ggplot(philly.data3, aes(x = year, fill = fatal)) +
  geom_bar(position = position_dodge(width = 0.5), stat = "count") +
  scale_fill_manual(values = c("Nonfatal" = "blue", "Fatal" = "red")) +
  labs(title = "Number of Fatal and Nonfatal Incidents by Year",
       x = "Year",
       y = "Count") +
  theme_minimal() +
  scale_x_continuous(breaks = seq(min(philly.data3$year), max(philly.data3$year), by = 1))

#Frequency of indoor vs outdoor incidents

outside.plot <- ggplot(philly.data2, aes(x = year, fill = factor(outside, labels = c("Indoor", "Outdoor")))) +
  geom_bar(stat = "count") +
  labs(title = "Number of Indoor and Outdoor Incidents by Year",
       x = "Year",
       y = "Count",
       fill = "Location") +
  scale_fill_manual(values = c("Indoor" = "darkorchid", "Outdoor" = "darkgreen")) +
  scale_x_continuous(breaks = seq(min(philly.data2$year), max(philly.data2$year), by = 1)) +
  theme_minimal()

We have exported these plots as images and uploaded them separately; we now store their URLs for use in the map popups:

ar="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/arplot.png"

out="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/outsideplot2.png"

yr="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/yearplot2.png"

3.4 Mapping the Data

Our final visualization will contain a map of shooting victims, fully interactive with demographic information, as well as additional, big-picture data available.

#defining things
pal <- colorFactor(c("blue", "red"), domain = c(0, 1))


ageraceplot = st_as_sf(data.frame(x = -75.4077, y = 39.9168),
                coords = c("x", "y"),
                crs = 4326)
yearplot = st_as_sf(data.frame(x = -75.3877, y = 39.9168),
                coords = c("x", "y"),
                crs = 4326)
outdoorplot = st_as_sf(data.frame(x = -75.3677, y = 39.9168),
                coords = c("x", "y"),
                crs = 4326)
fig <- plot_ly(philly.data2, x = ~lng, y = ~lat, 
               type = 'scatter', 
               mode = 'markers', 
                              marker = list(symbol = 'circle', 
                           sizemode = 'diameter',
                               line = list(width = 2, color = '#FFFFFF')))
              
tag.map.title <- tags$style(HTML("
               .leaflet-control.map-title {
                   transform: translate(50%,50%);
                   position: fixed !important;
                   left: 50%;
                   text-align: center;
                   padding-left: 10px;
                   padding-right: 10px;
                   background: transparent;
                   font-weight: bold;
                   font-size: 18px;}
                 "))

rr <- tags$div(
   HTML('<img border="0" alt="ImageTitle" src="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/map%20title.png" width="200" height="45">')
 ) 

### 
leaflet() %>%
  setView(lng=-75.190429, lat=40.0039, zoom = 10.5) %>%
  addProviderTiles(providers$CartoDB.DarkMatter, group="Dark") %>%
  addProviderTiles(providers$CartoDB.DarkMatterNoLabels, group="DarkLabel") %>%  
  addProviderTiles(providers$Esri.NatGeoWorldMap, group="Esri") %>%
  addControl(rr, position = "topleft", className="map-title") %>%
  ## mini reference map
  addMiniMap() %>%
  ## neighborhood boundary
  addPolygons(data = phillyNeighbor,
              color = 'skyblue',
              weight = 1)  %>%
  
    
  addCircleMarkers(data = ageraceplot, 
                   color = "white",
                   weight = 2,
                   label = "Distribution of Age by Race",
                   stroke = FALSE, 
                   fillOpacity = 0.95,
                   group = "ageraceplot") %>%
  addPopupImages(ar, 
                  width = 500,
                  height = 320,
                  tooltip = FALSE,
                  group = "ageraceplot") %>%
    
 addCircleMarkers(data = yearplot, 
                   color = "skyblue",
                   weight = 2,
                   label = "Incident Rate by Year",
                   stroke = FALSE, 
                   fillOpacity = 0.95,
                   group = "yearplot") %>%
  addPopupImages(yr, 
                  width = 500,
                  height = 320,
                  tooltip = FALSE,
                  group = "yearplot") %>%

  
  addCircleMarkers(data = outdoorplot, 
                   color = "darkgreen",
                   weight = 2,
                   label = "Rate of Indoor vs Outdoor Incidents",
                   stroke = FALSE, 
                   fillOpacity = 0.95,
                   group = "outdoorplot") %>%
  addPopupImages(out, 
                   width = 500,
                  height = 320,
                   group = "outdoorplot" ) %>%


 
  
  ## plot information on the map
  addCircleMarkers(data = philly.data2,
                   color = ~pal(as.factor(fatal)),
                   stroke = FALSE, 
                   fillOpacity = 0.5,
                   popup = ~popupTable(philly.data2),
                   group = "Crime Data",   # must match overlayGroups in addLayersControl()
                   clusterOptions = markerClusterOptions(maxClusterRadius = 40)) %>%

  
          addLayersControl(baseGroups = c('Dark', 'DarkLabel', 'Esri'),
                   overlayGroups = c("Crime Data"),
                   options = layersControlOptions(collapsed = TRUE)) %>%
  ##
  browsable()

The above map is fully interactive and maintains our red/blue fatal/nonfatal color coding from the previous example, while providing much more information for each incident. It also includes aggregate data visualizations, which open when you click the three dots placed over Media on the map.

4 Map 4: Mapping Data with Tableau

4.1 Overview

Our dataset contains presidential election data for each county in the continental United States. Our goal is to create an interactive map in Tableau that displays county-level information for each presidential election from 2000-2020. First, we must import the data. We have been instructed to only consider the two major American political parties, Republican and Democrat, for this visualization.

##Voting Data
vote<-read.csv("https://pengdsci.github.io/datasets/countypresidential_election_2000-2020.csv")
fips<-read.csv("https://pengdsci.github.io/datasets/fips2geocode.csv")

colnames(fips)[1]<-"county_fips"

##subset data
vote.min <- subset(vote, party %in% c("REPUBLICAN", "DEMOCRAT"), 
                    select = c("state_po", "county_name", "candidate", "county_fips", "party", "candidatevotes", "year"))

4.2 Preparing the Data

Using the dplyr and tidyr packages, we are:

+ Counting all votes in a given county and year - being mindful of state, because multiple counties in different states share a name
+ Removing the losing candidate’s data for a given county and year - still mindful of state
+ Creating a variable for the winning candidate’s percentage of the vote
+ Merging our finished dataset with the data about each county

total_votes <- vote.min %>%
  group_by(state_po, county_name, year) %>%
  summarise(total_votes = sum(candidatevotes))


winning_party_data2 <- vote.min %>%
  group_by(state_po, county_name, year) %>%
  filter(candidatevotes == max(candidatevotes)) %>%
  ungroup() %>%
  left_join(total_votes, by = c("state_po", "county_name", "year")) %>%
  mutate(winning_percentage = candidatevotes / total_votes * 100)


vote3.merge <- merge(winning_party_data2, fips, by = "county_fips", all.x = TRUE)
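The steps above can be illustrated on a small example. The following is a minimal base R sketch, not the code used for the Tableau map; the `toy` data frame and its vote counts are hypothetical, and `ave()` stands in for the `group_by()` steps.

```r
# Toy county returns (hypothetical numbers); two different Adams counties
toy <- data.frame(
  state_po = c("PA", "PA", "OH", "OH"),
  county_name = c("ADAMS", "ADAMS", "ADAMS", "ADAMS"),
  year = 2020,
  party = c("DEMOCRAT", "REPUBLICAN", "DEMOCRAT", "REPUBLICAN"),
  candidatevotes = c(400, 600, 300, 100)
)

# Two-party total per state/county/year; grouping by state keeps the two counties apart
toy$total_votes <- ave(toy$candidatevotes, toy$state_po, toy$county_name, toy$year,
                       FUN = sum)

# Keep the row with the most votes in each group, then compute its share
group.max <- ave(toy$candidatevotes, toy$state_po, toy$county_name, toy$year,
                 FUN = max)
winners <- toy[toy$candidatevotes == group.max, ]
winners$winning_percentage <- winners$candidatevotes / winners$total_votes * 100

winners[, c("state_po", "party", "winning_percentage")]
```

If the grouping ignored `state_po`, the two Adams counties would be pooled into a single total, which is why the state is part of every grouping step.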

4.3 Mapping the Data with Tableau

Tableau is very intuitive and allows one to map data by clicking and dragging the desired variables onto certain functions (i.e., displaying the Winning Party variable as ‘color’). The downside is that there is no code to share!

The Tableau map is embedded below. It is interactive as well - mousing over a county should give you its election information. The year can be adjusted to see data from each presidential election from 2000 to 2020.

---
title: "Interactive Maps"
author: "Alex Dragonetti"
date: "3-25-2024"
output:
  html_document: 
    toc: yes
    toc_float: yes
    number_sections: yes
    toc_collapsed: yes
    code_folding: hide
    code_download: yes
    smooth_scroll: true
    theme: lumen
editor_options:
  chunk_output_type: inline
---
```{=html}

<style type="text/css">

/* Cascading Style Sheets (CSS) is a stylesheet language used to describe the presentation of a document written in HTML or XML. it is a simple mechanism for adding style (e.g., fonts, colors, spacing) to Web documents. */

h1.title {  /* Title - font specifications of the report title */
  font-size: 24px;
  color: DarkRed;
  text-align: center;
  font-family: "Gill Sans", sans-serif;
}
h4.author { /* Header 4 - font specifications for authors  */
  font-size: 20px;
  font-family: system-ui;
  color: DarkRed;
  text-align: center;
}
h4.date { /* Header 4 - font specifications for the date  */
  font-size: 18px;
  font-family: system-ui;
  color: DarkBlue;
  text-align: center;
}
h1 { /* Header 1 - font specifications for level 1 section title  */
    font-size: 22px;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: center;
}
h2 { /* Header 2 - font specifications for level 2 section title */
    font-size: 20px;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - font specifications of level 3 section title  */
    font-size: 18px;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - font specifications of level 4 section title  */
    font-size: 18px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

body { background-color:white; }

.highlightme { background-color:yellow; }

p { background-color:white; }

</style>
```

```{r setup, include=FALSE}
# Detect, install, and load packages if needed.
# Packages are listed once each, in the order they were originally loaded.
pkgs <- c("tidyverse", "knitr", "sf", "terra", "plotly", "dplyr", "png",
          "spData", "colourpicker", "gifski", "magick", "spDataLarge",
          "ggplot2", "gganimate", "tmap", "tigris", "mapview", "pander",
          "lattice", "sp", "leaflet", "leafpop", "leafem", "htmlwidgets",
          "leaflet.extras", "htmltools", "viridis", "ggmap", "webshot",
          "animation", "htmlTable", "magrittr")
for (pkg in pkgs) {
  if (!require(pkg, character.only = TRUE)) {
    if (pkg == "spDataLarge") {
      # spDataLarge is installed from the geocompr r-universe repository
      install.packages(pkg, repos = "https://geocompr.r-universe.dev")
    } else {
      install.packages(pkg)
    }
    library(pkg, character.only = TRUE)
  }
}
# Specifications of outputs of code in code chunks
knitr::opts_chunk$set(echo = TRUE,       
                      warning = FALSE,   
                      results = "markup",
                      message = FALSE,
                      comment = NA)
```



# Map 1: Randomly Selected Gas Stations


## Overview


Our first visualization will be an interactive map showing the locations of gas stations across the continental United States. Each gas station is a point on the map, and hovering over a point shows its state, county, address, and zip code.


## Randomly Selecting Data


Since our source data contains nearly 73,000 observations, we will randomly select 500 observations.

```{r}
gas<-read.csv("https://pengdsci.github.io/datasets/POC/POC.csv")

rand.df <- gas[sample(nrow(gas), size=500), ]
```
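
Because `sample()` draws a different subset on every run, the map changes each time the document is knit. The following is a minimal sketch of a repeatable draw; the `toy` data frame and the seed value 553 are illustrative assumptions, not part of the original analysis.

```r
# Toy stand-in for the gas station data (hypothetical coordinates)
toy <- data.frame(id = 1:73000,
                  xcoord = runif(73000, -125, -67),
                  ycoord = runif(73000, 25, 49))

# Fixing the seed makes the random draw repeatable across knits
set.seed(553)
rand.toy <- toy[sample(nrow(toy), size = 500), ]

# Drawing again with the same seed returns the same rows
set.seed(553)
rand.toy2 <- toy[sample(nrow(toy), size = 500), ]
identical(rand.toy$id, rand.toy2$id)
```

Calling `set.seed()` once before the original `sample()` line would have the same effect on the real data.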


## Mapping the Data


We will use the `plotly` package to map the data onto an interactive map.


```{r}
g <- list(      scope = 'usa',   # 'p' is not a recognized plotly geo scope
                projection = list(type = 'albers usa'),
                showland = TRUE,
                landcolor = toRGB("gray95"),
                subunitcolor = toRGB("gray85"),
                countrycolor = toRGB("gray85"),
                countrywidth = 0.5,
                subunitwidth = 0.5
)


fig <- plot_geo(rand.df, lat = ~ycoord, lon = ~xcoord) %>% 
  add_markers( text = ~paste(STATE, county, ADDRESS, ZIPnew,
                             
                             sep = "<br>"),
               #color = , 
               symbol = "circle", 
               #size = , 
               hoverinfo = "text")   %>% 
  layout( title = 'Randomly Selected US Gas Stations', 
          geo = g )

fig
```


The above is a simple yet effective way to visualize data on a map. Our next map will be a little more complex.
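
The hover text in the map above is built by pasting several columns together with an HTML line break between them. A minimal sketch of that step on its own, using a hypothetical two-row data frame whose column names mirror those in the mapping code:

```r
# Hypothetical rows; only the column names are taken from the mapping code
toy <- data.frame(STATE = c("PA", "NJ"),
                  county = c("Chester", "Camden"),
                  ADDRESS = c("1 Main St", "2 Broad St"),
                  ZIPnew = c("19380", "08102"))

# sep = "<br>" places each field on its own line in the hover box
hover <- paste(toy$STATE, toy$county, toy$ADDRESS, toy$ZIPnew, sep = "<br>")
hover[1]  # "PA<br>Chester<br>1 Main St<br>19380"
```

Setting `hoverinfo = "text"` in `add_markers()` then tells plotly to show only this string rather than the default coordinates.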



# Map 2: Crime in Philadelphia, 2023


## Overview


For this visualization, we will begin with a dataset of crimes committed in the city of Philadelphia between 2015 and early March of 2024.


## Preparing the Data

We will need to subset the 2023 data before we can overlay it on a map. We will use the `stringr` library to extract the year from each date.

```{r}
crime<-read.csv("https://pengdsci.github.io/STA553VIZ/w08/PhillyCrimeSince2015.csv")

df<-crime

df$year <- str_extract(df$date, "\\d{4}")
df$year <- as.numeric(df$year)

crime23<-subset(df, year==2023)

write.csv(crime23, "C:\\Users\\Alex\\Documents\\R\\Grad\\553\\datasets\\wk7.csv")
```
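
The year extraction above takes the first run of four digits in each date string. The sketch below shows the same idea in base R next to a date-parsing alternative; the date strings and their `YYYY-MM-DD` layout are assumptions for illustration, since the format of the real `date` column is not shown here.

```r
# Hypothetical date strings in an assumed YYYY-MM-DD layout
dates <- c("2023-01-15", "2015-07-04", "2024-03-02")

# Base R counterpart of str_extract(dates, "\\d{4}"): first run of four digits
year.regex <- as.numeric(regmatches(dates, regexpr("[0-9]{4}", dates)))

# Parsing to Date first guards against four-digit runs that are not years
year.date <- as.numeric(format(as.Date(dates, format = "%Y-%m-%d"), "%Y"))

year.regex  # 2023 2015 2024
```

Both approaches agree on well-formed input; the parsing route returns `NA` for malformed dates instead of silently picking up other digits.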

A copy of the 2023 data can be found at https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw7/wk7.csv

```{r}
#remove observations with missing values - at least one has a missing value for coordinates
crime23.nona<-na.omit(crime23)

```



## Mapping the Data


Finally, we will map the incident data using the `leaflet` package.

```{r}
color2 <- rep("red", nrow(crime23.nona))  # one color per incident; length() would count columns
color2[which(crime23.nona$fatal=="Nonfatal")] <- "blue"
color2[which(crime23.nona$fatal=="Fatal")] <- "red"



label.msg <- paste("Street:", crime23.nona$street_name,    
                   "<br>Block Number:",crime23.nona$block_number,
                   "<br>Neighborhood:", crime23.nona$neighborhood,
                   "<br>Incident Type:", crime23.nona$fatal)


leaflet(crime23.nona) %>%
  addTiles() %>% 
  setView(lng=mean(crime23.nona$lng), lat=mean(crime23.nona$lat), zoom = 11) %>%
  addProviderTiles(providers$Esri.WorldGrayCanvas) %>%
  addCircleMarkers(
            ~lng, 
            ~lat,
            color = color2,
            stroke = FALSE, 
            fillOpacity = 0.5,
            popup= ~label.msg)  %>%
  addLegend(position = "bottomright", 
            colors = c("red", "blue"),
            labels= c("Fatal", "Nonfatal"),
            title= "Type of Incident",
            opacity = 0.4)
```

Our resulting map is fully interactive: clicking a dot shows the details of the incident.

Please note that some spots on the map appear purple. This is due to overlapping semi-transparent red and blue dots, indicating both fatal and nonfatal incidents at the same address. For example, 1000 E Bristol Street in Juniata saw an incident with two victims, one of whom died. For clarification on any confusing point, please refer to the dataset linked at the end of `Preparing the Data`.
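
The marker colors need one entry per incident. On a data frame, `length()` returns the number of columns while `nrow()` returns the number of rows, so the color vector should be sized with `nrow()`. A minimal sketch on hypothetical data, also showing a one-step vectorized assignment with `ifelse()`:

```r
# Toy incident data (hypothetical values; only the `fatal` labels mirror the real data)
toy <- data.frame(fatal = c("Fatal", "Nonfatal", "Nonfatal", "Fatal", "Nonfatal"),
                  lng = c(-75.16, -75.17, -75.15, -75.18, -75.14),
                  lat = c(39.95, 39.96, 39.97, 39.98, 39.99))

length(toy)  # 3: the number of columns
nrow(toy)    # 5: the number of incidents

# One color per incident, assigned in a single vectorized step
color.toy <- ifelse(toy$fatal == "Fatal", "red", "blue")
color.toy    # "red" "blue" "blue" "red" "blue"
```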



# Map 3: Philadelphia Shootings, 2015-2024


## Overview

Our next visualization focuses specifically on the demographics of shooting victims. The data is from OpenDataPhilly and can be accessed here: https://opendataphilly.org/datasets/shooting-victims (see the code below for the raw, direct link). We will again use `leaflet`.


```{r}

philly.data<-read.csv("https://phl.carto.com/api/v2/sql?q=SELECT+*,+ST_Y(the_geom)+AS+lat,+ST_X(the_geom)+AS+lng+FROM+shootings&filename=shootings&format=csv&skipfields=cartodb_id")
phillyNeighborShooting  <- na.omit(st_read("https://pengdsci.github.io/STA553VIZ/w08/PhillyShootings.geojson"))
phillyNeighbor  <- st_read("https://pengdsci.github.io/STA553VIZ/w08/Neighborhoods_Philadelphia.geojson")
philly  <- st_read("https://pengdsci.github.io/STA553VIZ/w08/PhillyNeighborhood-blocks.geojson")
```


## Preparing the Data

While the data is usable, we intend to display much more information with each point, so we will remove unnecessary or redundant data, while fixing a formatting quirk with the 'date_' variable (which will also be renamed to 'date') using the `stringr` package. Without this, the 'date_' variable ends each entry with '00:00:00+00', which is clunky and likely unintended.

```{r}
philly.data2 <- philly.data %>%
  select(-c(1, 2, 3, 5, 6, 17, 18))

names(philly.data2)[which(names(philly.data2) == "date_")] <- "date"

philly.data2$date <- str_replace(philly.data2$date, " 00:00:00\\+00", "")

# Convert 'date' variable to Date format for possible future use
philly.data2$date <- as.Date(philly.data2$date, format = "%Y-%m-%d")
```
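
The date cleanup can be checked on a couple of values before it is applied to the full column. This is a minimal base R sketch with hypothetical inputs in the format described above; it uses `sub()` with `fixed = TRUE` in place of `str_replace()`, which avoids having to escape the `+`.

```r
# Hypothetical raw values in the format described above
raw <- c("2023-05-01 00:00:00+00", "2015-12-31 00:00:00+00")

# fixed = TRUE treats the pattern literally, so the "+" needs no escaping
clean <- sub(" 00:00:00+00", "", raw, fixed = TRUE)
clean.date <- as.Date(clean, format = "%Y-%m-%d")

clean.date                # "2023-05-01" "2015-12-31"
format(clean.date, "%Y")  # "2023" "2015"
```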



## Preparing Visualization for Aggregate Data


With our map, we would like to feature individual incidents and allow someone to click to see victim demographic information, but we would also like to have 'big picture' aggregate plots available. We will prepare three simple plots here using `ggplot` for illustrative purposes.


```{r}

#Distribution of Age across Race

ar.plot<-ggplot(philly.data2, aes(x = age, fill = race)) +
  geom_density(alpha = 0.5) +
  labs(title = "Distribution of Age by Race",
       x = "Age",
       y = "Density") +
  scale_fill_discrete(name = "Race") +
  theme_minimal()

#Probably unnecessary step, but made an extra dataset while testing something out
philly.data3<-subset(philly.data2)
philly.data3$fatal <- factor(philly.data3$fatal, levels = c(0, 1), labels = c("Nonfatal", "Fatal"))

# Dodged bar chart of fatal and nonfatal incident counts by year
year.plot <- ggplot(philly.data3, aes(x = year, fill = fatal)) +
  geom_bar(position = position_dodge(width = 0.5), stat = "count") +
  scale_fill_manual(values = c("Nonfatal" = "blue", "Fatal" = "red")) +
  labs(title = "Number of Fatal and Nonfatal Incidents by Year",
       x = "Year",
       y = "Count") +
  theme_minimal() +
  scale_x_continuous(breaks = seq(min(philly.data3$year), max(philly.data3$year), by = 1))

#Frequency of indoor vs outdoor incidents

outside.plot <- ggplot(philly.data2, aes(x = year, fill = factor(outside, labels = c("Indoor", "Outdoor")))) +
  geom_bar(stat = "count") +
  labs(title = "Number of Indoor and Outdoor Incidents by Year",
       x = "Year",
       y = "Count",
       fill = "Location") +
  scale_fill_manual(values = c("Indoor" = "darkorchid", "Outdoor" = "darkgreen")) +
  scale_x_continuous(breaks = seq(min(philly.data2$year), max(philly.data2$year), by = 1)) +
  theme_minimal()

```

We have exported these plots as images and uploaded them separately; we now store their URLs for use in the map popups:

```{r}

ar="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/arplot.png"

out="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/outsideplot2.png"

yr="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/yearplot2.png"

```
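
The export step itself is not shown in this report. One possible way to write a ggplot object to a PNG file is `ggplot2::ggsave()`; the sketch below is an assumption about how such an export could look, with a toy plot, a temporary file name, and dimensions chosen to give a 500 x 320 pixel image.

```r
library(ggplot2)

# Toy data standing in for the shooting records (hypothetical values)
toy <- data.frame(year = c(2021, 2021, 2022, 2022, 2023),
                  fatal = factor(c("Fatal", "Nonfatal", "Nonfatal", "Nonfatal", "Fatal")))

toy.plot <- ggplot(toy, aes(x = year, fill = fatal)) +
  geom_bar(position = "dodge") +
  theme_minimal()

# 5 in x 3.2 in at 100 dpi gives a 500 x 320 pixel PNG; the file name is illustrative
out.file <- file.path(tempdir(), "yearplot-demo.png")
ggsave(out.file, plot = toy.plot, width = 5, height = 3.2, dpi = 100)
file.exists(out.file)
```

The resulting file would then need to be uploaded somewhere publicly reachable so that its URL can be used as a popup image.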


## Mapping the Data


Our final visualization will contain a map of shooting victims, fully interactive with demographic information, as well as additional, big-picture data available.


```{r}
#defining things
pal <- colorFactor(c("blue", "red"), domain = c(0, 1))


ageraceplot = st_as_sf(data.frame(x = -75.4077, y = 39.9168),
                coords = c("x", "y"),
                crs = 4326)
yearplot = st_as_sf(data.frame(x = -75.3877, y = 39.9168),
                coords = c("x", "y"),
                crs = 4326)
outdoorplot = st_as_sf(data.frame(x = -75.3677, y = 39.9168),
                coords = c("x", "y"),
                crs = 4326)
```

```{r}
fig <- plot_ly(philly.data2, x = ~lng, y = ~lat, 
               type = 'scatter', 
               mode = 'markers', 
                              marker = list(symbol = 'circle', 
                           sizemode = 'diameter',
                               line = list(width = 2, color = '#FFFFFF')))
              
tag.map.title <- tags$style(HTML("
               .leaflet-control.map-title {
                   transform: translate(50%,50%);
                   position: fixed !important;
                   left: 50%;
                   text-align: center;
                   padding-left: 10px;
                   padding-right: 10px;
                   background: transparent;
                   font-weight: bold;
                   font-size: 18px;}
                 "))

rr <- tags$div(
   HTML('<img border="0" alt="ImageTitle" src="https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw8/map%20title.png" width="200" height="45">')
 ) 

### 
leaflet() %>%
  setView(lng=-75.190429, lat=40.0039, zoom = 10.5) %>%
  addProviderTiles(providers$CartoDB.DarkMatter, group="Dark") %>%
  addProviderTiles(providers$CartoDB.DarkMatterNoLabels, group="DarkLabel") %>%  
  addProviderTiles(providers$Esri.NatGeoWorldMap, group="Esri") %>%
  addControl(rr, position = "topleft", className="map-title") %>%
  ## mini reference map
  addMiniMap() %>%
  ## neighborhood boundary
  addPolygons(data = phillyNeighbor,
              color = 'skyblue',
              weight = 1)  %>%
  
    
  addCircleMarkers(data = ageraceplot, 
                   color = "white",
                   weight = 2,
                   label = "Distribution of Age by Race",
                   stroke = FALSE, 
                   fillOpacity = 0.95,
                   group = "ageraceplot") %>%
  addPopupImages(ar, 
                  width = 500,
                  height = 320,
                  tooltip = FALSE,
                  group = "ageraceplot") %>%
    
 addCircleMarkers(data = yearplot, 
                   color = "skyblue",
                   weight = 2,
                   label = "Incident Rate by Year",
                   stroke = FALSE, 
                   fillOpacity = 0.95,
                   group = "yearplot") %>%
  addPopupImages(yr, 
                  width = 500,
                  height = 320,
                  tooltip = FALSE,
                  group = "yearplot") %>%

  
  addCircleMarkers(data = outdoorplot, 
                   color = "darkgreen",
                   weight = 2,
                   label = "Rate of Indoor vs Outdoor Incidents",
                   stroke = FALSE, 
                   fillOpacity = 0.95,
                   group = "outdoorplot") %>%
  addPopupImages(out, 
                  width = 500,
                  height = 320,
                  group = "outdoorplot") %>%


 
  
  ## plot information on the map
  addCircleMarkers(data = philly.data2,
                   color = ~pal(as.factor(fatal)),
                   stroke = FALSE, 
                   fillOpacity = 0.5,
                   popup = ~popupTable(philly.data2),
                   ## group must match overlayGroups below so the toggle controls this layer
                   group = "Crime Data",
                   clusterOptions = markerClusterOptions(maxClusterRadius = 40)) %>%

  ## base map switcher plus an on/off toggle for the incident markers
  addLayersControl(baseGroups = c('Dark', 'DarkLabel', 'Esri'),
                   overlayGroups = c("Crime Data"),
                   options = layersControlOptions(collapsed = TRUE)) %>%
  ##
  browsable()

```



The map above is fully interactive and keeps the red/blue fatal/nonfatal color coding from the previous example, while letting us attach much more information to each incident. It also includes aggregate data visualizations, which open when you click the dots covering Media on the map.
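
A note on how the layers control is wired: in `leaflet`, an entry in `overlayGroups` only toggles layers whose `group` argument carries the same name. The sketch below illustrates that pairing with two made-up points rather than `philly.data2`; the coordinates and the object names `pts` and `m` are illustrative assumptions, not part of the original analysis.

```r
library(leaflet)

# Hypothetical points near Philadelphia; stand-ins for the real incident data
pts <- data.frame(lng = c(-75.16, -75.20), lat = c(39.95, 40.00))

m <- leaflet() %>%
  setView(lng = -75.19, lat = 40.00, zoom = 10) %>%
  addProviderTiles(providers$CartoDB.DarkMatter, group = "Dark") %>%
  addProviderTiles(providers$Esri.NatGeoWorldMap, group = "Esri") %>%
  # the group name here must match the name passed to overlayGroups
  addCircleMarkers(data = pts, lng = ~lng, lat = ~lat,
                   stroke = FALSE, fillOpacity = 0.5,
                   group = "Crime Data") %>%
  addLayersControl(baseGroups = c("Dark", "Esri"),
                   overlayGroups = c("Crime Data"),
                   options = layersControlOptions(collapsed = TRUE))

m
```

Unticking "Crime Data" in the control should hide both markers, while the radio buttons switch between the two base maps.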



# Map 4: Mapping Data with Tableau


## Overview

Our dataset contains presidential election data for each county in the continental United States. Our goal is to create an interactive map in Tableau that displays county-level information for each presidential election from 2000 to 2020. We have been instructed to consider only the two major American political parties, Republican and Democratic, for this visualization. First, we import the data and subset it accordingly.


```{r}

##Voting Data
vote<-read.csv("https://pengdsci.github.io/datasets/countypresidential_election_2000-2020.csv")
fips<-read.csv("https://pengdsci.github.io/datasets/fips2geocode.csv")

colnames(fips)[1]<-"county_fips"

##subset data
vote.min <- subset(vote, party %in% c("REPUBLICAN", "DEMOCRAT"), 
                    select = c("state_po", "county_name", "candidate", "county_fips", "party", "candidatevotes", "year"))
```


## Preparing the Data

Using the `dplyr` and `tidyr` packages, we will:

+ Count all votes in a given county and year, grouping by state as well because counties in different states can share a name
+ Remove the losing candidate's data for a given county and year, again grouping by state
+ Create a variable for the winning candidate's percentage of the vote
+ Merge the finished dataset with the geographic data for each county


```{r}
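## total two-party votes per county and year (state included to separate same-named counties)
total_votes <- vote.min %>%
  group_by(state_po, county_name, year) %>%
  summarise(total_votes = sum(candidatevotes), .groups = "drop")


## keep only the row with the most votes in each county and year,
## then compute the winner's share of the two-party total
winning_party_data2 <- vote.min %>%
  group_by(state_po, county_name, year) %>%
  filter(candidatevotes == max(candidatevotes)) %>%
  ungroup() %>%
  left_join(total_votes, by = c("state_po", "county_name", "year")) %>%
  mutate(winning_percentage = candidatevotes / total_votes * 100)


## attach the county geocode information, keeping every winner row
vote3.merge <- merge(winning_party_data2, fips, by = "county_fips", all.x = TRUE)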

```
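
To make the logic of the steps above easier to check, here is a minimal sketch that runs the same group, filter, and join pattern on a tiny made-up data frame. The object names `toy`, `toy_totals`, and `toy_winners` and all vote counts are hypothetical and chosen only so the result can be verified by hand.

```r
library(dplyr)

# Hypothetical toy data: two counties, one election year, two parties each
toy <- data.frame(
  state_po       = c("PA", "PA", "PA", "PA"),
  county_name    = c("ADAMS", "ADAMS", "BERKS", "BERKS"),
  year           = 2020,
  party          = c("DEMOCRAT", "REPUBLICAN", "DEMOCRAT", "REPUBLICAN"),
  candidatevotes = c(400, 600, 700, 300)
)

# Step 1: total votes per county and year
toy_totals <- toy %>%
  group_by(state_po, county_name, year) %>%
  summarise(total_votes = sum(candidatevotes), .groups = "drop")

# Steps 2 and 3: keep the winner's row, then compute the winning percentage
toy_winners <- toy %>%
  group_by(state_po, county_name, year) %>%
  filter(candidatevotes == max(candidatevotes)) %>%
  ungroup() %>%
  left_join(toy_totals, by = c("state_po", "county_name", "year")) %>%
  mutate(winning_percentage = candidatevotes / total_votes * 100)

toy_winners
```

Each county keeps a single row: the Republican row with 600 of 1,000 votes in the first county and the Democratic row with 700 of 1,000 votes in the second. Note that an exact tie would leave two rows for that county, since `filter()` keeps every row equal to the maximum.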


## Mapping the Data with Tableau

Tableau is very intuitive: data are mapped by clicking and dragging the desired variables onto specific functions (e.g., displaying the Winning Party variable as 'color'). The downside is that there is no code to share!
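
The one step that can still be shown in R is handing the prepared data to Tableau. Tableau can read flat text files such as CSV, so a plausible hand-off is a plain `write.csv()` call. The sketch below is an assumption about that step rather than a record of what was actually done: the file name, the temporary directory, and the stand-in data frame `export_df` (used here in place of `vote3.merge`) are all hypothetical.

```r
# Hypothetical stand-in for vote3.merge; in practice the merged data frame would be exported
export_df <- data.frame(
  county_fips        = c(42001, 42011),
  state_po           = c("PA", "PA"),
  county_name        = c("ADAMS", "BERKS"),
  year               = c(2020, 2020),
  party              = c("REPUBLICAN", "DEMOCRAT"),
  winning_percentage = c(60, 70)
)

# Hypothetical output path; a temporary directory keeps the sketch side-effect free
out_file <- file.path(tempdir(), "county_winners.csv")

# row.names = FALSE avoids an extra unnamed index column in the exported file
write.csv(export_df, out_file, row.names = FALSE)

# Read the file back to confirm that the columns survive the round trip
check <- read.csv(out_file)
names(check)
```

The resulting file can then be opened in Tableau as a text file data source.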

The Tableau map is embedded below. It is also interactive: hovering over a county displays its election information, and the year can be adjusted to show any presidential election from 2000 to 2020.



<table border = 0 bordercolor="darkgreen" bgcolor='#f6f6f6'  width=100%  align = center>
<tr>
<td>



<div class='tableauPlaceholder' id='viz1712872926954' style='position: relative'>
<noscript>
<a href='#'>
<img alt='County-Level Presidential Election Results, 2000-2020 ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;vo&#47;votingmap2000-2020&#47;Sheet1&#47;1_rss.png' style='border: none' />
</a>
</noscript>
<object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> 
<param name='embed_code_version' value='3' /> 
<param name='site_root' value='' /><param name='name' value='votingmap2000-2020&#47;Sheet1' /><param name='tabs' value='no' />
<param name='toolbar' value='yes' />
<param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;vo&#47;votingmap2000-2020&#47;Sheet1&#47;1.png' /> 
<param name='animate_transition' value='yes' />
<param name='display_static_image' value='yes' />
<param name='display_spinner' value='yes' />
<param name='display_overlay' value='yes' />
<param name='display_count' value='yes' />
<param name='language' value='en-US' />
<param name='filter' value='publish=yes' />
</object>
</div>               
<script type='text/javascript'>  
var divElement = document.getElementById('viz1712872926954');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

</td>
</tr>
</table>