Aim

Analysis of Active Tobacco Retail Dealer Licenses in Queens.

Data

Source: NYC Open Data

Import Libraries:

library(tidyverse)
library(tidycensus)
library(RSocrata)
library(ggcharts) 
library(ggblanket) 
library(knitr)
library(DT)
library(sf)
library(scales)
library(viridis)

Import the data directly into R from the dataset's URL:

OpenData <- read.socrata("https://data.cityofnewyork.us/resource/adw8-wvxb.csv")

Select the columns needed, keep only the rows for the borough of Queens, and rename the columns. I keep the ZIP code so that I can compare the number of active tobacco retailers across Queens.

# Columns to keep from the raw dataset
columns <- c("Business.Name", "Address.Borough", "Address.ZIP",
             "Latitude", "Longitude", "Location")
OpenDataTabacco <- OpenData %>%
  select(all_of(columns)) %>%
  filter(Address.Borough == "Queens") %>%  # keep Queens retailers only
  rename(NAME = Business.Name,
         BOROUGH = Address.Borough,
         ZIPcode = Address.ZIP)

The result is a data frame with the name, borough, ZIP code, latitude, longitude and location of each retailer. I kept the last three columns because I tried to make a map, but I could not get a spatial join to work.

OpenDataTabacco %>% 
  datatable()
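For reference, here is a minimal sketch of what a spatial join with sf could look like. It is not the approach used in this analysis. The coordinates are made up, and the two rectangles in `areas` are hypothetical stand-ins for a real ZIP code boundary layer, which would have to be loaded separately (for example with `st_read()`). I also assume the latitude and longitude are in WGS84 (EPSG:4326).

```r
library(sf)

# Toy stand-in for the retailer table (coordinates are made up for illustration)
retailers <- data.frame(
  NAME      = c("Shop A", "Shop B", "Shop C"),
  Longitude = c(-73.90, -73.80, -73.70),
  Latitude  = c(40.75, 40.70, 40.60)
)

# Turn the coordinate columns into point geometries (x = longitude, y = latitude)
retailer_pts <- st_as_sf(retailers,
                         coords = c("Longitude", "Latitude"),
                         crs = 4326)  # assumption: coordinates are WGS84

# Helper that builds a closed rectangle polygon from its corner coordinates
make_box <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

# Hypothetical polygon layer standing in for ZIP code boundaries
areas <- st_sf(
  area_id  = c("west", "east"),
  geometry = st_sfc(make_box(-74.00, 40.65, -73.85, 40.80),
                    make_box(-73.85, 40.65, -73.75, 40.80),
                    crs = 4326)
)

# Spatial join: attach to each point the attributes of the polygon containing it.
# Points that fall in no polygon are kept, with NA in the joined columns.
joined <- st_join(retailer_pts, areas, join = st_within)
```

With the real data, rows with missing coordinates would need to be filtered out first, because `st_as_sf()` stops with an error on NA coordinates by default.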

To explore the dataset a bit more, I made a scatter plot. It is not as useful as a map, but it at least shows the approximate locations of the active tobacco retailers across Queens.

# Longitude goes on x and latitude on y, so the points are oriented like a map
ggplot(OpenDataTabacco, aes(x = Longitude, y = Latitude)) +
  geom_point(size = .5) +
  scale_x_continuous(labels = comma) +
  scale_y_continuous(labels = comma) +
  labs(x = "Longitude", y = "Latitude",
       title = "Location of Active Tobacco Retailers in Queens",
       caption = "Source: NYC Open Data")
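A scatter plot of coordinates looks closer to a map when longitude is on the x axis, latitude is on the y axis, and the aspect ratio is adjusted for the latitude. The sketch below shows one way to do that with `coord_quickmap()` from ggplot2. The data frame `toy_points` and its coordinates are made up for illustration.

```r
library(ggplot2)

# Made-up coordinates standing in for the retailer locations
toy_points <- data.frame(
  Longitude = c(-73.90, -73.80, -73.75),
  Latitude  = c(40.75, 40.70, 40.60)
)

# coord_quickmap() sets an approximate aspect ratio for small areas,
# so distances east-west and north-south look roughly comparable
p <- ggplot(toy_points, aes(x = Longitude, y = Latitude)) +
  geom_point(size = .5) +
  coord_quickmap() +
  labs(x = "Longitude", y = "Latitude",
       caption = "Illustrative data only")
```

Printing `p` draws the plot. Swapping `toy_points` for `OpenDataTabacco` should work the same way, assuming its Latitude and Longitude columns are numeric.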

Statistics

Count the number of active tobacco retailers in each ZIP code:

OpenDataTabacco_Stat <- OpenDataTabacco %>% 
  group_by(ZIPcode) %>% 
  summarise(ActiveTabacco = n()) %>% 
  mutate(ZIP_code = as.character(ZIPcode))
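To make the counting step easier to follow, here is the same idea on a tiny made-up table. The sketch uses `count()`, which is dplyr's shorthand for `group_by()` followed by `summarise(n = n())`. The ZIP codes in `toy` are only illustrative.

```r
library(dplyr)

# Made-up table with one row per retailer
toy <- tibble(ZIPcode = c(11101, 11101, 11354, 11354, 11354, 11432))

# One row per ZIP code with the number of retailers, largest count first
toy_stat <- toy %>%
  count(ZIPcode, name = "ActiveTabacco", sort = TRUE) %>%
  mutate(ZIP_code = as.character(ZIPcode))

# Simple summary of the distribution of counts across ZIP codes
summary(toy_stat$ActiveTabacco)
```

Running `summary()` on the real `OpenDataTabacco_Stat$ActiveTabacco` column would give the minimum, median, mean and maximum number of retailers per ZIP code.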

I plotted a bar chart that compares the number of active tobacco retailers across Queens by ZIP code.

# Bars are ordered from the fewest to the most retailers
ggplot(data = OpenDataTabacco_Stat,
       aes(x = reorder(ZIP_code, ActiveTabacco),
           y = ActiveTabacco)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.1)) +
  labs(x = "ZIP code", y = "Active tobacco retailers",
       title = "Active Tobacco Businesses",
       caption = "Source: NYC Open Data")