Data 608 Final Project

NYC 311 complaint analysis

Write-up on visualization

Data Source

Parameters of dataset

All data points in the NYC 311 dataset originate from New York City.

Because the data source is the Socrata API, each call pulls the latest NYC 311 records. The dataset is updated daily, so every API call returns the most recent data.

For the Twitter dataset, the code pulls the 1,100 latest tweets mentioning NYC 311.

Some of the main data points for the NYC 311 dataset are:

“agency”
“agency_name”
“complaint_type”
“descriptor”
“incident_zip”
“incident_address”
“street_name”
“cross_street_1”
“cross_street_2”
“intersection_street_1”
“intersection_street_2”
“status”
“community_board”
“borough”
“x_coordinate_state_plane”
“y_coordinate_state_plane”
“open_data_channel_type”
“park_facility_name”
“park_borough”
“latitude”
“longitude”
“location”
“resolution_description”
“resolution_action_updated_date”

About NYC Open Data set

Beginning in 2010, NYC launched an initiative to expose government data via NYC Open Data in an effort to “improve the accessibility, transparency, and accountability of City government, this catalog offers access to a repository of government-produced, machine-readable data sets.”

What the dataset shows and why it is important

NYC 311’s mission is to provide the public with quick, easy access to all New York City government services and information while offering the best customer service. It helps agencies improve service delivery by allowing them to focus on their core missions and manage their workload efficiently.

NYC 311 data is updated on a daily basis and is provided by DoITT, where I am currently pursuing my internship. Therefore, I wanted to apply the visualization concepts studied in Data 608 to analyze this data set.

Aim

To analyze and build visualizations for issues around New York City (including Manhattan, Queens, Brooklyn, and the Bronx) by frequency of reported incidents in each area.
NYC 311 Service Requests & Resolution Analysis through Text Mining
To explore and analyze NYC 311 service requests (historical data sets) to understand diverse patterns, recurring themes and trends, as well as community satisfaction levels derived from resolution categories and timing.
To run sentiment analysis with the Syuzhet package on tweets mentioning “nyc311” to determine their emotions, especially during the period of the virus outbreak, and to visualize the results.

Import libraries

Load all the necessary packages

library(plyr)
library(tidyverse)
library(knitr)
library(jsonlite)

Load the data using socrata API

Analyze the dataset with socrata API

api_endpoint <- "https://data.cityofnewyork.us/resource/erm2-nwe9.json"

json_dataset311 <- fromJSON(api_endpoint)
class(json_dataset311)

## [1] "data.frame"

Display column Names and no. of rows

##--------- Column names

colnames(json_dataset311)

##  [1] "unique_key"                     "created_date"                  
##  [3] "agency"                         "agency_name"                   
##  [5] "complaint_type"                 "descriptor"                    
##  [7] "location_type"                  "incident_zip"                  
##  [9] "incident_address"               "street_name"                   
## [11] "cross_street_1"                 "cross_street_2"                
## [13] "intersection_street_1"          "intersection_street_2"         
## [15] "city"                           "landmark"                      
## [17] "status"                         "community_board"               
## [19] "bbl"                            "borough"                       
## [21] "x_coordinate_state_plane"       "y_coordinate_state_plane"      
## [23] "open_data_channel_type"         "park_facility_name"            
## [25] "park_borough"                   "latitude"                      
## [27] "longitude"                      "location"                      
## [29] ":@computed_region_efsh_h5xi"    ":@computed_region_f5dn_yrer"   
## [31] ":@computed_region_yeji_bk3q"    ":@computed_region_92fq_4b7q"   
## [33] ":@computed_region_sbqj_enih"    "closed_date"                   
## [35] "resolution_description"         "resolution_action_updated_date"
## [37] "address_type"                   "facility_type"                 
## [39] "taxi_pick_up_location"

##--------- No. of rows

nrow(json_dataset311)

## [1] 1000
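
The 1000 rows above match the default page size of the Socrata (SODA) API, so a single call returns only one page of records. Here is a minimal sketch of asking for a larger, explicitly ordered page with the $limit and $order query parameters. The limit of 5000 and the ordering by created_date are assumptions for illustration, not values used in this project.

```r
# Same endpoint as in the analysis above.
api_endpoint <- "https://data.cityofnewyork.us/resource/erm2-nwe9.json"

# Assumed values for illustration: 5000 rows, newest requests first.
row_limit <- 5000
order_by  <- "created_date DESC"

# Build the query string; URLencode() escapes the space in the order clause.
query_url <- paste0(
  api_endpoint,
  "?$limit=", row_limit,
  "&$order=", URLencode(order_by)
)
query_url

# Fetching needs network access, so it is left commented out here:
# json_dataset311 <- jsonlite::fromJSON(query_url)
```

Since the dataset changes daily, pinning the order this way would make it clearer which slice of the data a given run analyzed.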

First 5 rows of dataset

head(json_dataset311,5)

Data Exploration and Visualization

1) Top 50 most common complaint types

dataset311 <- json_dataset311
ggplot(subset(dataset311, complaint_type %in% count(dataset311, complaint_type, sort = TRUE)[1:50,]$complaint_type), aes(complaint_type)) + 
  geom_bar(color="black", fill="purple") +
  labs(x="Complaint Type", y="Service Requests") +
  coord_flip() + theme_bw()

As we see above, the highest number of service requests is for the Noise-Residential complaint type, followed by Noise-Street/Sidewalk.
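
The top-N selection used for the plot can be sketched on toy data with base R. The complaint values below are made up for illustration. Note that head() returns at most the number of types that exist, whereas indexing rows [1:50,] pads the result with NA rows when a sample has fewer than 50 distinct complaint types.

```r
# Toy complaint types (hypothetical values, not taken from the dataset).
toy_types <- c(
  "Noise - Residential", "Illegal Parking", "Noise - Residential",
  "Blocked Driveway", "Illegal Parking", "Noise - Residential"
)

# Count each type and sort from most to least frequent.
type_counts <- sort(table(toy_types), decreasing = TRUE)
type_counts

# Keep the names of the two most frequent types.
top_types <- names(head(type_counts, 2))
top_types

# Asking for more types than exist simply returns all of them.
length(head(type_counts, 50))
```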

2) Most common complaint types by borough and status

Count of complaints by borough and status:

dataset_borough <- subset(dataset311, complaint_type %in% count(dataset311, complaint_type, sort=T)[1:50,]$complaint_type)

dataset_borough <- dataset_borough %>% select(complaint_type, borough, status) %>% filter(!str_detect(borough, "Unspecified"))

ggplot(dataset_borough, aes(x=status, y = complaint_type)) +
  geom_point() +
  geom_count(colour="darkgreen") + 
  facet_wrap(~borough)

As the graph above shows, the Bronx, Brooklyn and Manhattan have over 100 complaints in closed status, which suggests good progress by NYC 311 in resolving complaints.

Service Request Resolutions Tidying and Analysis - Using Tidytext

In this section, we analyze the most frequent words used in service request resolutions.

Let’s use Tidytext for this purpose.

Most frequent words used in NYC311 Service Requests

The following step also filters out rows whose resolution description is “NA”, so they are not included in the tokenized_resolutions dataset.

library(tidytext)

data(stop_words)
tokenized_resolutions <- dataset311 %>%
  select(complaint_type, descriptor, street_name, city, resolution_description, borough, open_data_channel_type) %>%
  filter(!str_detect(borough, "Unspecified")) %>% 
  filter(!str_detect(resolution_description,"NA")) %>% 
  unnest_tokens(word, resolution_description) %>%
  anti_join(stop_words) %>%
  group_by(borough, word) %>%
  tally()

## Joining, by = "word"
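
One caveat about the “NA” filter above: str_detect() with the pattern "NA" looks for the letters N and A inside the text, so it also drops any description that merely contains that substring. Missing values are removed only because str_detect() returns NA for them and filter() discards rows where the condition is NA. A minimal sketch on made-up resolution text, comparing it with an explicit is.na() check:

```r
library(dplyr)
library(stringr)

# Toy data (hypothetical text); row 2 is truly missing, row 3 only contains "NA".
toy <- data.frame(
  id = 1:4,
  resolution_description = c(
    "The Police Department responded to the complaint.",
    NA,
    "SEE ATTACHED NARRATIVE",
    "No action was taken."
  ),
  stringsAsFactors = FALSE
)

# Pattern-based filter, as in the pipeline above: drops rows 2 and 3.
by_pattern <- toy %>% filter(!str_detect(resolution_description, "NA"))

# Explicit missing-value check: drops only row 2.
by_is_na <- toy %>% filter(!is.na(resolution_description))

by_pattern$id
by_is_na$id
```

Whether this difference matters depends on how often the letters “NA” occur in real resolution descriptions. Switching to is.na() would also change the row counts reported below.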

tokenized_resolutions %>% glimpse()

## Rows: 281
## Columns: 3
## Groups: borough [5]
## $ borough <chr> "BRONX", "BRONX", "BRONX", "BRONX", "BRONX", "BRONX", "BRONX"…
## $ word    <chr> "act", "action", "additional", "arrival", "attempt", "complai…
## $ n       <int> 1, 65, 2, 3, 32, 128, 95, 32, 32, 32, 32, 100, 5, 4, 21, 60, …

Analyze internal structure of tokenized_resolutions

str(tokenized_resolutions)

## tibble [281 × 3] (S3: grouped_df/tbl_df/tbl/data.frame)
##  $ borough: chr [1:281] "BRONX" "BRONX" "BRONX" "BRONX" ...
##  $ word   : chr [1:281] "act" "action" "additional" "arrival" ...
##  $ n      : int [1:281] 1 65 2 3 32 128 95 32 32 32 ...
##  - attr(*, "groups")= tibble [5 × 2] (S3: tbl_df/tbl/data.frame)
##   ..$ borough: chr [1:5] "BRONX" "BROOKLYN" "MANHATTAN" "QUEENS" ...
##   ..$ .rows  :List of 5
##   .. ..$ : int [1:40] 1 2 3 4 5 6 7 8 9 10 ...
##   .. ..$ : int [1:64] 41 42 43 44 45 46 47 48 49 50 ...
##   .. ..$ : int [1:78] 105 106 107 108 109 110 111 112 113 114 ...
##   .. ..$ : int [1:59] 183 184 185 186 187 188 189 190 191 192 ...
##   .. ..$ : int [1:40] 242 243 244 245 246 247 248 249 250 251 ...
##   ..- attr(*, ".drop")= logi TRUE

Let’s see first few rows of tokenized_resolutions

head(tokenized_resolutions)

Now let’s look at the 25 most frequent words in each of the 5 boroughs:

tokenized_resolutions %>%
  group_by(borough) %>%
  top_n(25) %>%
  arrange(desc(n)) %>%
  ggplot(aes(x = reorder(word,n), y = n, fill = factor(borough))) +
  geom_bar(stat = "identity") +
  theme(legend.position = "none") +
  facet_wrap(~borough, scales = "free") + 
  coord_flip() +
  labs(x = "Words",
       y = "Frequency",
       title = "Top words used in NYC311 Service Requests by Borough",
       subtitle = "")

As we see above, in all 5 boroughs the most frequently used word is police, followed by department, complaint and responded.

Determining terms/words truly characteristic for SRs by Borough leveraging textmining (TF-IDF)

In information retrieval, tf–idf or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.

tf_idf_words <- tokenized_resolutions %>%
  bind_tf_idf(word, borough, n) %>%
  arrange(desc(tf_idf))
tf_idf_words
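
To make the statistic concrete, here is a small hand computation in base R on made-up counts. It assumes the same definitions that bind_tf_idf() is documented to use: tf is a word's share of all words in a document (here, a borough), and idf is the natural log of the number of documents divided by the number of documents containing the word.

```r
# Toy word counts per borough (hypothetical numbers).
counts <- data.frame(
  borough = c("BRONX", "BRONX", "QUEENS", "QUEENS"),
  word    = c("police", "reviewed", "police", "reported"),
  n       = c(8, 2, 6, 4),
  stringsAsFactors = FALSE
)

# tf: word count divided by the total word count of its borough.
doc_totals <- tapply(counts$n, counts$borough, sum)
counts$tf <- counts$n / as.numeric(doc_totals[counts$borough])

# idf: log(number of boroughs / number of boroughs that use the word).
n_docs <- length(unique(counts$borough))
docs_with_word <- tapply(counts$borough, counts$word,
                         function(b) length(unique(b)))
counts$idf <- log(n_docs / as.numeric(docs_with_word[counts$word]))

# tf-idf: frequent words that appear in every borough score zero.
counts$tf_idf <- counts$tf * counts$idf
counts
```

In this toy example police is the most frequent word in both boroughs, yet its tf-idf is zero because it appears in every borough. This is why the most frequent words and the most distinctive words differ.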

Presenting characteristic terms/words for SRs by Borough

Let’s analyze some distinctive words used by each borough

tf_idf_words %>% 
  top_n(25) %>%
  arrange(desc(tf_idf)) %>%
  ggplot(aes(x = reorder(word, tf_idf), y = tf, fill = borough)) +
  geom_col() +
  labs(x = "Words", y = "tf",
       title = "Distinctive words used in NYC311 Service Requests by Borough",
       subtitle = "") +
  coord_flip() +
  theme(legend.position = "none") +
  facet_wrap(~ borough, scales = "free")

As we can infer from the graph above, the most distinctive words used by each borough are:

Bronx: reviewed followed by provided
Manhattan: unable followed by premises
Brooklyn: reviewed followed by provided
Queens: reported followed by city
Staten Island: violation followed by time

Map Analysis

Now let’s analyse the NYC 311 data using Map Analysis

Preparing and tidying up the data for map plotting

dataset_map <- subset(dataset311, complaint_type %in% count(dataset311, complaint_type, sort=T)[1:50,]$complaint_type)
dataset_map <- dataset_map %>% select(complaint_type, borough, latitude, longitude) %>% drop_na()

library(plyr)
counts <- ddply(dataset_map, .(complaint_type), "count")
counts_filtered <- filter(counts, freq > 2)
counts_filtered$freq <- as.numeric(counts_filtered$freq)
counts_filtered$longitude <- as.numeric(counts_filtered$longitude)
counts_filtered$latitude <- as.numeric(counts_filtered$latitude)

Dataset of counts_filtered in which the count of complaints is greater than 2

counts_filtered

Map Plotting

Now that the map plotting dataset is prepared, let’s plot the longitude and latitude points and analyze the highest number of service requests by complaint type.

#install.packages("rworldmap")
#install.packages("rworldxtra")

library(rworldmap)
library(rworldxtra)

newmap <- getMap(resolution = "high")
nyc_coorflimits <- data.frame( long = c(-74.5, -73.5), lat = c(40.5, 41), stringsAsFactors = FALSE)

nyc <- ggplot() + geom_polygon(data = newmap, aes(x=long, y = lat, group = group), fill = "gray", color = "blue")  + xlim(-74.5, -73.5) + ylim(40.5, 41)

nyc_SRs <- nyc + 
  geom_point(data=counts_filtered, aes(longitude, latitude, size=freq), colour="red")  + 
  facet_wrap(~complaint_type, scales = "free") + 
  labs(x = "Longitude", y = "Latitude", title = "Highest Number of SRs by Complaint Type") + scale_size(name="# of SRs")
nyc_SRs

As we see from the graph above, Noise-Residential has the highest number of service requests of any complaint type.

Service Request Resolutions Tidying and Analysis - Using TM

The tm package provides a text mining framework for R. This section uses it to analyze the resolution text of the service requests.

Load required libraries

library(tm)
library(wordcloud)

Filtering dataset to the most relevant Complaint Types

dataset_filt <- subset(dataset311, complaint_type %in% count(dataset311, complaint_type, sort=T)[1:50,]$complaint_type)
sr_resolution <- dataset_filt$resolution_description

Cleaning up non-standard characters (encoding conversion)

sr_resolution_cln <- sr_resolution %>% iconv("latin1", "ASCII")
control <- list(stopwords=TRUE, removePunctuation=TRUE, removeNumbers=TRUE, minDocFreq=5) # stemming=TRUE does not provide much value

Creating Corpus and TDM

sr_corpus <- VCorpus(VectorSource(sr_resolution_cln))
sr_tdm <- TermDocumentMatrix(sr_corpus, control)
sr_tdm

## <<TermDocumentMatrix (terms: 121, documents: 998)>>
## Non-/sparse entries: 8179/112579
## Sparsity           : 93%
## Maximal term length: 15
## Weighting          : term frequency (tf)

Removing sparse terms (terms that are absent from 80% or more of the documents)

sr_tdm_unsprsd <- removeSparseTerms(sr_tdm, 0.8)
sr_tdm_unsprsd

## <<TermDocumentMatrix (terms: 10, documents: 998)>>
## Non-/sparse entries: 5414/4566
## Sparsity           : 46%
## Maximal term length: 11
## Weighting          : term frequency (tf)
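
The sparsity figures printed above can be checked by hand: sparsity is the share of empty (zero) cells in the matrix. The sketch below recomputes both percentages from the printed entry counts, then mimics the 0.8 threshold on a made-up matrix. The toy filter is an assumption about how removeSparseTerms() behaves, based on its documentation, and not a copy of its implementation.

```r
# Sparsity = zero cells / all cells, using the counts printed above.
sparsity <- function(nonzero, zero) zero / (nonzero + zero)
round(sparsity(8179, 112579), 2)  # full term-document matrix
round(sparsity(5414, 4566), 2)    # after removing sparse terms

# Toy term-document matrix: rows are terms, columns are documents.
toy_tdm <- matrix(
  c(1, 2, 0, 1, 3, 1,   # "police": present in 5 of 6 documents
    0, 0, 0, 0, 1, 0,   # "hydrant": present in 1 of 6 documents
    1, 0, 1, 0, 0, 2),  # "action": present in 3 of 6 documents
  nrow = 3, byrow = TRUE,
  dimnames = list(c("police", "hydrant", "action"), paste0("doc", 1:6))
)

# Share of documents in which each term does not occur.
zero_share <- rowMeans(toy_tdm == 0)

# Keep only terms that are empty in less than 80% of the documents.
kept <- toy_tdm[zero_share < 0.8, , drop = FALSE]
rownames(kept)
```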

Top terms by frequency (mentioned at least 50 times)

length(findFreqTerms(sr_tdm_unsprsd,50))

## [1] 10

Displaying top terms

sr_topterms <- findFreqTerms(sr_tdm_unsprsd,50)
sr_topterms

##  [1] "action"      "available"   "complaint"   "condition"   "department" 
##  [6] "fix"         "information" "police"      "responded"   "took"

Find the top associations using findAssocs() for the top terms, with a lower correlation limit of 0.4. This surfaces the more consistent term association patterns found in the service requests.

sr_topterms <- sr_topterms[!is.na(sr_topterms)]
sr_assocs <- findAssocs(sr_tdm_unsprsd, sr_topterms[1:5], 0.4) 
lapply(sr_assocs, function(x) kable(x))

## $action
## 
## 
##                 x
## ----------  -----
## fix          0.88
## took         0.88
## responded    0.59
## police       0.54
## 
## $available
## 
## 
##                   x
## ------------  -----
## information    0.84
## 
## $complaint
## 
## 
##             x
## -------  ----
## police    0.4
## 
## $condition
## 
## 
##           x
## -----  ----
## fix     0.6
## took    0.6
## 
## $department
## 
## 
##                 x
## ----------  -----
## police       0.84
## responded    0.82
## fix          0.43
## took         0.43

As per the association figures above, action is correlated with the following words:

fix (0.88)
took (0.88)
responded (0.59)
police (0.54)

The other terms can be interpreted in the same way.
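
These association scores are correlations between the per-document counts of two terms, not percentages. A minimal base R sketch on a made-up matrix, assuming the score behaves like a Pearson correlation between term rows (which is how I understand findAssocs() to work):

```r
# Toy term-document matrix (hypothetical counts): rows are terms.
toy_tdm <- matrix(
  c(1, 0, 1, 1, 0, 0,   # "action"
    1, 0, 1, 1, 0, 1,   # "took"
    0, 1, 0, 0, 1, 0),  # "information"
  nrow = 3, byrow = TRUE,
  dimnames = list(c("action", "took", "information"), paste0("doc", 1:6))
)

# Correlate "action" with every other term across documents.
assoc <- cor(t(toy_tdm))["action", ]
assoc <- assoc[names(assoc) != "action"]

# Keep associations at or above the 0.4 limit used in the analysis.
strong <- assoc[assoc >= 0.4]
round(strong, 2)
```

A high value means the two words tend to appear in the same resolution descriptions. A negative value, as for information in this toy example, means they tend to appear in different ones.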

Creating a WordCloud for the top terms/words in the SRs

library(wordcloud)
sr_tdm_cloud <- as.matrix(sr_tdm_unsprsd)
v <- sort(rowSums(sr_tdm_cloud),decreasing=TRUE)
d <- data.frame(word=names(v),freq=v)   
wordcloud(d$word,d$freq,max.words=50, min.freq=10, colors=brewer.pal(8, 'Dark2'))

NYC311 Tweets Analysis

Now let’s analyze the NYC 311 tweets.

Data Collection and Exploration

API Set-up (Application Name and security context). Commands commented and keys masked

##------- store api keys (these are fake example values; replace with your own keys)

library(rtweet)

api_key <- "aaa"
api_secret_key <- "bbb"
access_token <- "ccc"
access_token_secret <- "ddd"

Search for and collect 1,100 tweets that mention the “nyc311” service in any way (hashtag, user, follower, etc.)

nyc311_tweets <- search_tweets("nyc311", n = 1100 )

head(nyc311_tweets$text,5) %>% kable()

x
@KGRLogic Good afternoon, thank you for reaching out. Please DM me with details on the type of inspection you requested. Thanks! https://t.co/hDTCua1AH9
@willardk Good afternoon, please send us a DM so I may ask you a few questions about this food delivery. Thank you! https://t.co/hDTCu9JZPB
@pixistik04 Good evening. You can report a fire hydrant that’s open online here: https://t.co/XGvdzZiRb8 or by sending us a DM for help with reporting. https://t.co/hDTCu9JZPB
@immichaelmorgan @NYCMayor @NYCMayorsOffice Good evening. You can get information and guidance about DMV service changes online at https://t.co/KXNs23W0Ry or you can reach out to them by phone at (718) 966-6155 Monday through Friday from 8:30 AM to 4 PM. Thanks!
@andrewPnelson2 Hi, all non-essential construction in NYC has been halted. DOB created a Real-Time Essential Construction Map, which shows the location of allowed essential construction sites in NYC. If a worksite isn’t on the map, DM us to file a report. https://t.co/hDTCu9JZPB

Sample of Users tweeting about “nyc311”

nyc311_users <- users_data(nyc311_tweets) %>% unique()
kable(head(nyc311_users[,c(3,4,8,9)]))

name	location	followers_count	friends_count
New York City 311	New York City	346557	238
Yalaisa Wright	United States	75	295
Boerum Hill Neighbors	Brooklyn, NY	518	1702
Kevin	New York, NY	221	451
Nicholas F		382	1330
Eagle One 🇺🇸🦅	’Merica	277	307

Let’s plot “nyc311” Tweets Time series (Last 7-9 days)

ts_plot(nyc311_tweets, "24 hours", col=c("blue")) + theme_minimal() + theme(plot.title = ggplot2::element_text(face = "bold")) + labs(x = "Date", y = "# of Tweets", title = "NYC311 Tweets in the last 7 days")

The graph above shows an interesting pattern: the number of tweets rose and fell repeatedly from May 03 to May 08, then dropped sharply between May 10 and May 12. One possible reason is the COVID-19 outbreak.

Sentiment Analysis - Syuzhet Package

Syuzhet scores the text on 10 categories: 8 emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, trust) and 2 sentiments (negative and positive).

Let’s determine “nyc311” Tweet’s Emotions

#devtools::install_github("mjockers/syuzhet")
library(syuzhet)

nyc311_tweets_txt <- as.vector(nyc311_tweets$text)
emotion_df <- get_nrc_sentiment(nyc311_tweets_txt)
twt_emotion_df <- cbind(nyc311_tweets_txt, emotion_df) 
kable(head(twt_emotion_df,3))

nyc311_tweets_txt	anticipation	fear	joy	surprise	trust	positive
@KGRLogic Good afternoon, thank you for reaching out. Please DM me with details on the type of inspection you requested. Thanks! https://t.co/hDTCua1AH9	1	0	1	1	1	1
@willardk Good afternoon, please send us a DM so I may ask you a few questions about this food delivery. Thank you! https://t.co/hDTCu9JZPB	2	0	2	1	2	3
@pixistik04 Good evening. You can report a fire hydrant that’s open online here: https://t.co/XGvdzZiRb8 or by sending us a DM for help with reporting. https://t.co/hDTCu9JZPB	1	1	1	1	1	1

Sentiment Scoring

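The core idea of a sentiment score is to weigh the positive words in a text against the negative ones. get_sentiment() returns one numeric score per tweet: above zero for a net-positive tweet, below zero for a net-negative one, and zero for a neutral one.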

sent.value <- get_sentiment(nyc311_tweets_txt)

Let’s have a look at Positive Tweets

positive.tweets <- nyc311_tweets_txt[sent.value > 0]
kable(head(positive.tweets,5))

x
@KGRLogic Good afternoon, thank you for reaching out. Please DM me with details on the type of inspection you requested. Thanks! https://t.co/hDTCua1AH9
@willardk Good afternoon, please send us a DM so I may ask you a few questions about this food delivery. Thank you! https://t.co/hDTCu9JZPB
@pixistik04 Good evening. You can report a fire hydrant that’s open online here: https://t.co/XGvdzZiRb8 or by sending us a DM for help with reporting. https://t.co/hDTCu9JZPB
@immichaelmorgan @NYCMayor @NYCMayorsOffice Good evening. You can get information and guidance about DMV service changes online at https://t.co/KXNs23W0Ry or you can reach out to them by phone at (718) 966-6155 Monday through Friday from 8:30 AM to 4 PM. Thanks!
@andrewPnelson2 Hi, all non-essential construction in NYC has been halted. DOB created a Real-Time Essential Construction Map, which shows the location of allowed essential construction sites in NYC. If a worksite isn’t on the map, DM us to file a report. https://t.co/hDTCu9JZPB

Most Positive Tweet

most.positive <- nyc311_tweets_txt[sent.value == max(sent.value)]
most.positive

## [1] "@PQuinceNYC @HelenRosenthal @NYCDOB Good morning, please send us a Direct Message. We have a few questions to clarify what is happening at the construction site to ensure that we file the correct report. Thank you. https://t.co/hDTCu9JZPB"

Let’s have a look at Negative Tweets

negative.tweets <- nyc311_tweets_txt[sent.value < 0]
kable(head(negative.tweets,5))

x
.@NYCDHS’s Code Blue is in effect until tomorrow, Sunday, May 10 at 8:00 AM. If you see a homeless person outside in these frigid temperatures, please call us at 311. https://t.co/jEaQyOxlxc
@domiruiz02 Hi, thank you for your tweets. Call 911 to report an emergency situation or condition that might cause danger to life or personal property and to report a medical or health-related emergency: https://t.co/Gf62x24xHN.
@DiamondVMedia @NYC_DOT @Pollytrott @NYCSpeakerCoJo @BPEricAdams @NYCMayor @NYGovCuomo Good morning, if the potholes are dangerous and likely to cause an accident, call 911. You can report potholes at https://t.co/MkR064QHhv or DM me and I’ll file for you. https://t.co/hDTCu9JZPB
UPDATE: #NYCASP rules are suspended through Sunday, May 17. #NYCASP resumes Monday, May 18 through Sunday, May 24 for a citywide clean sweep. #NYCASP rules will then be suspended again through Sunday, June 7. Parking meters will remain in effect. Follow @NYCASP for more. https://t.co/Qfh0v6R3Ia
@megshashin @NYCMayor @NYCMayorsOffice Good morning, we’re sorry to hear about your experience. If you believe you’ve been discriminated against, you can file a complaint with NYC Commission on Human Rights at https://t.co/FvPrtMXcLe or send us a DM. https://t.co/hDTCu9JZPB

Most Negative Tweet

most.negative <- nyc311_tweets_txt[sent.value <= min(sent.value)] 
most.negative

## [1] "@nyc311 @NYPD13Pct @CarlinaRivera I reported a recurring homeless condition in Gramercy. It was referred to the NYPD and subsequently closed as “non crime corrected” It’s a disgusting and unhealthy situation.  He’s  defecating on the sidewalk. 333 East 23 Street b/t 1st and 2nd https://t.co/Huc4taBHkc"

Let’s now see Neutral Tweets

neutral.tweets <- nyc311_tweets_txt[sent.value == 0]
kable(head(neutral.tweets,5))

x
#NYCASP Las reglas de estacionamiento alterno están suspendidas hoy, sábado, 9 de mayo, hasta el martes, 12 de mayo. Los parquímetros permanecerán en efecto. Sigue @NYCASP y baja la aplicación móvil para recibir alertas directas a tu teléfono: https://t.co/9GSt3VfwSg https://t.co/8oFLl91kmn
#NYCASP Las reglas de estacionamiento alterno están suspendidas hoy, miércoles, 13 de mayo. Los parquímetros permanecerán en efecto. Sigue @NYCASP y baja la aplicación móvil para recibir alertas directas a tu teléfono: https://t.co/9GSt3VfwSg
#NYCASP Las reglas de estacionamiento alterno están suspendidas hoy, jueves, 7 de mayo, hasta el martes, 12 de mayo. Los parquímetros permanecerán en efecto. Sigue @NYCASP y baja la aplicación móvil para recibir alertas directas a tu teléfono: https://t.co/9GSt3VfwSg https://t.co/7OmeoBoGbv
#NYCASP Las reglas de estacionamiento alterno están suspendidas hoy, martes, 12 de mayo. Los parquímetros permanecerán en efecto. Sigue @NYCASP y baja la aplicación móvil para recibir alertas directas a tu teléfono: https://t.co/9GSt3VfwSg https://t.co/bbZezAr66f
#NYCASP Las reglas de estacionamiento alterno están suspendidas hoy, miércoles, 6 de mayo, hasta el martes, 12 de mayo. Los parquímetros permanecerán en efecto. Sigue @NYCASP y baja la aplicación móvil para recibir alertas directas a tu teléfono: https://t.co/9GSt3VfwSg https://t.co/TMiRuFsyUf

Total Tweets by Sentiment using plotly package

#install.packages("plotly")
library(plotly)
category_sent <- ifelse(sent.value < 0, "Negative", ifelse(sent.value > 0, "Positive", "Neutral"))
totals <- data.frame(table(category_sent))
plot_ly(totals, x = ~category_sent, y = ~Freq, type = 'bar',
        marker = list(color = c('red', 'orange',
                                'green'))) %>% layout(title = 'NYC311 Tweets by Sentiment')
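
The bucketing step above can be traced on a handful of made-up scores. The sketch uses base R only, and the scores are hypothetical. It also shows why the colours are listed as red, orange, green: table() orders the categories alphabetically (Negative, Neutral, Positive), and the bars pick up the colours in that order.

```r
# Toy sentiment scores (hypothetical values, one per tweet).
toy_scores <- c(1.25, -0.5, 0, 2.1, 0, -1.75, 0.4)

# Same nested ifelse() rule as in the analysis above.
toy_category <- ifelse(toy_scores < 0, "Negative",
                       ifelse(toy_scores > 0, "Positive", "Neutral"))

# Count tweets per category; rows come out in alphabetical order.
toy_totals <- data.frame(table(toy_category))
toy_totals
```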

Conclusion

Based on the analyses performed, the NYC311 service is a popular and reliable channel through which NYC communities make local agencies and citizen-service providers aware of many topics that matter to the well-being of the city.

I was able to identify overall themes and topics affecting the main boroughs within the NY Metro area. More importantly, I was able to narrow down characteristic themes and patterns that were more prevalent in each one, which gives an idea of the specific challenges, needs and local dynamics each borough community experiences on a daily basis.

In terms of sentiment analysis for the “nyc311” tweets, the majority express a positive sentiment. Surprisingly, relatively few complaints or negative mentions are raised through the Twitter channel. The NYC311 service also uses it to provide resolution advice, status updates and redirection guidance to its users and followers.

Issues faced during the creation of this Analytics project

Twitter developer account - The process for getting permission to create a Twitter developer account has been modified and upgraded. It now requires much more detailed information, which is then reviewed by Twitter. It took 3 days to explain how and where I would be using NYC 311 Twitter data, but I finally got permission to create an app with a Twitter developer account.
Map plotting in R Markdown - The code that plots maps of the NYC area with SR statistics overlaid in multiple facets by complaint type worked perfectly in the RStudio console. When I ran the same code within R Markdown, it threw an error: facets were not supported and the SR statistics were not overlaid. I added a picture of the correct plot right after the affected code section as a reference.
