# Exam 4
Location analysis identifies patterns across geographic areas by linking data on economic, social, and mobility activity, which makes it easier to spot trends and support strategic decisions. This report presents a practical location analysis exercise for the Monterrey metropolitan area (ZMM) using review data from the Google Maps API, with the aim of evaluating the distribution and customer perception of establishments and services in the region.
A business's spatial position is a determining factor in its decision-making, because location shapes the behavior and demographics of its customers. Adopting Location Intelligence facilitates customer segmentation, optimizes logistics operations, improves risk management, and helps identify market opportunities. Integrating it also lets companies tailor their strategies, reduce costs, and maximize the impact of their actions across locations.
Sentiment Analysis is a natural language processing (NLP) technique for identifying and classifying the emotions expressed in text. It uses artificial intelligence algorithms to determine whether a message has a positive, negative, or neutral tone.
This tool is key to Business Intelligence (BI) strategies because it helps companies understand customer opinions, measure satisfaction, anticipate reputation crises, and improve products or services. It also supports analyzing market trends, evaluating the impact of advertising campaigns, and making decisions informed by emotional data.
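As a rough illustration of how a lexicon-based sentiment classifier works, the sketch below scores short reviews against a tiny word list and labels each one as positive, negative, or neutral. The lexicon, the helper names `score_review` and `classify_score`, and the example reviews are all assumptions made for this illustration; they are not the dictionaries used later in the report.

```r
# Toy lexicon: words and polarity values are illustrative assumptions,
# not taken from any published sentiment dictionary.
toy_lexicon <- c(excelente = 1, buena = 1, amable = 1, caro = -1, malo = -1)

# Sum the polarity of every token of `text` that appears in `lexicon`
score_review <- function(text, lexicon) {
  tokens <- strsplit(tolower(text), "[^[:alpha:]]+")[[1]]
  tokens <- tokens[tokens != ""]
  sum(lexicon[tokens], na.rm = TRUE)  # tokens missing from the lexicon give NA and are ignored
}

# Map a numeric score to one of three tone labels
classify_score <- function(score) {
  if (score > 0) "positive" else if (score < 0) "negative" else "neutral"
}

# Invented example reviews (ASCII only, to avoid locale-dependent tokenization)
reviews <- c("Excelente servicio y muy buena atencion",
             "Muy caro y malo",
             "Abre a las ocho")

scores <- vapply(reviews, score_review, numeric(1),
                 lexicon = toy_lexicon, USE.NAMES = FALSE)
labels <- vapply(scores, classify_score, character(1))
data.frame(review = reviews, score = scores, label = labels)
```

Real lexicons are far larger and usually weight words on a finer scale, but the mechanics of summing matched word values and thresholding the total are the same.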
### LOADING REQUIRED LIBRARIES
# data analysis
library(dplyr) # grammar of data manipulation helping to resolve data manipulation difficulties
library(Hmisc) # useful functions for data analysis and high-level graphics
library(foreign) # read data stored by Minitab, SPSS, Stata
library(openxlsx) # open, read, write, and edit xlsx files
library(tidyverse) # collection of R packages designed for data science
# spatial data analysis
library(leaflet.extras) # to provide extra functionality to the leaflet R package
library(sp) # functions for plotting data as maps, spatial selection, methods for retrieving coordinates
library(sf) # encode spatial vector data
library(maps) # geographic maps
library(tmap) # generate thematic maps
library(spdep) # a collection of functions to create spatial weights matrices from polygon contiguities
library(terra) # methods for spatial data analysis
library(leaflet) # interactive maps
library(mapproj) # map projections
library(mapsapi) # 'sf'-compatible interface to Google Maps APIs
library(spatialreg) # spatial regression models
library(sfdep) # an interface to 'spdep' to integrate with 'sf' objects and the 'tidyverse'
library(tidygeocoder) # makes getting data from geocoding services easy
library(mapboxapi) # 'Mapbox' Navigation APIs, including directions, isochrones, and route optimization.
# visualization
library(ggmap) # spatial data visualization
library(rgeoda) # spatial data analysis based on software GeoDa
library(ggplot2) # Grammar of graphics. System for declarative creating graphics
library(corrplot) # provides a visual exploratory tool on correlation matrix
library(RColorBrewer) # offers several color palettes
library(leafsync) # create small multiples of several leaflet web maps
library(htmltools) # tools for HTML generation and output
# others
library(rlang) # collection of frameworks and APIs for programming with R
library(classInt) # methods for choosing univariate class intervals for mapping or other graphic purposes
library(gridExtra) # to arrange and combine plots for easy comparison
library(knitr) # integrates computing and reporting
### Getting access to distance, reviews, and ratings by using Google Maps
library(tm) # a framework for text mining applications
library(wordcloud) # functionality to create pretty word clouds
library(wordcloud2) # interactive HTML word clouds
library(googleway) # provides a mechanism to access various Google Maps APIs, including plotting a Google Map from R and overlaying it with shapes and markers, and retrieving data from the places, directions, roads, distances, geocoding, elevation and timezone APIs
library(gmapsdistance) # allows to calculate distances for a database through Google maps
# library(hereR) # geocode and autocomplete addresses or reverse geocode POIs using the Geocoder API
library(osrm) # enables the computation of routes, trips, isochrones and travel distances matrices (travel time and kilometric distance).
### Text Mining
library(syuzhet) # includes four sentiment dictionaries and provides a method for accessing the robust, but computationally expensive, sentiment extraction tool developed in the NLP group at Stanford.
library(SnowballC) # word stemming based on the Snowball stemmers
# library(remotes)
# library(openrouteservice)
# remotes::install_github("GIScience/openrouteservice-r")
Mty <- leaflet() %>%
  addTiles() %>%
  setView(-100.31094, 25.66928, zoom = 14) %>% ### Mty downtown area (neighborhood-level zoom)
  addMarkers(-100.31094, 25.66928, popup = "Monterrey Downtown Area")
Mty
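### Map coordinates of Parque Fundidora (starting point)
latitude <- 25.68636
longitude <- -100.31831
r <- 30000 # search radius in meters (30 km)
### API key read from an environment variable (the name GOOGLE_MAPS_API_KEY is an assumption; adjust it to your setup)
gmaps_key <- Sys.getenv("GOOGLE_MAPS_API_KEY")
### Use the google_places function to make a call to the API and save the results
search_str <- google_places(search_string = 'lavanderias', location = c(latitude, longitude), radius = r, key = gmaps_key)
### We can inspect the results (laundromats) returned by google_places()
# search_str$results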
### The page_token tells Google to return the next page of up to 20 results instead of repeating the first 20.
### A new token needs a short moment before it becomes valid, hence the pause before each follow-up request.
Sys.sleep(2)
search_str_add_one <- google_places(search_string = 'lavanderias', location = c(latitude, longitude), radius = r, key = gmaps_key, page_token = search_str$next_page_token)
Sys.sleep(2)
search_str_add_two <- google_places(search_string = 'lavanderias', location = c(latitude, longitude), radius = r, key = gmaps_key, page_token = search_str_add_one$next_page_token)
### Each call below chains the token of the previous page. A text search is capped at 60 results (three pages), so these calls are not expected to add new places; once the token is NULL the request starts over from the first page, and the repeated rows are removed further down with distinct().
Sys.sleep(2)
search_str_add_three <- google_places(search_string = 'lavanderias', location = c(latitude, longitude), radius = r, key = gmaps_key, page_token = search_str_add_two$next_page_token)
Sys.sleep(2)
search_str_add_four <- google_places(search_string = 'lavanderias', location = c(latitude, longitude), radius = r, key = gmaps_key, page_token = search_str_add_three$next_page_token)
Sys.sleep(2)
search_str_add_five <- google_places(search_string = 'lavanderias', location = c(latitude, longitude), radius = r, key = gmaps_key, page_token = search_str_add_four$next_page_token)
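The token chaining above can also be written as a loop. The following is a minimal sketch, assuming a hypothetical `fetch_page()` function that takes a token (NULL for the first page) and returns a list with `results` and, when more pages exist, `next_page_token`. The names `collect_pages` and `mock_fetch` and the mock data are invented so that the example runs offline without an API key.

```r
# Collect up to `max_pages` pages by chaining each response's next_page_token.
# `fetch_page` is a hypothetical stand-in for a call such as google_places().
collect_pages <- function(fetch_page, max_pages = 3, pause = 0) {
  pages <- list()
  token <- NULL
  for (i in seq_len(max_pages)) {
    page <- fetch_page(token)
    pages[[i]] <- page$results
    token <- page$next_page_token
    if (is.null(token)) break  # no further pages available
    Sys.sleep(pause)           # give the token time to become valid
  }
  do.call(rbind, pages)
}

# Offline mock (made-up data): three pages of two places each
mock_fetch <- function(token) {
  page <- if (is.null(token)) 1L else as.integer(token)
  out <- list(results = data.frame(name = paste0("place_", (page - 1) * 2 + 1:2),
                                   stringsAsFactors = FALSE))
  if (page < 3L) out$next_page_token <- as.character(page + 1L)
  out
}

all_results <- collect_pages(mock_fetch)
nrow(all_results)
```

With the real API, `fetch_page` could wrap `google_places(..., page_token = token)` and `pause` could be set to a couple of seconds. Stopping as soon as the token is NULL avoids requesting the first page again.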
### The data frame will hold detailed information about each place, including its address, latitude and longitude coordinates, price level, star rating, number of ratings, categories, and more.
business_name <- c(search_str$results$name,
search_str_add_one$results$name,
search_str_add_two$results$name,
search_str_add_three$results$name,
search_str_add_four$results$name,
search_str_add_five$results$name)
business_rating <- c(search_str$results$rating,
search_str_add_one$results$rating,
search_str_add_two$results$rating,
search_str_add_three$results$rating,
search_str_add_four$results$rating,
search_str_add_five$results$rating)
user_ratings_total <- c(search_str$results$user_ratings_total,
search_str_add_one$results$user_ratings_total,
search_str_add_two$results$user_ratings_total,
search_str_add_three$results$user_ratings_total,
search_str_add_four$results$user_ratings_total,
search_str_add_five$results$user_ratings_total)
place_id <- c(search_str$results$place_id,
search_str_add_one$results$place_id,
search_str_add_two$results$place_id,
search_str_add_three$results$place_id,
search_str_add_four$results$place_id,
search_str_add_five$results$place_id)
lat <- c(search_str$results$geometry$location$lat,
search_str_add_one$results$geometry$location$lat,
search_str_add_two$results$geometry$location$lat,
search_str_add_three$results$geometry$location$lat,
search_str_add_four$results$geometry$location$lat,
search_str_add_five$results$geometry$location$lat)
lon <- c(search_str$results$geometry$location$lng,
search_str_add_one$results$geometry$location$lng,
search_str_add_two$results$geometry$location$lng,
search_str_add_three$results$geometry$location$lng,
search_str_add_four$results$geometry$location$lng,
search_str_add_five$results$geometry$location$lng)
# Build the final data frame
data <- data.frame(
  business_name = business_name,
  business_rating = business_rating,
  user_ratings_total = user_ratings_total,
  place_id = place_id,
  lat = lat,
  lon = lon
)
# Save the data frame to a CSV file
# write.csv(data, "D:\\CD2001C_AD2024\\Power_BI_Data_AD2024\\data_location_intl.csv", row.names=TRUE)
# Remove duplicates by latitude and longitude
data <- data %>%
  # Convert coordinates to numeric (if they are not already)
  mutate(across(c(lat, lon), as.numeric)) %>%
  # Drop rows with missing coordinates
  drop_na(lat, lon) %>%
  # Drop exact coordinate duplicates
  distinct(lat, lon, .keep_all = TRUE)
# Ten highest- and ten lowest-rated businesses (ties can return more than 10 rows)
data_top_ratings <- data %>% slice_max(business_rating, n = 10)
data_low_ratings <- data %>% slice_min(business_rating, n = 10)
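Since every business keeps its coordinates, one optional sanity check is the straight-line distance from the starting point. The sketch below uses the haversine formula in base R. The helper `haversine_m`, the column names `dist_m` and `within_radius`, and the second (distant) point are assumptions for illustration; the first point reuses the coordinates printed in the place-details output of this report. As far as the Places documentation describes it, a text search treats location and radius as a bias rather than a strict filter, so a check like this may be worthwhile.

```r
# Great-circle distance in meters between points given in decimal degrees
haversine_m <- function(lat1, lon1, lat2, lon2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_m * asin(pmin(1, sqrt(a)))
}

# Starting point and search radius as defined in the script
origin_lat <- 25.68636
origin_lon <- -100.31831
search_radius_m <- 30000

# First row: coordinates from the place-details output in this report.
# Second row: a made-up distant point, included only for contrast.
places <- data.frame(
  business_name = c("Lavanderia Lavaratic", "hypothetical distant point"),
  lat = c(25.79652, 27.00000),
  lon = c(-100.3173, -100.30000)
)

places$dist_m <- haversine_m(origin_lat, origin_lon, places$lat, places$lon)
places$within_radius <- places$dist_m <= search_radius_m
places
```

The same two lines that add `dist_m` and `within_radius` could be applied to the `data` object built above, using its `lat` and `lon` columns.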
### Let's visualize ratings and number of user ratings by laundromat
top_ratings_plot <- ggplot(data_top_ratings, aes(x = reorder(business_name, business_rating), y = business_rating)) +
  geom_bar(stat = "identity", fill = "lightblue") +
  labs(title = "Laundromats - Top 10 by rating", subtitle = "ZMM") +
  coord_flip()
top_ratings_plot
top_users_plot <- ggplot(data_top_ratings, aes(x = reorder(business_name, user_ratings_total), y = user_ratings_total)) +
  geom_bar(stat = "identity", fill = "lightblue") +
  labs(title = "Laundromats - Number of user ratings, top 10 by rating", subtitle = "ZMM") +
  coord_flip()
top_users_plot
low_ratings_plot <- ggplot(data_low_ratings, aes(x = reorder(business_name, business_rating), y = business_rating)) +
  geom_bar(stat = "identity", fill = "lightpink") +
  labs(title = "Laundromats - Bottom 10 by rating", subtitle = "ZMM") +
  coord_flip()
low_ratings_plot
low_users_plot <- ggplot(data_low_ratings, aes(x = reorder(business_name, user_ratings_total), y = user_ratings_total)) +
  geom_bar(stat = "identity", fill = "lightpink") +
  labs(title = "Laundromats - Number of user ratings, bottom 10 by rating", subtitle = "ZMM") +
  coord_flip()
low_users_plot
### Heat maps of the top and lowest ratings in the ZMM
# Create a heat map with leaflet
heatmap <- leaflet(data_top_ratings) %>%
  addTiles() %>% # Add base layer
  addHeatmap(
    lng = ~lon, lat = ~lat, # Coordinates
    intensity = ~business_rating, # Intensity based on the rating
    blur = 20, # Heat blur
    max = 0.05, # Maximum intensity value
    radius = 15 # Radius of the heat points
  ) %>%
  setView(lng = median(data_top_ratings$lon), lat = median(data_top_ratings$lat), zoom = 12) %>%
  addControl("Heat map - Top-rated laundromats", position = "topright")
# Show map
heatmap
# Create a heat map with leaflet
heatmap1 <- leaflet(data_low_ratings) %>%
  addTiles() %>% # Add base layer
  addHeatmap(
    lng = ~lon, lat = ~lat, # Coordinates
    intensity = ~business_rating, # Intensity based on the rating
    blur = 20, # Heat blur
    max = 0.05, # Maximum intensity value
    radius = 15 # Radius of the heat points
  ) %>%
  setView(lng = median(data_low_ratings$lon), lat = median(data_low_ratings$lat), zoom = 12) %>%
  addControl("Heat map - Lowest-rated laundromats", position = "topright")
# Show map
heatmap1
## Request more details about the laundromats using google_place_details()
reviews_top_ratings <- google_place_details(place_id = data_top_ratings$place_id[10], key = gmaps_key)
reviews_low_ratings <- google_place_details(place_id = data_low_ratings$place_id[10], key = gmaps_key)
# reviews_top_ratings$result$reviews$text
# reviews_low_ratings$result$reviews$text
### Generate a vector containing only the text
top_ratings_text <- reviews_top_ratings$result$reviews$text
top_ratings_doc <- Corpus(VectorSource(top_ratings_text))
low_ratings_text <- reviews_low_ratings$result$reviews$text
low_ratings_doc <- Corpus(VectorSource(low_ratings_text))
### Clean the text data
options(warn=-1)
top_ratings_doc <- top_ratings_doc %>% tm_map(removeNumbers) %>% tm_map(removePunctuation) %>% tm_map(stripWhitespace)
top_ratings_doc <- tm_map(top_ratings_doc, content_transformer(tolower))
top_ratings_doc <- tm_map(top_ratings_doc, removeWords, stopwords("english"))
low_ratings_doc <- low_ratings_doc %>% tm_map(removeNumbers) %>% tm_map(removePunctuation) %>% tm_map(stripWhitespace)
low_ratings_doc <- tm_map(low_ratings_doc, content_transformer(tolower))
low_ratings_doc <- tm_map(low_ratings_doc, removeWords, stopwords("english"))
### Let's create a data frame containing each word in the first column and its frequency in the second column.
dtm_top <- TermDocumentMatrix(top_ratings_doc)
matrix_top <- as.matrix(dtm_top)
words_top <- sort(rowSums(matrix_top),decreasing=TRUE)
words_top_df <- data.frame(word = names(words_top),freq=words_top)
# write.csv(words_top_df, "D:\\CD2001C_AD2024\\Power_BI_Data_AD2024\\wordcloud_a.csv", row.names=TRUE)
dtm_low <- TermDocumentMatrix(low_ratings_doc)
matrix_low <- as.matrix(dtm_low)
words_low <- sort(rowSums(matrix_low),decreasing=TRUE)
words_low_df <- data.frame(word = names(words_low),freq=words_low)
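To make the cleaning and counting steps concrete, here is a base-R sketch of the same idea without the `tm` package: lowercase the text, strip digits and punctuation, drop stop words, and tabulate what remains. The reviews, the two-word stop list, and the object names (`toy_reviews`, `toy_words_df`) are invented for illustration and are not the data or the exact transformations used above.

```r
# Toy reviews and a tiny stop-word list (both invented for illustration)
toy_reviews <- c("Great service, great price!",
                 "The service was slow.",
                 "Great place")
toy_stop_words <- c("the", "was")

# 1) lowercase, 2) replace digits and punctuation with spaces, 3) split on whitespace
clean <- tolower(toy_reviews)
clean <- gsub("[[:digit:][:punct:]]+", " ", clean)
tokens <- strsplit(clean, "[[:space:]]+")

# 4) drop empty strings and stop words
tokens <- lapply(tokens, function(x) x[x != "" & !(x %in% toy_stop_words)])

# 5) count each word across all reviews and sort by frequency
freq <- sort(table(unlist(tokens)), decreasing = TRUE)
toy_words_df <- data.frame(word = names(freq),
                           freq = as.integer(freq),
                           stringsAsFactors = FALSE)
toy_words_df
```

The resulting two-column layout (word, frequency) is the shape that `wordcloud2()` expects as input.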
### We can now generate the word cloud according to the top and low ratings reviews.
set.seed(1234) # for reproducibility
### top ratings
# top_raiting_wc<-wordcloud(words = words_top_df$word, freq = words_top_df$freq, min.freq = 1, max.words=200, random.order=FALSE, rot.per=0.35, colors=brewer.pal(8, "Dark2"))
top_rating_wc <- wordcloud2(data = words_top_df, color = "random-dark", size = 0.6, shape = "circle")
top_rating_wc
### low ratings
#low_raiting_wc<-wordcloud(words = words_low_df$word, freq = words_low_df$freq, min.freq = 1, max.words=200, random.order=FALSE, rot.per=0.35, colors=brewer.pal(8, "Dark2"))
low_rating_wc <- wordcloud2(data = words_low_df, color = "random-dark", size = 0.6, shape = "circle")
low_rating_wc
## Request more details about one laundromat using google_place_details()
## (the hospitals_ prefix in the object names below is a leftover label; the data are laundromat reviews)
hospitals_reviews <- google_place_details(place_id = data$place_id[19], language = "es", key = gmaps_key)
hospitals_reviews
## $html_attributions
## list()
##
## $result
## $result$address_components
## long_name short_name
## 1 PLAZA LA AMISTAD PLAZA LA AMISTAD
## 2 302 302
## 3 Paseo de la Amistad P.º de la Amistad
## 4 Monte Real Monte Real
## 5 Ciudad General Escobedo Cdad. Gral. Escobedo
## 6 Nuevo León N.L.
## 7 México MX
## 8 66056 66056
## types
## 1 premise
## 2 street_number
## 3 route
## 4 sublocality_level_1, sublocality, political
## 5 locality, political
## 6 administrative_area_level_1, political
## 7 country, political
## 8 postal_code
##
## $result$adr_address
## [1] "PLAZA LA AMISTAD, <span class=\"street-address\">P.º de la Amistad 302</span>, <span class=\"extended-address\">Monte Real</span>, <span class=\"postal-code\">66056</span> <span class=\"locality\">Cdad. Gral. Escobedo</span>, <span class=\"region\">N.L.</span>, <span class=\"country-name\">México</span>"
##
## $result$business_status
## [1] "OPERATIONAL"
##
## $result$current_opening_hours
## $result$current_opening_hours$open_now
## [1] TRUE
##
## $result$current_opening_hours$periods
## close.date close.day close.time open.date open.day open.time
## 1 2025-03-16 0 2200 2025-03-16 0 0800
## 2 2025-03-17 1 2200 2025-03-17 1 0800
## 3 2025-03-18 2 2200 2025-03-18 2 0800
## 4 2025-03-12 3 2200 2025-03-12 3 0800
## 5 2025-03-13 4 2200 2025-03-13 4 0800
## 6 2025-03-14 5 2200 2025-03-14 5 0800
## 7 2025-03-15 6 2200 2025-03-15 6 0800
##
## $result$current_opening_hours$weekday_text
## [1] "lunes: 8:00–22:00" "martes: 8:00–22:00" "miércoles: 8:00–22:00"
## [4] "jueves: 8:00–22:00" "viernes: 8:00–22:00" "sábado: 8:00–22:00"
## [7] "domingo: 8:00–22:00"
##
##
## $result$formatted_address
## [1] "PLAZA LA AMISTAD, P.º de la Amistad 302, Monte Real, 66056 Cdad. Gral. Escobedo, N.L., México"
##
## $result$formatted_phone_number
## [1] "81 2261 1931"
##
## $result$geometry
## $result$geometry$location
## $result$geometry$location$lat
## [1] 25.79652
##
## $result$geometry$location$lng
## [1] -100.3173
##
##
## $result$geometry$viewport
## $result$geometry$viewport$northeast
## $result$geometry$viewport$northeast$lat
## [1] 25.79789
##
## $result$geometry$viewport$northeast$lng
## [1] -100.316
##
##
## $result$geometry$viewport$southwest
## $result$geometry$viewport$southwest$lat
## [1] 25.7952
##
## $result$geometry$viewport$southwest$lng
## [1] -100.3187
##
##
##
##
## $result$icon
## [1] "https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/generic_business-71.png"
##
## $result$icon_background_color
## [1] "#7B9EB0"
##
## $result$icon_mask_base_uri
## [1] "https://maps.gstatic.com/mapfiles/place_api/icons/v2/generic_pinlet"
##
## $result$international_phone_number
## [1] "+52 81 2261 1931"
##
## $result$name
## [1] "Lavanderia Lavaratic"
##
## $result$opening_hours
## $result$opening_hours$open_now
## [1] TRUE
##
## $result$opening_hours$periods
## close.day close.time open.day open.time
## 1 0 2200 0 0800
## 2 1 2200 1 0800
## 3 2 2200 2 0800
## 4 3 2200 3 0800
## 5 4 2200 4 0800
## 6 5 2200 5 0800
## 7 6 2200 6 0800
##
## $result$opening_hours$weekday_text
## [1] "lunes: 8:00–22:00" "martes: 8:00–22:00" "miércoles: 8:00–22:00"
## [4] "jueves: 8:00–22:00" "viernes: 8:00–22:00" "sábado: 8:00–22:00"
## [7] "domingo: 8:00–22:00"
##
##
## $result$photos
## height
## 1 4032
## 2 719
## 3 1440
## 4 1440
## 5 1440
## 6 1440
## 7 466
## 8 1836
## 9 276
## 10 2501
## html_attributions
## 1 <a href="https://maps.google.com/maps/contrib/104944698519373336290">Lavanderia Lavaratic</a>
## 2 <a href="https://maps.google.com/maps/contrib/104944698519373336290">Lavanderia Lavaratic</a>
## 3 <a href="https://maps.google.com/maps/contrib/118111389415377273199">Lavaratic Lavandería</a>
## 4 <a href="https://maps.google.com/maps/contrib/118111389415377273199">Lavaratic Lavandería</a>
## 5 <a href="https://maps.google.com/maps/contrib/118111389415377273199">Lavaratic Lavandería</a>
## 6 <a href="https://maps.google.com/maps/contrib/118111389415377273199">Lavaratic Lavandería</a>
## 7 <a href="https://maps.google.com/maps/contrib/104054494117136070126">Julio César Aguirre</a>
## 8 <a href="https://maps.google.com/maps/contrib/100182628975687921852">Lavanderías Lavaratic</a>
## 9 <a href="https://maps.google.com/maps/contrib/104944698519373336290">Lavanderia Lavaratic</a>
## 10 <a href="https://maps.google.com/maps/contrib/104054494117136070126">Julio César Aguirre</a>
## photo_reference
## 1 AUy1YQ32wvjot6RvtlDoq04181n2fgwfdvMEVT8zX8Rp2L4m65MS5-WlZUHN1Ib655r6HtnEIbc-KhZxnXRPIBxLENmEm6I3m2uCXt4ZvodnzwkYtUObplpLre02Vf0Z3OgrFPhSbW9m_tZ74xAtayFNoj7_ywApVIywLkwTzzNTXugoCD5lraoyyrBIWcv8V3YwfjLpTtEIeq1LkKjEPMgpu2PPWiX6wp27eVIu0jM6wkIzfC2zo6sVRTH1ju-aPcnjAfRGIJmf23LCSA6MkGkqNq_8M-8s-_uqO7R6jTAi4LmSU3ezQHvFba-W63Ps2ejFESqD5bgU4zo
## 2 AUy1YQ1n8_YeP5ddxPz93sSqi03CgeHe1uITr2c-cU9pw_6Nc7IYyBKBbw9B6cqNUlNT-xdAWa5_M5VdpkhVI23PmXjiMjb7NnA8EPRsTVRsLBjeLWlmC_gCI0GOeGWbztdlAooS_BAMfezIPHsFf4YEVZphI3q9rwCqUSZfKYlqpRSUSMXNnJsn4yej4s-cI0hmQOK7Gv9Ni0WQ3yOZpY_kkskQjUc81Zsfzhul1LNxqXmr759uWS7rNxRwADPJFiVWGPjA29HLT88h1QJAngDh0uxAcqeAeKOLKSTIs93YARPnmT0tc7OZ0h5qyZXKxlA31rOlvjgUVr0
## 3 AUy1YQ01ExzjyRCfY5Yn4CfdPOH9aBhlO1nH9UJ9zBofg8sa7guHMOLZSuxt1h5ITi0GI3wHBzxuf88Q7JTYhX-oIwUcfdCsTmxuA_wQr1CbskHMK8J2Kn1RKQ1h7HgmNsdR5FjpmQtKwTblMGOT-qrXSXNOXeBGfVwiJn1zIDfQ7bDP3_yIaVxysCbbAmrrqRLYnahO_cP5LNVpOAdlP6yKglMR1F6JkE5IO_oY41zzs93QHpMQr2q2-tJ8jdTZwNaKk6GLaUIXsd0VwE5Li5NQo0rkCtYzr0FnzzexSSdMYeVDa8z9s0BGzFyXVcLnJq351sIh4T9SJ0Y
## 4 AUy1YQ1khlqJAC2Y_6lJpyIGF2k8kM9wrwNal1ffbr2yRNM36BK_ceLcw1dn-4_LkYpCc1reIGYOZ3qKOHBdfGFSmW9MYigMOXBXKD23g8Y-V0bQvIo3VOu3DFQR202ShmTOfEVJOW4CyTm5dXHZie04_3RBIK-DSXEunuBsEFwSR1Nt7mAAK5wLEFAHtBX2gECiLmZGfDJY1rtsA3u_Hr-yxEKdvXxbEsylf6bqxxOR5bqi9_EOsBMbnMvGapOsfJJ-2rXGJ-RPjRvFoTG4rqD8LSzglnVTUB-Ny5l9_qhxv-tW1T7ZYRYTojcy9Xppbv16CKERD8sUcwg
## 5 AUy1YQ0opWleIHkehXI2UwLGVkFkrN18Q_t5-q-_g7oAlPifeEZfe8LWFOy1PgSSEbg3AEy1ynApBigSC0o9bWgdhmNjmT28SCLD8XTj6K29950qmNBAdIT5_NjyYTT79iTAEZ9N2ypcVgN4_vBuXHR7tkrdIce4AFP2gpFPIIWXI4uZ1jkMqfMA1Kzyc_U7m7mTayElG0KjwYvRw_G2eD0WdU6ATmjjgGVj2FeZmQAE0Zy8fCgyWRSgy7-o-vl81SxxbkXee8xSGE2gKGclv3a2RIsrt3m5Nb3uM4hV65DrIqRJimxDqQ3XLAbVfU23VeAkofBNWaWDHh0
## 6 AUy1YQ08r6NLu5jPUEhE7nEEouCPnFqXk1H8q_mveoB5-1eZuEcEC5jQlx8sSkS_YcVFH3WhSHH1tjaJT_x2maWHp8-WZXbnwBD7fed4YBTNYSBoaEGaHOSR1qiLgMI86aTPl9u5mgmRG-5igRsgL7Hi36M0CJCJH169y-OgOyamlUE017U2PNQnXKBFO8G38KnKlMFZaDB44oBn2zZhjy21cLLBGJgS-VqyfqOMmNawd8jTh4aqjqwwWr5ov-se8IYAcpXLuhx3DNgMmZzABCrhl9ncEfTE5QvqFyzovfBgrJ4VYavog0Ckb_EU9XVZN8errrRqGHTYq_M
## 7 AUy1YQ3U_xRp_ITUNJUX4VAbhnTCFFcn70PrAZVTuv3cT6Yg2H5T1C0fgYQvCOLAElFjk08Pp2aE8G5pkNqBOsavlUMabaUw_UdKG16MScSXAHoVlJ4YHevLi3_Plhu0vz9xbfUmP48hy9XBk1f_rAyDFN1fRA7CQ6Lj4tNk-Md5akGkwxI5nhk6Wu_5Fry3DGf7lXcX97kifkuPtKOFz230Wh6Dp2padNE2zNWt-lSb5baz9SHZ-pwD-_3wpCiQTVoDuDUdqKDhTqtvj2d3ljeFh8BZ0sB1b5CI78em2gtCIsBdQfTyHcQ0PbkXikhXQ-yFGZPPBXfy
## 8 AUy1YQ0IhOHFADGiQkP2JbYUrjlldgBhC6GBRYWmvKnGu21Dvh3_wOnjQK1kvr43Wgz076kmctWfmcANuQu4H4fC7a2ZwGKuRDWnL2d1AphfjkgJFv0wtZCVweILeb39FAtn0GzASeBnxbBIBrqA_QOZtLuyIlDfCe3XfbgGwRI9VH6EUsirVGVcJCmiEmYNhyeZRehqa4HYfWYdlK-l0CYMJaU4b87LjRUsUAP-Je-pMbxM9JMXI2OHpH3J2VDmCM6LVqHZJh23gsqrdP2-GIQBx95pGyBqAwpXa84mK-OxNZ9cYcPD1I6MLT5CiiFO7rQ06pdlAUKD48g
## 9 AUy1YQ0VhaqYxR79ylSBFZjxsr_bvi6i6p23837jw0xQHgDS6YMynYEYVIxZGu6h_SqlcYcCqgaCTdttMHVxI7LMnK4F1OZzOhtYhiCFGupv41kd9lirKmzUZdlrc-RVg1BfN3v39Yb5m0o-NDKWG1u9r9nN7_w1n0RTMpai6dc_7Igx78mDktF1Dj9z4guIDqbHkjH74wjCyKqB3lg72ET0d1NxJaY7CEs8afy7dbcaraoP3nfGmfLu6ols6wNh8s6E5CwqC1ZyoRr9mO0CaAJGPdO4P8bc-1BY9d0kpV3OkMCA0b1Xx_jvQUSPiDU0dwWHaiSi7yma
## 10 AUy1YQ3ft1QBKoDiqBwANgsrxT-XdHiMXB4vLZzwfN-Ni1VqA2AvVZjyuPfMr7yG1hC8qC0YLEsstAWREMe6NNVx1F6XOXf-UuzN3jBl3CtOYoOdYtqFlpkC3C_AngMV5plYaMnf4Ny2HlgQJlpbkwSD2a3R9sA9WvWYMzP4ZfJOku81qKFQQXIpBjJhRiUSB8KustQwaSpu8dXTZbJOzAygnMugm57o_aFJw3lkvFyeGhkekCu_K4oOdPL5f5OACWBm2uTAZwqVISuK7yb3wkIrkdHt-_PoaLzfFhIEl85disCCNRRNkv4VGcQMWzLPcnyD6tHKj7LD8Jk
## width
## 1 3024
## 2 638
## 3 2560
## 4 2560
## 5 2560
## 6 2560
## 7 1218
## 8 3264
## 9 1856
## 10 4096
##
## $result$place_id
## [1] "ChIJDbQuvLyTYoYRI8KmFf1hsIs"
##
## $result$plus_code
## $result$plus_code$compound_code
## [1] "QMWM+J3 Ciudad General Escobedo, N.L., México"
##
## $result$plus_code$global_code
## [1] "75QXQMWM+J3"
##
##
## $result$rating
## [1] 4.3
##
## $result$reference
## [1] "ChIJDbQuvLyTYoYRI8KmFf1hsIs"
##
## $result$reviews
## author_name
## 1 Jorge Aguirre
## 2 Raf Estevez
## 3 Maricela Aguirre Rodriguez
## 4 Lorena Villalobos
## 5 Julieta Segovia
## author_url language
## 1 https://www.google.com/maps/contrib/100429814595129404660/reviews es
## 2 https://www.google.com/maps/contrib/103991640933200221641/reviews es
## 3 https://www.google.com/maps/contrib/102046929592630981627/reviews es
## 4 https://www.google.com/maps/contrib/106043943107592312129/reviews es
## 5 https://www.google.com/maps/contrib/111566945030401963592/reviews es
## original_language
## 1 es
## 2 es
## 3 es
## 4 es
## 5 es
## profile_photo_url
## 1 https://lh3.googleusercontent.com/a/ACg8ocIuhV3Fw13TQLdJNCgyiIX6UkuPyJo5N0g95Me-BvmwkQ77gZ4=s128-c0x00000000-cc-rp-mo
## 2 https://lh3.googleusercontent.com/a-/ALV-UjWcIP528eY17k4naGie9K2Nfj4vhxCrFLJC4SRR0u51PgyRxbDj=s128-c0x00000000-cc-rp-mo-ba3
## 3 https://lh3.googleusercontent.com/a/ACg8ocJhVYMPfi1rPnBqO3fuSan2M6hUZrcoc9FlkORC-MO-oXM5dA=s128-c0x00000000-cc-rp-mo
## 4 https://lh3.googleusercontent.com/a/ACg8ocJT99ThxVJI0sbnWtQKV4tCeIu1DdPFhuXQ6iCbBhJzeD99xg=s128-c0x00000000-cc-rp-mo
## 5 https://lh3.googleusercontent.com/a-/ALV-UjU9dKdVe1G84p7d6t5z8bZuXoYicQ7DgXO8WdO6FCP8LwSZkqxd=s128-c0x00000000-cc-rp-mo-ba4
## rating relative_time_description
## 1 5 Hace 6 años
## 2 2 Hace un año
## 3 5 Hace 6 años
## 4 5 Hace 6 años
## 5 5 Hace 7 años
## text
## 1 Excelente servicio muy buena atención , hasta pude disfrutar de una película mientras lavaba mi ropa.. 100% recomendable volvere cada semana.
## 2 El teléfono de contacto que tienen está fuera de servicio, y tratándose de una lavandería creo que deberían informar en todas las redes sociales en las que se encuentran (como esta) si cuentan o no con servicio a domicilio.
## 3 excelente servicio buen precio y super amable atentos
## 4 Muy buena atencion me trataron super bien,\nVolvere pronto
## 5 Muy buen servicio y casi nunca está llena
## time translated
## 1 1548810594 FALSE
## 2 1700320704 FALSE
## 3 1547433333 FALSE
## 4 1547435897 FALSE
## 5 1490113253 FALSE
##
## $result$types
## [1] "point_of_interest" "laundry" "establishment"
##
## $result$url
## [1] "https://maps.google.com/?cid=10065652906790928931"
##
## $result$user_ratings_total
## [1] 19
##
## $result$utc_offset
## [1] -360
##
## $result$vicinity
## [1] "PLAZA LA AMISTAD, Paseo de la Amistad 302, Monte Real, Ciudad General Escobedo"
##
##
## $status
## [1] "OK"
### Based on the Google reviews, let's generate a vector containing only the text.
### A corpus is a collection of written texts or a body of writing on a particular subject.
### Cleaning text data.
hospitals_reviews_text <- hospitals_reviews$result$reviews$text
hospitals_reviews_doc <- Corpus(VectorSource(hospitals_reviews_text)) ### One corpus document per review; splitting into terms happens later in TermDocumentMatrix()
hospitals_reviews_doc <- hospitals_reviews_doc %>% tm_map(removeNumbers) %>% tm_map(removePunctuation) %>% tm_map(stripWhitespace)
hospitals_reviews_doc <- tm_map(hospitals_reviews_doc, content_transformer(tolower))
hospitals_reviews_doc <- tm_map(hospitals_reviews_doc, removeWords, stopwords("spanish"))
dtm_hospitals_reviews <- TermDocumentMatrix(hospitals_reviews_doc)
matrix <- as.matrix(dtm_hospitals_reviews)
words <- sort(rowSums(matrix),decreasing=TRUE)
words_df <- data.frame(word = names(words),freq=words)
### Word cloud of the reviews for the selected laundromat
### (in the commented wordcloud() call, words with frequency below min.freq are not plotted)
# wordcloud(words = words_df$word, freq = words_df$freq, min.freq = 5, max.words=200, random.order=FALSE, rot.per=0.35, colors=brewer.pal(8, "Dark2"))
### Word clouds are a visual representation of text data where words are arranged in a cluster. The size of each word reflects its frequency or importance in the data set.
wordcloud2(data = words_df, color = "random-dark", size = 0.5, shape = "circle")
### A term-document matrix has one row per term and one column per review; here each cell holds a TF-IDF weight rather than a raw count.
tdm_sparse <- TermDocumentMatrix(hospitals_reviews_doc, control = list(weighting = weightTfIdf))
tdm_m_sparse <- as.matrix(tdm_sparse)
# Sum each term's TF-IDF weight across reviews
term_freq <- rowSums(tdm_m_sparse)
term_freq_sorted <- sort(term_freq, decreasing = TRUE)
tdm_d_sparse <- data.frame(word = names(term_freq_sorted), freq = term_freq_sorted)
# Display the 10 terms with the highest summed weight
head(tdm_d_sparse, 10)
## word freq
## casi casi 0.4643856
## llena llena 0.4643856
## nunca nunca 0.4643856
## buen buen 0.4532325
## super super 0.3776937
## amable amable 0.3317040
## atentos atentos 0.3317040
## precio precio 0.3317040
## atencion atencion 0.3317040
## bien bien 0.3317040
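The weights in the table above can be checked by hand. A minimal sketch, assuming the TF-IDF definition documented for `weightTfIdf` (term frequency divided by document length, multiplied by log2 of the number of documents over the number of documents containing the term), and assuming review 5 keeps five tokens and review 3 keeps seven after cleaning, as the printed values imply:

```r
n_docs <- 5  # reviews returned by the API

# "casi" appears once, only in review 5, which keeps 5 tokens after cleaning
# (buen, servicio, casi, nunca, llena)
tf_casi  <- 1 / 5
idf_casi <- log2(n_docs / 1)
tfidf_casi <- tf_casi * idf_casi

# "buen" appears once in review 3 (7 tokens) and once in review 5 (5 tokens);
# the table reports the sum of its weights over both reviews
tf_buen_total <- 1 / 7 + 1 / 5
idf_buen <- log2(n_docs / 2)
tfidf_buen <- tf_buen_total * idf_buen

round(c(casi = tfidf_casi, buen = tfidf_buen), 7)
```

Both values agree with the printed output (0.4643856 and 0.4532325). This is why rare words from short reviews, such as "casi", "nunca", and "llena", outrank more common words like "buen".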
# We create a word cloud to visualize the highest-weighted words in the Google reviews
# wordcloud(words = tdm_d_sparse$word, freq = tdm_d_sparse$freq, min.freq = 4, max.words = 100, colors = brewer.pal(8, "Dark2"))
wordcloud2(data = tdm_d_sparse, color = "random-dark", size = 0.6, shape = "circle")
# Let's convert the review column to a character vector so we can generate sentiment scores
text <- iconv(hospitals_reviews_text)
# Syuzhet method: scores each text by summing the values of its words found in the syuzhet lexicon. The default syuzhet lexicon is English; the package's multilingual support comes through the NRC lexicon (method = "nrc" with a language argument), so scores for these Spanish reviews reflect only tokens that match English lexicon entries.
syuzhet_vector <- get_sentiment(text, method = "syuzhet") ### get_sentiment() returns one sentiment score per review
# See first row of vector
head(syuzhet_vector)
## [1] 0.00 -0.25 0.75 0.75 0.00
# See summary statistics of vector
summary(syuzhet_vector)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.25 0.00 0.00 0.25 0.75 0.75
# No sampling is applied here: the *_sampled objects below contain all of the reviews
text_sampled <- iconv(hospitals_reviews_text)
syuzhet_vector_sampled <- get_sentiment(text_sampled, method = "syuzhet")
ggplot(data.frame(syuzhet_vector_sampled), aes(x = syuzhet_vector_sampled)) +
geom_histogram(binwidth = 0.1, fill = "blue", color = "black") +
labs(title = "Sentiment Distribution using Syuzhet Method (Sampled Data)", x = "Sentiment Score", y = "Frequency") +
theme_minimal()
nrc_sampled <- get_nrc_sentiment(text_sampled) ### sentiment dictionary to calculate the presence of eight different emotions and their corresponding valence in a text file
nrct_sampled <- data.frame(t(nrc_sampled))
nrcs_sampled <- data.frame(rowSums(nrct_sampled))
nrcs_sampled <- cbind("sentiment" = rownames(nrcs_sampled), nrcs_sampled)
rownames(nrcs_sampled) <- NULL
names(nrcs_sampled)[1] <- "sentiment"
names(nrcs_sampled)[2] <- "frequency"
nrcs_sampled <- nrcs_sampled %>% mutate(percent = frequency/sum(frequency))
nrcs2_sampled <- nrcs_sampled[1:8, ]
colnames(nrcs2_sampled)[1] <- "emotion"
### 1) The bar plot illustrating the distribution of emotions based on sentiment analysis using the NRC lexicon on the google reviews
### 2) Each bar represents a different emotion, and the height of the bar indicates the frequency of that emotion within the text data.
ggplot(nrcs2_sampled, aes(x = reorder(emotion, -frequency), y = frequency, fill = emotion)) +
geom_bar(stat = "identity") +
labs(title = "Emotion Distribution (Sampled Data)", x = "Emotion", y = "Frequency") +
theme_minimal() +
scale_fill_brewer(palette = "Set3")
### 1) The output is a horizontal bar plot of the 10 terms with the highest summed TF-IDF weight in the reviews.
### 2) Each bar represents a word, and its length is that word's TF-IDF weight (not a raw count).
tdm_d_sparse <- tdm_d_sparse[1:10, ]
tdm_d_sparse$word <- reorder(tdm_d_sparse$word, tdm_d_sparse$freq)
ggplot(tdm_d_sparse, aes(x = word, y = freq, fill = word)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Most Popular Words", x = "Word", y = "TF-IDF weight") +
  theme_minimal()
# Let's create a data frame with sentiment and count
sentiment_df <- data.frame(sentiment = c("Positive", "Negative", "Neutral"), count = c(sum(syuzhet_vector_sampled > 0), sum(syuzhet_vector_sampled < 0), sum(syuzhet_vector_sampled == 0)))
# Create a pie chart
### 1) The output is a pie chart illustrating the distribution of sentiment categories within the dataset.
### 2) Each segment of the pie chart represents a sentiment category (“Positive”, “Negative”, “Neutral”), and the size of each segment corresponds to the count of that sentiment category in the dataset.
ggplot(sentiment_df, aes(x = "", y = count, fill = sentiment)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y", start = 0) +
labs(title = "Sentiment Distribution", x = "", y = "") +
theme_minimal() +
scale_fill_brewer(palette = "Set3")
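The positive, negative, and neutral counts behind the pie chart follow directly from the sign of each score. As a small worked check, the sketch below applies that rule to the five scores printed earlier by `head(syuzhet_vector)`; the object names are chosen for this illustration only.

```r
# Scores copied from the head(syuzhet_vector) output shown earlier in this report
scores <- c(0.00, -0.25, 0.75, 0.75, 0.00)

# Label each review by the sign of its score
labels <- ifelse(scores > 0, "Positive",
                 ifelse(scores < 0, "Negative", "Neutral"))

# Count per category, keeping the same category order as sentiment_df
counts <- table(factor(labels, levels = c("Positive", "Negative", "Neutral")))
shares <- round(100 * prop.table(counts))

counts
shares
mean(scores)
```

With only five reviews, each review accounts for 20% of the chart, so the split is very sensitive to any single score. The mean of 0.25 agrees with the `summary()` output above.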
Google Maps Reviews and sentiment analysis
- Comparing the heat maps, the best-rated laundromats are located on the outskirts of Monterrey, while the worst-rated ones are more central.
- "La Lavandería" stands out because two of its branches rank among the best-rated laundromats.
- The reviews are polarized, with more words detected as positive than negative.
- Trust is the most prevalent emotion in the reviews.
- The most repeated words in the reviews include "húmedos" (damp), "caro" (expensive), and "pésimo" (terrible).
- There is a good balance between the number of reviews and their ratings, which suggests that the laundromat results are not biased and can be treated as representative for analysis.
• Which business units have the highest ratings? Where are the best-rated business units located? What are their main locational characteristics, reviews, and customer perceptions?
They are laundromats on the outskirts of the metropolitan area, located in Guadalupe, Apodaca, and southern Monterrey.
• Which business units have the lowest ratings? Where are the lowest-rated business units located? What are their main locational characteristics and customer reviews?
They are laundromats in the central part of the metropolitan area, located in San Pedro and downtown Monterrey.
• What kind of sentiment do the Google Reviews comments for the selected business units express?
Mainly trust, although anger, fear, surprise, and sadness also appear.
The best-rated laundromats are in peripheral areas, while the worst-rated ones are concentrated in the central areas of the Monterrey metropolitan area.
"Trust" is the dominant sentiment, which indicates that customers value the reliability and quality of the service.
The most frequent words in the reviews point to opportunities to improve the service, especially regarding the perception of price and quality.
Laundromats can adopt this approach to benchmark themselves based on the sentiment analysis. From there they can develop strategies to improve their reputation, evaluate new points of sale, and personalize their customer service.
The negative perception in central areas could be related to higher customer volumes, which can affect service quality and the user experience.