Sentiment Analysis of Italian Restaurants in Cincinnati
Here I do a simple sentiment analysis of two Italian restaurants in Cincinnati: Pepp & Delores and Via Vite. These are two of my favorites in the city. I compare their reviews and the sentiment those reviews express about each restaurant. The data consists of reviews of these restaurants collected from OpenTable.
If you would like to do your own work with the data I gathered, you can retrieve it here:
cincinnati_italian.csv
Most Common Words in Reviews
Here I look at the most common words in the reviews. Overall, the most common words are positive: for both restaurants there are many positive words such as excellent, delicious, amazing, and wonderful. Other words, such as service, server, food, and time, could carry either good or bad sentiment. We will look deeper into these in the next section.
# A tibble: 20 × 3
# Groups:   Restaurant.Name [2]
   Restaurant.Name word           n
   <chr>           <chr>      <int>
 1 Via Vite        food        1824
 2 Pepp & Delores  food        1700
 3 Via Vite        service     1200
 4 Pepp & Delores  service     1048
 5 Pepp & Delores  amazing      575
 6 Via Vite        excellent    559
 7 Pepp & Delores  pasta        546
 8 Pepp & Delores  delicious    542
 9 Via Vite        vite         531
10 Pepp & Delores  time         477
11 Pepp & Delores  server       473
12 Via Vite        time         472
13 Via Vite        server       463
14 Via Vite        restaurant   462
15 Via Vite        experience   446
16 Via Vite        delicious    433
17 Via Vite        menu         409
18 Pepp & Delores  experience   402
19 Pepp & Delores  excellent    369
20 Pepp & Delores  wonderful    317
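For readers who want to try a count like this themselves, here is a minimal Python sketch of the idea: tokenize each review, drop stop words, and count the remaining words per restaurant. It is not the code that produced the table above. The stop-word list is a tiny illustrative one, the sample reviews are made up, and the `(restaurant, text)` input shape is an assumption about how the CSV might be loaded.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption); a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "were", "to", "of",
              "our", "we", "i", "it", "but", "very", "for", "in", "with"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(reviews, n=10):
    """Return {restaurant: [(word, count), ...]} with the n most common non-stop words."""
    counts = defaultdict(Counter)
    for restaurant, text in reviews:
        counts[restaurant].update(w for w in tokenize(text) if w not in STOP_WORDS)
    return {name: counter.most_common(n) for name, counter in counts.items()}

# Made-up example reviews, not taken from the dataset.
sample = [
    ("Via Vite", "The food was excellent and the service was excellent."),
    ("Pepp & Delores", "Amazing pasta, amazing service."),
]
result = top_words(sample, n=2)
```

Grouping by restaurant before counting is what lets the same word (food, service) appear once per restaurant in the output.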
How do the ambiguous words above relate to review score?
Here we look at the average review score for the words I classified as ambiguous above. Overall, these words tend to be used in a positive sense: the average overall scores for reviews that include the words server, service, food, and time are all above a rating of 4. I had thought these words might carry a negative connotation, such as bad service or long wait times, but these restaurants seem to perform well in those categories.
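The comparison described here can be sketched in a few lines of Python: for each word, collect the ratings of the reviews that mention it and take the mean. This is only an illustration under assumptions. The `(text, rating)` input shape and the sample reviews are made up, and matching is done on whole words so that "server" does not also match "service".

```python
import re
from statistics import mean

def contains_word(text, word):
    """True if `word` appears as a whole token in the lowercased text."""
    return word in re.findall(r"[a-z']+", text.lower())

def mean_rating_by_word(reviews, words):
    """Map each word to the mean rating of reviews mentioning it (None if none do)."""
    out = {}
    for word in words:
        ratings = [rating for text, rating in reviews if contains_word(text, word)]
        out[word] = mean(ratings) if ratings else None
    return out

# Made-up example reviews with overall ratings, not taken from the dataset.
sample = [
    ("Great food and friendly service", 5),
    ("Food was cold, service was slow", 2),
    ("Our server took her time but the food was lovely", 4),
]
averages = mean_rating_by_word(sample, ["server", "service", "food", "time"])
```

A review that mentions several of the words contributes its rating to each of them, so the per-word averages are not independent of one another.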
What are the most common negative words in reviews?
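From the data, the most common words tend to be positive. But where could these restaurants improve? What are the most used negative words? The NRC lexicon scores words without context, so a few entries in the table below, such as outstanding and lemon, are probably not complaints in restaurant reviews. Setting those aside, some of the most common negative words based on the NRC lexicon are: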
Small: This can most likely be attributed either to the size of the restaurant, which would mean longer wait times, or to small portion sizes.
Wait: I assume this has to do with wait times, most likely waiting for a table or for food.
Noise: This word is common but could reflect an outside factor. The restaurant itself could be noisy, or there may have been construction nearby; it is hard to tell.
Bad: The word bad is commonly attributed to food and service. Most likely these customers did not enjoy either their meal or the service they received.
Cold: Cold can refer either to the temperature of the restaurant itself or to the food. Depending on the time of year and the weather it could be the restaurant temperature, but I believe it leans more towards food being delivered to the customer cold.
Wrong: This most likely has a single explanation: people received the wrong thing at the restaurant, either something other than what they ordered or a meal prepared with the wrong alterations.
# A tibble: 20 × 3
# Groups:   Restaurant.Name [2]
   Restaurant.Name word              n
   <chr>           <chr>         <int>
 1 Pepp & Delores  outstanding     210
 2 Via Vite        outstanding     200
 3 Pepp & Delores  wait            191
 4 Via Vite        wait            121
 5 Via Vite        disappointed    109
 6 Via Vite        small            91
 7 Pepp & Delores  small            66
 8 Pepp & Delores  disappointed     58
 9 Via Vite        bad              57
10 Via Vite        noise            51
11 Via Vite        cold             44
12 Via Vite        limited          34
13 Pepp & Delores  disappoint       33
14 Via Vite        wrong            33
15 Via Vite        disappointing    32
16 Pepp & Delores  lower            30
17 Pepp & Delores  bad              29
18 Pepp & Delores  lemon            28
19 Pepp & Delores  noise            27
20 Pepp & Delores  wrong            26
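A count like the one above can be sketched in Python as a lookup against a word-to-sentiment lexicon. In a lexicon of this kind a single word can carry several sentiment labels, so the sketch first reduces the lexicon to a set of negative words; that way each occurrence in a review is counted once. The `LEXICON` pairs below are illustrative stand-ins and are not copied from the real NRC file, and the sample reviews are made up.

```python
import re
from collections import Counter

# Illustrative (word, sentiment) pairs standing in for a lexicon (an assumption).
LEXICON = [
    ("wait", "negative"), ("wait", "anticipation"),
    ("cold", "negative"),
    ("bad", "negative"), ("bad", "anger"),
    ("excellent", "positive"),
]

def negative_word_counts(reviews, lexicon=LEXICON):
    """Count negative-lexicon words per (restaurant, word) pair."""
    # Keep one entry per word so multi-label words are not counted twice.
    negative = {word for word, sentiment in lexicon if sentiment == "negative"}
    counts = Counter()
    for restaurant, text in reviews:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in negative:
                counts[(restaurant, word)] += 1
    return counts

# Made-up example reviews, not taken from the dataset.
sample = [
    ("Via Vite", "Long wait and the soup was cold. Bad night, excellent dessert."),
    ("Pepp & Delores", "Worth the wait"),
]
neg_counts = negative_word_counts(sample)
```

The second sample review shows the limitation discussed in this section: "worth the wait" is praise, but a word-level lexicon still counts "wait" as negative.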