Summary: This analysis looks at the dieting options on the Olive Garden menu and investigates what improvements could be made at the Olive Garden in Oakley, OH.
Background:
Olive Garden is a national Italian restaurant chain with over 900 locations in the USA. Nutritionix is a website that publishes nutritional information for chain restaurants, including Olive Garden. The data we will be using is the Olive Garden menu nutritional data, which includes amounts like calories, protein, sugars, and trans fat for each menu item. The data frame also includes their catering nutritional information.
The key to this analysis is the nutritional makeup of the menu and its subcategories: the Drink, Catering, and Main menus.
I find this analysis interesting because I like to look into the “healthiness” of fast food chains, as I have lots of friends and family who have either dieted or have dietary restrictions.
By table scraping Nutritionix and using the HTML elements on Yelp, I was able to create a table that includes every menu item, including catering, drinks, and the kids menu. This table is separate from the Yelp table that holds the 149 reviews. We will use these two tables to visualize the analysis of Olive Garden.
The Menu
As a first step, to make the data set easier to filter, I want dichotomous variables that say whether an item is a drink and whether it is labeled gluten free. I also want a categorical variable that says whether the item is on the “Main” menu or is a To Go, Catering, or Kids menu item.
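The labeling step can be sketched in a vectorized form. This is a minimal sketch rather than the exact code behind the report: the item names in `toy_menu` are made up for illustration, and only the keyword rules (Catering, To Go, Kid, Gallon, Side, Gluten) follow the analysis.

```r
library(dplyr)
library(stringr)

# Hypothetical menu items, made up for illustration only
toy_menu <- tibble(
  `Menu Item` = c(
    "Chicken Alfredo",
    "Catering Lasagna Classico",
    "Kids Cheese Pizza",
    "Gluten Free Rotini with Marinara",
    "Side of Broccoli",
    "Gallon of Iced Tea"
  )
)

# Helper: case-insensitive keyword check on the item name
has_word <- function(x, keyword) {
  str_detect(x, regex(keyword, ignore_case = TRUE))
}

toy_menu <- toy_menu %>%
  mutate(
    # TRUE when the item name mentions "gluten"
    gluten_free = has_word(`Menu Item`, "gluten"),
    # The first matching rule wins; anything unmatched stays on the Main menu
    Category = case_when(
      has_word(`Menu Item`, "catering") ~ "Catering",
      has_word(`Menu Item`, "to go")    ~ "To Go",
      has_word(`Menu Item`, "kid")      ~ "Kid",
      has_word(`Menu Item`, "gallon")   ~ "Catering",
      has_word(`Menu Item`, "side")     ~ "Side",
      TRUE                              ~ "Main"
    )
  )

toy_menu
```

Because `case_when()` stops at the first rule that matches, the order of the rules decides how an item with two keywords is labeled.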
The box plot of calories by category shows the calorie counts for the different “Menus”. Sides are treated as their own menu because their smaller plates would skew the main menu. As expected, the Catering menu, whose items serve anywhere from 3 to 8 people, sits highest in both its typical calorie count and its overall distribution.
Main Menu
I want to see how nutritional values are distributed across the main menu. The chart of average nutritional values shows that one of the most concerning aspects of the main menu is the sodium level. I am surprised by how low the average carbs are: I assumed that carbs would be high because many of the menu items are either fried or contain pasta. Further investigation could look at the distribution of carbs across the menu and what is lowering the average.
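One way to start that follow-up is to compare the overall average with the average of the larger items only. The sketch below uses made-up carb values in a hypothetical `toy_main` table, and the 20 g cutoff is an arbitrary assumption chosen for illustration.

```r
library(dplyr)

# Hypothetical carb values in grams, made up for illustration only
toy_main <- tibble(
  item = c("Pasta A", "Pasta B", "Pasta C", "Sauce", "Dressing", "Topping"),
  total_carbs = c(100, 90, 110, 10, 5, 3)
)

carb_summary <- toy_main %>%
  summarise(
    mean_all        = mean(total_carbs),                    # average over every item
    share_under_20g = mean(total_carbs < 20),               # fraction of small, low-carb items
    mean_over_20g   = mean(total_carbs[total_carbs >= 20])  # average once those items are set aside
  )

carb_summary
```

If a large share of items falls under the cutoff, the gap between the two means shows how much sauces, toppings, and similar add-ons pull the average down.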
Drinks
Olive Garden offers 5 zero-calorie drinks: coffee, tea, Diet Coke, Coke Zero, and water. The rest of the drink options include cocktails, wine, sodas, and juices. With so many drinks, it is important to know how calories are distributed across the drink menu, especially if you are trying to follow a low-calorie diet.
Noodles
Of course, Olive Garden is known for its pasta, and the noodle chart shows how often each type of noodle appears in item names on each menu. Spaghetti is the most frequently appearing pasta shape, while tortelloni is the least.
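The noodle counts come from searching each item name for a noodle name. A compact way to sketch that idea is a single regular expression built from the noodle list. The `toy_menu` rows are illustrative rather than taken from the data set, and `str_extract()` is swapped in here for the report's `case_when()` approach.

```r
library(dplyr)
library(stringr)

noodles <- c("Ziti", "Spaghetti", "Tortelloni", "Lasagna",
             "Ravioli", "Gnocchi", "Fettuccine")

# One pattern that matches any of the noodle names
noodle_pattern <- str_c(noodles, collapse = "|")

# Illustrative rows, not taken from the scraped data
toy_menu <- tibble(
  `Menu Item` = c("Spaghetti with Meat Sauce", "Kids Spaghetti",
                  "Five Cheese Ziti al Forno", "Chicken Parmigiana"),
  Category = c("Main", "Kid", "Main", "Main")
)

noodle_counts <- toy_menu %>%
  mutate(noodle = str_extract(`Menu Item`, noodle_pattern)) %>%  # NA when no noodle is named
  filter(!is.na(noodle)) %>%
  count(Category, noodle)

noodle_counts
```

One difference worth knowing: when an item names two noodles, `str_extract()` keeps the one that appears first in the item name, while `case_when()` keeps the one listed first in the code.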
Catering
The last menu to look at is the catering menu. This menu contains only bulk foods and drinks, and after seeing the distribution of the main menu I was curious to see the sodium and calorie amounts for these items. There is one outlier, the “Create your own Pasta Station”, which is meant for 10 people. The rest of the catering items vary with the number of people they feed.
Reviews
While the menu of every Olive Garden is the same, it is important to look at specific locations to see how the restaurant chain is performing. These are 149 Yelp reviews for the Olive Garden in Oakley, OH.
Pairing up Words
It is important to look at which combinations of words are used to describe this location. Pairing up the words makes it easier to interpret the important parts of the reviews. The word web shows that the connection between “food” and “service” is mentioned often in the reviews. Other frequently mentioned words are salad, table, time, server, and experience. The web only includes word pairs that appear together in at least 20 reviews, so these combinations point to what customers remember. However, the web does not show us the sentiment of the reviews.
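The report builds its pairs with `pairwise_count()` from widyr. To show what such a pair count means, here is a minimal sketch of the same idea using a plain dplyr self-join on a made-up `toy_tokens` table. It is an illustration of the concept, not a replacement for the widyr call.

```r
library(dplyr)

# Hypothetical tokenized reviews: one row per (review, word)
toy_tokens <- tibble(
  review_id = c(1, 1, 1, 2, 2, 3, 3, 3),
  word = c("food", "service", "salad",
           "food", "service",
           "food", "salad", "table")
)

# Count each word at most once per review
distinct_tokens <- toy_tokens %>% distinct(review_id, word)

word_pairs <- distinct_tokens %>%
  # Pair every word with every other word from the same review
  inner_join(distinct_tokens, by = "review_id", relationship = "many-to-many") %>%
  # Keep each unordered pair once and drop self-pairs
  filter(word.x < word.y) %>%
  # n = number of reviews in which both words appear
  count(item1 = word.x, item2 = word.y, sort = TRUE)

word_pairs
```

Filtering this table to pairs with a large enough `n` gives the edge list that a network plot is drawn from.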
Sentiment Cloud
Using the AFINN word list, we can find which sentiment-bearing words are mentioned and how often. The sentiments surrounding this Olive Garden location are mixed. The largest words (bad, nice, pretty, love, and recommend) give a conflicting picture of the location; customers could be saying that they do or do not recommend it. While this cloud shows the overall sentiment of the location, we can categorize the sentiment to form a more certain opinion.
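A word cloud shows frequency only. One possible way to make the mixed picture more concrete is to weight each word's count by its sentiment score. The sketch below assumes a tiny hand-made lexicon (`toy_lexicon`) with placeholder scores on an AFINN-style scale from -5 to 5, and made-up counts (`toy_counts`); neither is taken from the reviews or from the real AFINN list.

```r
library(dplyr)

# Placeholder scores on an AFINN-style scale, for illustration only
toy_lexicon <- tibble(
  word  = c("bad", "nice", "love", "recommend"),
  value = c(-3, 3, 3, 2)
)

# Made-up word counts; "table" has no sentiment score
toy_counts <- tibble(
  word = c("bad", "nice", "love", "recommend", "table"),
  n    = c(12, 10, 8, 6, 20)
)

word_scores <- toy_counts %>%
  inner_join(toy_lexicon, by = "word") %>%   # drops words without a score
  mutate(contribution = n * value) %>%       # frequency weighted by sentiment
  arrange(desc(abs(contribution)))

# Positive total: positive words outweigh negative ones in this toy data
net_score <- sum(word_scores$contribution)

word_scores
net_score
```

A single word score still ignores context such as "do not recommend", so this kind of total is a rough signal rather than a verdict.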
Opinion
The sentiment chart shows that there are more positive sentiments in the reviews than any other category. The reviews also include sentiments of joy, anticipation, and trust, which could be due to the consistency that comes with being a chain restaurant.
Concluding Thoughts
There is a sense of consistency and comfort found in an Olive Garden. The analysis showed that in one restaurant you can choose from many menus, and each one can be analyzed for its own nutritional makeup. Olive Garden is not meant for someone looking for a low-calorie or low-sodium diet; however, there are many options that do not include pasta, and there is a wide variety of drink options. Looking at the reviews, we can see an overall positive feel for the restaurant; however, specific aspects like service and wait times may be focus points for further improvement.
Improvements for Next Analysis
Next time I would group the additions, sides, and sauces. The menu lists each of these as its own menu item, which could be why the averages are lower than expected. If the “Create your own pasta” options were laid out more clearly on the menu, the noodle analysis would have been easier. I also could not classify “pasta” as a noodle: there is an item named Pasta e Fagioli, but its noodle is ditalini, and that name does not appear in the menu item’s name.
Source Code
---
title: "Analyzing Olive Garden with R"
author: "Erin McLaughlin"
editor: visual
toc: true # Generates an automatic table of contents.
format: # Options related to formatting.
  html: # Options related to HTML output.
    code-tools: TRUE # Allow the code tools option showing in the output.
    embed-resources: TRUE # Embeds all components into a single HTML file.
execute: # Options related to the execution of code chunks.
  warning: FALSE # FALSE: Code chunk warnings are hidden by default.
  message: FALSE # FALSE: Code chunk messages are hidden by default.
  echo: FALSE # FALSE: Code is hidden in the output by default.
---

##### **Summary**: *This analysis looks at the dieting options on the Olive Garden menu and investigates what improvements could be made at the Olive Garden in Oakley, OH.*

### **Background:**

Olive Garden is a national Italian restaurant chain with over 900 locations in the USA. Nutritionix is a website that publishes nutritional information for chain restaurants, including Olive Garden. The data we will be using is the Olive Garden menu nutritional data, which includes amounts like calories, protein, sugars, and trans fat for each menu item. The data frame also includes their catering nutritional information.

The key to this analysis is the nutritional makeup of the menu and its subcategories: the Drink, Catering, and Main menus.

I find this analysis interesting because I like to look into the "healthiness" of fast food chains, as I have lots of friends and family who have either dieted or have dietary restrictions.

###### Packages required: tidyverse, dplyr, ggplot2, stringr, lubridate, tidytext, ggraph, ggwordcloud, widyr, igraph

```{r}
#| label: Setting Up Your Data
#| include: FALSE
library(tidyverse)
library(dplyr)
library(ggplot2)
library(knitr)
library(stringr)
library(lubridate)   # Easily fixing pesky dates
library(tidytext)    # Tidy text mining
library(textdata)    # Lexicons of sentiment data
library(ggraph)
library(ggwordcloud)
library(widyr)
library(igraph)      # Special graphs for network analysis
```

### **Data in Use**

By table scraping Nutritionix and using the HTML elements on Yelp, I was able to create a table that includes every menu item, including catering, drinks, and the kids menu. This table is separate from the Yelp table that holds the 149 reviews. We will use these two tables to visualize the analysis of Olive Garden.

```{r}
#| label: Labeling Data Frames
#| include: FALSE
menu_items <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/mclaughline3_xavier_edu/ESKwBxcp1yBDp1XSfxteXnYBrRaYEonKdY82ZuIdjCTP0A?download=1")
reviews <- read_csv("https://www.dropbox.com/scl/fi/pyf76pr6jtp1i0qqe4v6x/erin_italian_restaurants.csv?rlkey=3ygt0ndiyznhtgnsqnf7xeo61&dl=1")
```

### **The Menu**

As a first step, to make the data set easier to filter, I want dichotomous variables that say whether an item is a drink and whether it is labeled gluten free. I also want a categorical variable that says whether the item is on the "Main" menu or is a To Go, Catering, or Kids menu item.

```{r}
#| label: categorizing the data
#| include: FALSE
# Make the drinks column (drinks are identified by their row position in the table)
menu_items <- menu_items %>%
  mutate(is_drink = ifelse(
    (row_number() >= 142 & row_number() <= 196) |
      (row_number() >= 256 & row_number() <= 264),
    TRUE, FALSE
  ))

# Initialize the 'gluten_free' column with FALSE as the default value
menu_items$gluten_free <- FALSE

# Loop through the rows and check for the word "gluten"
for (i in 1:nrow(menu_items)) {
  if (grepl("Gluten", menu_items$`Menu Item`[i], ignore.case = TRUE)) {
    menu_items$gluten_free[i] <- TRUE
  }
}

# Initialize the 'Category' column with "Main" as the default value
menu_items$Category <- "Main"

# Loop through the rows and check for the words
# "Catering", "To Go", "Kid", "Gallon", or "Side"
for (i in 1:nrow(menu_items)) {
  if (grepl("Catering", menu_items$`Menu Item`[i], ignore.case = TRUE)) {
    menu_items$Category[i] <- "Catering"
  } else if (grepl("To Go", menu_items$`Menu Item`[i], ignore.case = TRUE)) {
    menu_items$Category[i] <- "To Go"
  } else if (grepl("Kid", menu_items$`Menu Item`[i], ignore.case = TRUE)) {
    menu_items$Category[i] <- "Kid"
  } else if (grepl("Gallon", menu_items$`Menu Item`[i], ignore.case = TRUE)) {
    menu_items$Category[i] <- "Catering"
  } else if (grepl("Side", menu_items$`Menu Item`[i], ignore.case = TRUE)) {
    menu_items$Category[i] <- "Side"
  }
}
```

```{r}
#| label: visualize menu calories
#| include: TRUE
menu_items %>%
  ggplot(aes(x = Category, y = Calories, fill = Category)) +
  geom_boxplot() +
  labs(title = "Distribution of Calories by Category",
       x = "Category",
       y = "Calories") +
  ylim(0, 8000) +
  scale_fill_brewer(palette = "Set3")
```

The box plot of calories by category shows the calorie counts for the different "Menus". Sides are treated as their own menu because their smaller plates would skew the main menu. As expected, the Catering menu, whose items serve anywhere from 3 to 8 people, sits highest in both its typical calorie count and its overall distribution.

#### **Main Menu**

I want to see how nutritional values are distributed across the main menu. The chart of average nutritional values shows that one of the most concerning aspects of the main menu is the sodium level. I am surprised by how low the average carbs are: I assumed that carbs would be high because many of the menu items are either fried or contain pasta. Further investigation could look at the distribution of carbs across the menu and what is lowering the average.

```{r}
#| label: Main Menu
#| include: TRUE
# Average each nutrient over the non-drink items on the main menu
main_num <- menu_items %>%
  filter(Category == "Main" & is_drink == FALSE) %>%
  summarise(
    `Sodium` = mean(Sodium, na.rm = TRUE),
    `Calories` = mean(Calories, na.rm = TRUE),
    `Calories From Fat` = mean(`Calories from Fat`, na.rm = TRUE),
    `Total Fat` = mean(`Total Fat`, na.rm = TRUE),
    `Saturated Fat` = mean(`Saturated Fat`, na.rm = TRUE),
    `Trans Fat` = mean(`Trans Fat`, na.rm = TRUE),
    `Cholesterol` = mean(Cholesterol, na.rm = TRUE),
    `Total Carbs` = mean(`Total Carbs`, na.rm = TRUE),
    `Fiber` = mean(`Dietary Fiber`, na.rm = TRUE),
    `Sugar` = mean(Sugars, na.rm = TRUE),
    `Protein` = mean(Protein, na.rm = TRUE)
  ) %>%
  arrange(desc(Sodium))

# Gather the summary data into a long format for ggplot
main_num_done <- main_num %>%
  tidyr::gather(key = "Nutrient", value = "Average",
                `Sodium`, `Calories`, `Calories From Fat`, `Total Fat`,
                `Saturated Fat`, `Trans Fat`, `Cholesterol`, `Total Carbs`,
                `Fiber`, `Sugar`, `Protein`)

# Create the bar plot
main_num_done %>%
  ggplot(aes(x = Nutrient, y = Average, fill = Nutrient)) +
  geom_bar(stat = "identity") +
  labs(title = "Averages of Nutritional Values in the Main Menu",
       x = "Nutrient",
       y = "Average Value") +
  theme_minimal() +
  scale_fill_brewer(palette = "Paired") +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate x-axis labels by 45 degrees
  )
```

#### **Drinks**

Olive Garden offers 5 zero-calorie drinks: coffee, tea, Diet Coke, Coke Zero, and water. The rest of the drink options include cocktails, wine, sodas, and juices. With so many drinks, it is important to know how calories are distributed across the drink menu, especially if you are trying to follow a low-calorie diet.

```{r}
#| label: drinks
#| include: TRUE
menu_items %>%
  filter(Category == "Main", is_drink == TRUE) %>%
  ggplot(aes(x = Calories)) +
  geom_histogram(fill = "#80b1d3", col = "#ffffb3") + # light blue bars with a pale yellow outline
  labs(title = "Calorie Distribution of Drinks",
       x = "Calories",
       y = "How Many Drinks have this Calorie Amount?")
```

#### **Noodles**

Of course, Olive Garden is known for its pasta, and the noodle chart shows how often each type of noodle appears in item names on each menu. Spaghetti is the most frequently appearing pasta shape, while tortelloni is the least.

```{r}
#| label: noodles
#| include: TRUE
noodles <- c("Ziti", "Spaghetti", "Tortelloni", "Lasagna", "Ravioli", "Gnocchi", "Fettuccine")
menu_items$noodle <- "NA"

# Label each item with the first noodle name found in its menu item name
menu_items <- menu_items %>%
  mutate(noodle = case_when(
    str_detect(`Menu Item`, "Ziti") ~ "Ziti",
    str_detect(`Menu Item`, "Spaghetti") ~ "Spaghetti",
    str_detect(`Menu Item`, "Tortelloni") ~ "Tortelloni",
    str_detect(`Menu Item`, "Lasagna") ~ "Lasagna",
    str_detect(`Menu Item`, "Ravioli") ~ "Ravioli",
    str_detect(`Menu Item`, "Gnocchi") ~ "Gnocchi",
    str_detect(`Menu Item`, "Fettuccine") ~ "Fettuccine",
    TRUE ~ "NA" # If no match, assign the string "NA"
  ))

menu_items %>%
  filter(noodle != "NA") %>%                         # Exclude rows where noodle is "NA"
  group_by(Category, noodle) %>%                     # Group by menu category and noodle type
  summarise(n = n()) %>%                             # Count items per category and noodle type
  ggplot(aes(x = noodle, y = n, fill = Category)) +  # Fill by menu category
  geom_bar(stat = "identity") +                      # Use stat = "identity" to plot counts
  labs(title = "Distribution of Noodles",
       x = "Type of Noodle",
       y = "Count") +
  scale_fill_brewer(palette = "Paired") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate x-axis labels by 45 degrees
  )
```

#### **Catering**

The last menu to look at is the catering menu. This menu contains only bulk foods and drinks, and after seeing the distribution of the main menu I was curious to see the sodium and calorie amounts for these items. There is one outlier, the "Create your own Pasta Station", which is meant for 10 people. The rest of the catering items vary with the number of people they feed.

```{r}
#| label: Catering
#| include: TRUE
menu_items %>%
  filter(Category == "Catering") %>%
  ggplot(aes(x = Calories, y = Sodium)) +   # Set the aesthetics for x and y
  geom_point(aes(size = Calories),          # Scale point size by Calories
             alpha = 0.5,                   # Fixed transparency for every point
             color = "#8dd3c7") +           # Set the point color
  labs(title = "Calories vs Sodium for Catering Menu Items",
       x = "Calories",
       y = "Sodium") +
  theme_minimal() +
  scale_size_continuous(range = c(2, 8))    # Adjust the size range of the points
```

## **Reviews**

While the menu of every Olive Garden is the same, it is important to look at specific locations to see how the restaurant chain is performing. These are 149 Yelp reviews for the Olive Garden in Oakley, OH.

```{r}
#| label: review cleaning
#| include: FALSE
reviews <- reviews %>%
  filter(restaurant == "olive_garden")

reviews$date <- mdy(reviews$review_date)

# Give the reviews an ID, ordered by date
reviews <- reviews %>%
  arrange(date) %>%
  mutate(review_id = row_number())
```

```{r}
#| label: positive words
#| include: TRUE
# Split the reviews into one word per row and drop common stop words
tidy_reviews <- reviews %>%
  unnest_tokens(word, review_content) %>%
  anti_join(stop_words, by = "word")

afinn <- get_sentiments("afinn")
```

#### **Pairing up Words**

It is important to look at which combinations of words are used to describe this location. Pairing up the words makes it easier to interpret the important parts of the reviews. The word web shows that the connection between "food" and "service" is mentioned often in the reviews. Other frequently mentioned words are salad, table, time, server, and experience. The web only includes word pairs that appear together in at least 20 reviews, so these combinations point to what customers remember. However, the web does not show us the sentiment of the reviews.

```{r}
#| label: bing
#| include: TRUE
bing <- get_sentiments("bing")

review_word_pairs <- tidy_reviews %>%
  group_by(restaurant) %>%
  pairwise_count(item = word,          # The token vector to count pairs of
                 feature = review_id,  # The document vector within which to count
                 upper = FALSE) %>%    # Keep each pair once (no mirrored duplicates)
  arrange(-n)

set.seed(1234)
review_word_pairs %>%
  filter(item1 != "olive", item2 != "olive",
         item1 != "garden", item2 != "garden") %>%
  ungroup() %>%
  select(!restaurant) %>%
  filter(n >= 20) %>%
  graph_from_data_frame() %>%
  ggraph(layout = "fr") + # "fr" is a force-directed network layout
  geom_edge_link(aes(edge_alpha = n, edge_width = n), edge_colour = "#33a02c") +
  geom_node_point(size = 3) +
  geom_node_text(aes(label = name), repel = TRUE,
                 point.padding = unit(0.2, "lines")) +
  theme_void()
```

#### **Sentiment Cloud**

Using the AFINN word list, we can find which sentiment-bearing words are mentioned and how often. The sentiments surrounding this Olive Garden location are mixed. The largest words (bad, nice, pretty, love, and recommend) give a conflicting picture of the location; customers could be saying that they do or do not recommend it. While this cloud shows the overall sentiment of the location, we can categorize the sentiment to form a more certain opinion.

```{r}
#| label: afinn
#| include: TRUE
afinn <- get_sentiments("afinn")

# Count each word and keep only the words that appear in the AFINN list
review_counts <- tidy_reviews %>%
  group_by(word) %>%
  summarise(n = n()) %>%
  inner_join(afinn, by = "word")

review_counts %>%
  filter(n >= 10) %>%
  ggplot(aes(label = word, size = n, color = n)) + # Size and color by frequency 'n'
  geom_text_wordcloud() +
  scale_color_viridis_c() +                        # Color scale for words, based on frequency
  theme_minimal()
```

#### **Opinion**

The sentiment chart shows that there are more positive sentiments in the reviews than any other category. The reviews also include sentiments of joy, anticipation, and trust, which could be due to the consistency that comes with being a chain restaurant.

```{r}
#| label: nrc
#| include: TRUE
nrc <- get_sentiments("nrc")

tidy_reviews %>%
  inner_join(nrc, by = "word", relationship = "many-to-many") %>%
  group_by(sentiment) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = sentiment, y = n, fill = sentiment)) +
  scale_fill_brewer(palette = "Paired") +
  geom_bar(stat = "identity") +
  labs(title = "Olive Garden Sentiment Scores",
       subtitle = "Total number of emotive words scored",
       y = "Total Number of Words",
       x = "Emotional Sentiment",
       fill = "Reviews Say")
```

## **Concluding Thoughts**

There is a sense of consistency and comfort found in an Olive Garden. The analysis showed that in one restaurant you can choose from many menus, and each one can be analyzed for its own nutritional makeup. Olive Garden is not meant for someone looking for a low-calorie or low-sodium diet; however, there are many options that do not include pasta, and there is a wide variety of drink options. Looking at the reviews, we can see an overall positive feel for the restaurant; however, specific aspects like service and wait times may be focus points for further improvement.

###### **Improvements for Next Analysis**

Next time I would group the additions, sides, and sauces. The menu lists each of these as its own menu item, which could be why the averages are lower than expected. If the "Create your own pasta" options were laid out more clearly on the menu, the noodle analysis would have been easier. I also could not classify "pasta" as a noodle: there is an item named Pasta e Fagioli, but its noodle is ditalini, and that name does not appear in the menu item's name.