Dataset

This dataset, courtesy of Kaggle, is a directory of food places registered with Zomato in Bangalore, Karnataka. The data includes their names, contact information, address and location, cuisines served, type of place, rating, number of votes and approximate cost, among other fields. The dataset used for these visualisations is a subset of this much larger dataset: only highly rated places (4.5 to 5 out of 5) were taken into consideration. For the sake of these visualisations, we will assume that the number of votes is indicative of the number of people eating at a certain restaurant.

library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(stringr)

# reading main file
z <- read.csv("Mansi_Zomato_Blr_Dataset_18Dec2020.csv")

# limiting the data to relevant fields
zomato <- select(z, name, online_order, 
                 book_table, rate, votes, 
                 location,
                 rest_type, dish_liked, cuisines,
                 approx_cost.for.two.people., 
                 listed_in.city.)

# renaming column names
c_names <- c("name", "online_order", "book_table", "rate",
             "votes", "location", "rest_type",
             "dish_liked", "cuisines", 
             "approx_cost_for_two", "listed_in_city")

colnames(zomato) <- c_names

# limiting the data set further to only highly rated restaurants, i.e, 
# between 4.5 and 5
high_rate <- filter(zomato,
                    rate>=4.5,
                    rate<=5)

z_new <- high_rate

zomato_limited <- read.csv("Mansi_Zomato_Limited_Dataset_18Dec2020.csv")

All over the place

Of some 1250 records of restaurants in different locations, only 163 are unique; the rest are outlets of restaurant chains sprouting up all over Bangalore. The following chart maps the distribution of these restaurants over Bangalore, split by whether they offer online ordering. Koramangala is clearly the most restaurant-populated area in Bangalore.
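The unique-versus-total count quoted above can be checked with a few lines of base R. This is only a minimal sketch on a made-up data frame: `toy` and its values are hypothetical, and treating any name that appears more than once as a chain is an assumption. With the real data one would use `zomato_limited` in place of `toy`.

```r
# Toy stand-in for zomato_limited; all values are invented for illustration.
toy <- data.frame(
  name = c("Cafe A", "Cafe A", "Bistro B", "Cafe A", "Diner C", "Bistro B"),
  listed_in_city = c("Koramangala", "Indiranagar", "Koramangala",
                     "BTM", "BTM", "Indiranagar"),
  stringsAsFactors = FALSE
)

# Total listings vs. distinct restaurant names
n_records <- nrow(toy)
n_unique  <- length(unique(toy$name))

# Assumption: a name listed more than once is a chain
listings_per_name <- table(toy$name)
chains <- names(listings_per_name[listings_per_name > 1])

cat("records:", n_records, "unique names:", n_unique, "\n")
print(chains)
```

The same counts on the full data would show how much of the listing is made up of repeated names.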

# Graph 1
# Distribution of Restaurants over Bangalore

temp2 <- table(zomato_limited$listed_in_city, zomato_limited$online_order)

# converting table to data frame
temp2.table.df <- as.data.frame(temp2)

c_names <- c("city", "online_order", "frequency")
colnames(temp2.table.df) <- c_names
colnames(temp2.table.df)
## [1] "city"         "online_order" "frequency"
# plotting graph
g1 <- ggplot(data=temp2.table.df, aes(x=frequency, y=city))
g1 + geom_bar(stat="identity", fill=rgb(0.1,0.4,0.5,0.7)) + 
  facet_wrap( ~ online_order) + xlab("No. of Restaurants") + ylab("Location") +
  labs(title = "Distribution of High Rated Restaurants in Bangalore", 
       subtitle = "On the Basis of the Provision to Order Online")

What’s in a rate?

Whimsy is a splendid thing, and it would be unfair to believe that the quality of food and service alone determines the ratings food places receive. These are a few attempts at understanding whether, and how, different elements affect these ratings.

Too expensive?

The following chart plots the rating of every highly rated restaurant against its cost, categorised by whether the restaurant allows table booking and online ordering. The really expensive restaurants do not seem to favour online ordering, and the low-priced restaurants tend not to allow table booking. Most restaurants have a rating of either 4.5 or 4.6.
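A claim such as "most restaurants are rated 4.5 or 4.6" can be read off a simple frequency table before plotting anything. The sketch below uses invented ratings in a hypothetical `toy` data frame; it is not the author's data, and the real check would run on `zomato_limited$rate`.

```r
# Invented ratings, for illustration only
toy <- data.frame(rate = c(4.5, 4.6, 4.5, 4.9, 4.6, 4.5, 4.7))

# Count of restaurants per rating, and the share each rating holds
rate_counts <- table(toy$rate)
rate_share  <- round(prop.table(rate_counts), 2)

# The two most common ratings
top_two <- names(sort(rate_counts, decreasing = TRUE))[1:2]

print(rate_counts)
print(rate_share)
print(top_two)
```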

# Graph 2
# Cost vs. Rate

g2 <- ggplot(zomato_limited, aes(x=rate, y=approx_cost_for_two, 
                                 color= as.factor(book_table)))
g2 + geom_point(alpha = 0.2) + facet_wrap(~ online_order) + 
  scale_color_brewer(palette = "Set2") +
  xlab("Rate") + ylab("Approximate Cost for Two") +
  labs(title = "Restaurant Expense vs. Rating", 
       subtitle = "On the Basis of the Provision to Order Online", 
       color = "Allows Booking Tables")

Where’s the crowd at?

This visualisation shows how diners, measured by votes, are distributed over restaurants of varying cost, with one panel per rating. Most restaurants are rated 4.5 or 4.6, as established earlier. Few people are eating at the very expensive restaurants.
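One way to back the scatter plot with numbers is to bin the cost into bands and compare the average votes per band. This is a rough sketch under stated assumptions: the `toy` data is invented, and the break points (500 and 1500) and band labels are arbitrary choices made for illustration, not values from the dataset.

```r
# Invented cost and vote figures, for illustration only
toy <- data.frame(
  approx_cost_for_two = c(300, 450, 800, 1200, 2500, 3000),
  votes               = c(900, 1500, 2000, 1100, 150, 200)
)

# Hypothetical cost bands; intervals are (0,500], (500,1500], (1500,Inf]
toy$cost_band <- cut(toy$approx_cost_for_two,
                     breaks = c(0, 500, 1500, Inf),
                     labels = c("low", "mid", "high"))

# Average number of votes in each cost band
votes_by_band <- aggregate(votes ~ cost_band, data = toy, FUN = mean)
print(votes_by_band)
```

If the "high" band shows a much lower average than the others on the real data, that would support the reading of the chart given above.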

# Graph 3
# Votes vs. Cost

g3 <- ggplot(zomato_limited, aes(x=votes, y=approx_cost_for_two, 
                                 color= as.factor(rate)))
g3 + geom_point(alpha = 0.2) + facet_wrap(~ rate)+
  scale_color_brewer(palette = "Set2") + xlab("No. of Votes") + 
  ylab("Approx. Cost for Two People") + 
  labs(title = "Restaurants People are Eating at and their Costs", 
       subtitle = "On the Basis of their Rating") + 
  theme(legend.position="none")

Wine and Dine?

These charts show the average rating of each kind of food place against its average cost and the average number of people eating there. Fine dining places are the costliest, no surprise there. They aren’t all that popular with people, though. Casual dining places, on the other hand, seem to be the preference, and they are moderately priced. Drinks and bars are everyone’s favourites, of course.

# Graph 4
# Restaurant vs. Cost

# one row per restaurant type: mean cost, mean rating and mean votes
z_smry1 <- zomato_limited %>% 
  group_by(rest_type) %>% 
  summarise(avg_cost = mean(approx_cost_for_two), avg_rate = mean(rate), 
            avg_votes = mean(votes))
## `summarise()` ungrouping output (override with `.groups` argument)
z_smry1.df <- as.data.frame(z_smry1)

# plotting graph
g4 <- ggplot(z_smry1.df, aes(x=avg_cost, y=rest_type, 
                             fill= as.factor(round(avg_rate, 1))))
g4 + geom_bar(stat="identity") + xlab("Average Cost") + 
  ylab("Type of Restaurant") + 
  labs(title = "Average Cost of Different Kinds of Restaurants", 
       fill = "Average Rating") +
  scale_fill_brewer(palette = "Set2")

g6 <- ggplot(z_smry1.df, aes(x=avg_votes, y=rest_type, 
                             fill= as.factor(round(avg_rate, 1))))
g6 + geom_bar(stat="identity") + xlab("Average No. of People Eating") + 
  ylab("Type of Restaurant") + 
  labs(title = "Average No. of People Eating at Different Kinds of Restaurants", 
       fill = "Average Rating") +
  scale_fill_brewer(palette = "Set2")

“I will eat you”

From a list of all the available cuisines, the most widely available ones are plotted against the average number of people who eat them, as a measure of their popularity. Yes, burgers are a whole cuisine. And the favourite. Desserts are rated the highest; they’re closest to the heart.
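The analysis here groups on the `cuisines` field as a whole. If that field were to hold several cuisines in one comma-separated string (an assumption, not something stated above), the entries could be split before averaging. The sketch below shows the idea in base R on an invented `toy` data frame; the names `parts`, `long` and `avg_votes` are hypothetical.

```r
# Invented data; assumes cuisines may be comma-separated in one field
toy <- data.frame(
  cuisines = c("Burger, Fast Food", "Desserts", "Burger", "Desserts, Bakery"),
  votes    = c(100, 50, 200, 30),
  stringsAsFactors = FALSE
)

# Split each entry on commas (dropping the whitespace after them)
parts <- strsplit(toy$cuisines, ",\\s*")

# One row per (restaurant, cuisine) pair, repeating the votes as needed
long <- data.frame(
  cuisine = unlist(parts),
  votes   = rep(toy$votes, lengths(parts)),
  stringsAsFactors = FALSE
)

# Average votes per individual cuisine
avg_votes <- aggregate(votes ~ cuisine, data = long, FUN = mean)
print(avg_votes)
```

Splitting this way counts a restaurant once for every cuisine it lists, so the averages are not directly comparable with those computed on the unsplit field.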

# Graph 5
# Number of People Eating the Most Widely Available Cuisines

z_smry_cuisine <- zomato_limited %>% 
  group_by(cuisines) %>% 
  summarise(avg_cost = mean(approx_cost_for_two), 
            avg_rate = mean(rate), avg_people = mean(votes))
## `summarise()` ungrouping output (override with `.groups` argument)
temp <- table(zomato_limited$cuisines)
temp.df <- as.data.frame(temp)

# keeping only cuisines served by at least 20 listings
# (as.data.frame() on a table names the count column "Freq")
top_cuisine <- filter(temp.df, Freq >= 20)

name <- c("cuisines", "freq")
colnames(top_cuisine) <- name 

# combining two data sets 
cuisine <- inner_join(top_cuisine, z_smry_cuisine)
## Joining, by = "cuisines"
# plotting graph
g5 <- ggplot(cuisine, aes(x=avg_people, y=cuisines, 
                          fill = as.factor(round(avg_rate, 1))))
g5 + geom_bar(stat="identity") + xlab("Average No. of People Eating") + 
  ylab("Most Widely Available Cuisines") + 
  labs(title = "Number of People Eating the Most Widely Available Cuisines", 
       fill = "Average Rating") +
  scale_fill_brewer(palette = "Set2")