This assignment presents several visualizations comparing restaurants that combine low prices with high ratings, based on the Zomato dataset from Kaggle. The project covers three kinds of visualization, mostly map-based:
1. Visualization using leaflet
2. Visualization using ggplot and geom_point
3. Visualization using ggplot and geom_polygon
Before starting, we load the packages we need to manipulate and visualize the data.
library(tidyverse)
library(leaflet)
library(glue)
library(ggthemes)
library(ggrepel)
Here we use the Zomato restaurants dataset from Kaggle. The key columns are Latitude and Longitude, which place each restaurant on the map, plus several numerical columns: aggregate rating, price range, and average cost for two people.
df <- read.csv("../zomato-restaurants-data/zomato.csv")
str(df)
## 'data.frame': 9551 obs. of 21 variables:
## $ Restaurant.ID : int 6317637 6304287 6300002 6318506 6314302 18189371 6300781 6301290 6300010 6314987 ...
## $ Restaurant.Name : Factor w/ 7446 levels " Let's Burrrp",..: 3757 3183 2903 4718 5521 2074 1011 7190 6053 3821 ...
## $ Country.Code : int 162 162 162 162 162 162 162 162 162 162 ...
## $ City : Factor w/ 141 levels "\xdb\xc1stanbul",..: 75 75 77 77 77 77 96 96 96 97 ...
## $ Address : Factor w/ 8918 levels "\xed\x88ukurambar Mahallesi, Muhsin YazÛ±cÛ±o\xdb\xf4lu Caddesi, No 3, \xed\x88ankaya, Ankara",..: 8691 6058 4683 8696 8695 5400 3986 3985 6978 3966 ...
## $ Locality : Factor w/ 1208 levels " ILD Trade Centre Mall, Sohna Road",..: 173 599 313 1001 1001 1001 1000 1000 1005 523 ...
## $ Locality.Verbose : Factor w/ 1265 levels " ILD Trade Centre Mall, Sohna Road, Gurgaon",..: 174 607 319 1054 1054 1054 1053 1053 1058 529 ...
## $ Longitude : num 121 121 121 121 121 ...
## $ Latitude : num 14.6 14.6 14.6 14.6 14.6 ...
## $ Cuisines : Factor w/ 1826 levels "","Afghani","Afghani, Mughlai, Chinese",..: 922 1113 1673 1128 1124 499 136 1683 798 894 ...
## $ Average.Cost.for.two: int 1100 1200 4000 1500 1500 1000 2000 2000 6000 1100 ...
## $ Currency : Factor w/ 12 levels "Botswana Pula(P)",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Has.Table.booking : Factor w/ 2 levels "No","Yes": 2 2 2 1 2 1 2 2 2 2 ...
## $ Has.Online.delivery : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ Is.delivering.now : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ Switch.to.order.menu: Factor w/ 1 level "No": 1 1 1 1 1 1 1 1 1 1 ...
## $ Price.range : int 3 3 4 4 4 3 4 4 4 3 ...
## $ Aggregate.rating : num 4.8 4.5 4.4 4.9 4.8 4.4 4 4.2 4.9 4.8 ...
## $ Rating.color : Factor w/ 6 levels "Dark Green","Green",..: 1 1 2 1 1 2 2 2 1 1 ...
## $ Rating.text : Factor w/ 6 levels "Average","Excellent",..: 2 2 6 2 2 6 6 6 2 2 ...
## $ Votes : int 314 591 270 365 229 336 520 677 621 532 ...
Real maps make an application more engaging. This kind of visualization helps users find a restaurant based on detailed information about each one.
We will not work with every observation in the dataset, only with restaurants in Jakarta, Indonesia. So we first subset the data to keep just those rows, dropping any whose latitude or longitude is 0.
indonesian.restaurants <- df[df$Country.Code == 94 & df$Latitude != 0 & df$Longitude != 0 & df$City == "Jakarta",]
head(indonesian.restaurants)
## Restaurant.ID Restaurant.Name Country.Code City
## 9280 7422633 Talaga Sampireun 94 Jakarta
## 9281 7405789 Toodz House 94 Jakarta
## 9282 18425821 OJJU 94 Jakarta
## 9283 7422751 Union Deli 94 Jakarta
## 9284 7402935 Skye 94 Jakarta
## 9285 7410290 Satoo - Hotel Shangri-La 94 Jakarta
## Address
## 9280 Jl. Lingkar Luar Barat
## 9281 Jl. Cipete Raya No. 79, Fatmawati, Jakarta
## 9282 Gandaria City, Lantai Upper Ground, Jl. Sultan Iskandar Muda
## 9283 Grand Indonesia Mall, Lantai Ground, East Mall, Jl. MH Thamrin, Thamrin, Jakarta
## 9284 Menara BCA, Lantai 56, Jl. MH. Thamrin, Thamrin, Jakarta
## 9285 Hotel Shangri-La, Jl. Jend. Sudirman
## Locality Locality.Verbose
## 9280 Cengkareng Cengkareng, Jakarta
## 9281 Fatmawati Fatmawati, Jakarta
## 9282 Gandaria City Mall, Gandaria Gandaria City Mall, Gandaria, Jakarta
## 9283 Grand Indonesia Mall, Thamrin Grand Indonesia Mall, Thamrin, Jakarta
## 9284 Grand Indonesia Mall, Thamrin Grand Indonesia Mall, Thamrin, Jakarta
## 9285 Hotel Shangri-La, Sudirman Hotel Shangri-La, Sudirman, Jakarta
## Longitude Latitude
## 9280 106.7285 -6.168467
## 9281 106.8018 -6.278012
## 9282 106.7832 -6.244221
## 9283 106.8197 -6.197150
## 9284 106.8220 -6.196778
## 9285 106.8190 -6.203292
## Cuisines
## 9280 Sunda, Indonesian
## 9281 Cafe, Italian, Coffee and Tea, Western, Indonesian
## 9282 Korean
## 9283 Desserts, Bakery, Western
## 9284 Italian, Continental
## 9285 Asian, Indonesian, Western
## Average.Cost.for.two Currency Has.Table.booking
## 9280 200000 Indonesian Rupiah(IDR) No
## 9281 165000 Indonesian Rupiah(IDR) No
## 9282 200000 Indonesian Rupiah(IDR) No
## 9283 200000 Indonesian Rupiah(IDR) No
## 9284 800000 Indonesian Rupiah(IDR) No
## 9285 800000 Indonesian Rupiah(IDR) No
## Has.Online.delivery Is.delivering.now Switch.to.order.menu
## 9280 No No No
## 9281 No No No
## 9282 No No No
## 9283 No No No
## 9284 No No No
## 9285 No No No
## Price.range Aggregate.rating Rating.color Rating.text Votes
## 9280 3 4.9 Dark Green Excellent 1662
## 9281 3 4.6 Dark Green Excellent 1476
## 9282 3 3.9 Yellow Good 137
## 9283 3 4.6 Dark Green Excellent 903
## 9284 3 4.1 Green Very Good 1498
## 9285 3 4.6 Dark Green Excellent 873
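The same subset can also be written with `dplyr::filter()`, which some readers find easier to scan. The sketch below is only an illustration on a toy data frame: the first two rows echo the Jakarta output above, while the two "Placeholder" rows and their city names are made up to exercise the conditions.

```r
library(dplyr)

# Toy data: rows 1-2 echo the head() output above; rows 3-4 are invented placeholders
toy <- data.frame(
  Restaurant.Name = c("Talaga Sampireun", "Skye", "Placeholder A", "Placeholder B"),
  Country.Code    = c(94, 94, 94, 162),
  City            = c("Jakarta", "Jakarta", "Bandung", "Makati City"),
  Latitude        = c(-6.168467, -6.196778, 0, 14.6),
  Longitude       = c(106.7285, 106.8220, 0, 121)
)

# Conditions separated by commas are combined with AND
jakarta <- toy %>%
  filter(Country.Code == 94, City == "Jakarta", Latitude != 0, Longitude != 0)
```

One practical difference: `filter()` drops rows where a condition evaluates to `NA`, whereas `df[cond, ]` returns rows of `NA`s for them.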
To bring all of this together, we use leaflet to plot the restaurants on an interactive map. Each marker opens a popup with key details about the restaurant.
# pops: one popup string per restaurant (its definition is not shown here)
leaflet(indonesian.restaurants) %>%
  addTiles() %>%
  addMarkers(~Longitude, ~Latitude, popup = pops)
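The `pops` vector passed to `popup` is not defined in the code shown here. As a minimal sketch, assuming one HTML string per restaurant, it might be built with `glue` along these lines. The choice of fields and the HTML layout are assumptions for illustration, not the original definition; the toy values are copied from the `head()` output above.

```r
library(glue)

# Toy rows standing in for indonesian.restaurants (values from the head() output)
toy <- data.frame(
  Restaurant.Name      = c("Talaga Sampireun", "Skye"),
  Aggregate.rating     = c(4.9, 4.1),
  Price.range          = c(3, 3),
  Average.Cost.for.two = c(200000, 800000)
)

# formatC() avoids scientific notation such as "2e+05" for large costs
cost <- formatC(toy$Average.Cost.for.two, format = "d", big.mark = ",")

# glue() is vectorised, so this yields one HTML string per row
pops <- glue(
  "<b>{toy$Restaurant.Name}</b><br>",
  "Rating: {toy$Aggregate.rating}<br>",
  "Price range: {toy$Price.range}<br>",
  "Cost for two: {cost} IDR"
)
```

A vector built this way has the same length as the data frame, which is what `addMarkers()` expects for `popup`.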
This visualization helps users see which restaurants suit them in, for instance, Australia, Indonesia, and Singapore. Here we plot restaurants with low prices and high ratings in those three countries. Cost and rating are min-max scaled within each country, so values in different currencies can share one axis. The color of each point represents the restaurant's price range, from 1 to 5.
# Min-max scale a numeric vector to the range [0, 1]
scale_this <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
df %>%
  filter(Country.Code %in% c(184, 14, 94)) %>%
  mutate(Country.Code = ifelse(Country.Code == 94, "Indonesia",
                        ifelse(Country.Code == 184, "Singapore", "Australia"))) %>%
  group_by(Country.Code) %>%
  mutate(Average.Cost.for.two.scale = scale_this(Average.Cost.for.two),
         Aggregate.rating.scale = scale_this(Aggregate.rating)) %>%
  ungroup() %>%
  filter(Average.Cost.for.two.scale < 0.2, Aggregate.rating.scale > 0.25) %>%
  ggplot(aes(x = Aggregate.rating.scale, y = Average.Cost.for.two.scale, color = Price.range)) +
  geom_point(aes(size = Price.range), show.legend = FALSE, alpha = 0.5) +
  scale_color_gradient(low = "red3", high = "green2") +
  geom_label_repel(aes(label = Restaurant.Name), size = 3, box.padding = 0.2) +
  facet_wrap(~Country.Code, ncol = 1) +
  labs(
    title = "Cheap and Top Restaurants",
    subtitle = "In Australia, Indonesia, and Singapore",
    x = "Aggregate Rating",
    y = "Cost for 2 People") +
  # Apply the complete theme first; calling theme_linedraw() last would reset the tweaks below
  theme_linedraw() +
  theme(legend.position = "none", axis.text.x = element_text(hjust = 1))
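To make the scaling step concrete, here is a small check of `scale_this()` on the six Jakarta costs shown in the `head()` output earlier. It is only a sketch on those six values, not on the full per-country groups used in the plot.

```r
# Min-max scaling: the smallest value maps to 0, the largest to 1
scale_this <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Average cost for two (IDR) for the six Jakarta rows printed above
cost <- c(200000, 165000, 200000, 200000, 800000, 800000)

scaled <- scale_this(cost)
round(scaled, 3)
# 0.055 0.000 0.055 0.055 1.000 1.000

# Rows that would pass the "cheap" threshold used in the plot
cheap <- scaled < 0.2
```

With these values the first four restaurants fall below the 0.2 threshold. Note that if every value in a group is identical, `max(x) - min(x)` is 0 and the function returns `NaN`.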
This visualization gives users a static map showing the coordinates of each restaurant in a chosen country. It shows many restaurants at once, together with details such as rating and price range. The size of each point represents the rating given by users, and the color represents the price range. Here we compare restaurants in Australia by rating and price range.
shows <- function(df, xmin, xmax, ymin, ymax) {
  global <- map_data("world")  # world outline polygons; needs the maps package installed
  ggplot() +
    geom_polygon(data = global, aes(x = long, y = lat, group = group),
                 fill = "gray85", color = "gray80") +
    # size and color go inside aes() so that scale_colour_gradient() applies to the points
    geom_point(data = df, alpha = 0.5,
               aes(x = Longitude, y = Latitude, size = Aggregate.rating, color = Price.range)) +
    geom_label_repel(data = df, size = 3, box.padding = 0.2,
                     aes(x = Longitude, y = Latitude, label = Restaurant.Name)) +
    # zoom via coord_fixed(); xlim()/ylim() would drop polygon vertices outside the limits
    coord_fixed(1.3, xlim = c(xmin, xmax), ylim = c(ymin, ymax)) +
    scale_colour_gradient(low = "red3", high = "green2") +
    labs(title = "Aggregate Rating and Price Range of Restaurants in Australia") +
    theme_map()
}
# Limits are given in increasing order; a reversed pair such as (-10, -40) flips the axis
shows(filter(df, Country.Code == 14), 110, 180, -40, -10)
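The map limits in the call above are typed by hand. As a rough sketch, they could instead be derived from the data being plotted. `bbox_of` and its `pad` argument are hypothetical names introduced here for illustration; the two sample points reuse Jakarta coordinates from the earlier output.

```r
# Hypothetical helper: bounding box of the points, widened by `pad` degrees
bbox_of <- function(df, pad = 1) {
  c(xmin = min(df$Longitude) - pad, xmax = max(df$Longitude) + pad,
    ymin = min(df$Latitude)  - pad, ymax = max(df$Latitude)  + pad)
}

# Two coordinates taken from the Jakarta rows shown earlier
pts <- data.frame(
  Longitude = c(106.7285, 106.8220),
  Latitude  = c(-6.168467, -6.278012)
)

b <- bbox_of(pts, pad = 0.05)
```

The result could then be passed on as `shows(pts, b[["xmin"]], b[["xmax"]], b[["ymin"]], b[["ymax"]])`. Rows with a latitude or longitude of 0 should be filtered out first, as in the Jakarta subset, or they will stretch the box.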
Map-based visualizations are useful in modern applications and give users a better experience when searching and filtering. Maps make it easier to compare restaurants across locations, and they are more engaging than bar charts or other standard chart types.