For this project, I am looking at a dataset I found on Kaggle (https://www.kaggle.com/jackywang529/michelin-restaurants). It features restaurants that were awarded Michelin stars in 2019. I wanted to see what insights could be drawn from this data. To carry out my analysis, I will ask three questions: In which countries did the most restaurants get stars that year? How diverse are the cuisines of these restaurants? And which restaurants are the most expensive, and how is price correlated with the number of stars?
A disclaimer: this dataset only includes restaurants that were awarded stars in 2019, so I cannot compare them with restaurants that were awarded in past years. Due to the narrow scope of this dataset, the insights I derive from my analysis may be limited, and I will not draw lofty conclusions about all Michelin Star restaurants, only the ones that were awarded in 2019.
Let’s start by loading the necessary packages.
library(tidyverse)
library(ggplot2)
library(dplyr)
library(maps)
Now let’s load the data.
onestar <- read.csv("/Users/binhnguyen/Desktop/Stats32/one-star-michelin-restaurants.csv")
twostar <- read.csv("/Users/binhnguyen/Desktop/Stats32/two-stars-michelin-restaurants.csv")
threestar <- read.csv("/Users/binhnguyen/Desktop/Stats32/three-stars-michelin-restaurants.csv")
Let’s look over our data.
head(onestar)
## name year latitude longitude city region zipCode
## 1 Kilian Stuba 2019 47.34858 10.17114 Kleinwalsertal Austria 87568
## 2 Pfefferschiff 2019 47.83787 13.07917 Hallwang Austria 5300
## 3 Esszimmer 2019 47.80685 13.03409 Salzburg Austria 5020
## 4 Carpe Diem 2019 47.80001 13.04006 Salzburg Austria 5020
## 5 Edvard 2019 48.21650 16.36852 Wien Austria 1010
## 6 Das Loft 2019 48.21272 16.37931 Wien Austria 1020
## cuisine price
## 1 Creative $$$$$
## 2 Classic cuisine $$$$$
## 3 Creative $$$$$
## 4 Market cuisine $$$$$
## 5 Modern cuisine $$$$
## 6 Modern cuisine $$$$$
## url
## 1 https://guide.michelin.com/at/en/vorarlberg/kleinwalsertal/restaurant/kilian-stuba
## 2 https://guide.michelin.com/at/en/salzburg-region/hallwang/restaurant/pfefferschiff
## 3 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/esszimmer
## 4 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/carpe-diem
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/edvard
## 6 https://guide.michelin.com/at/en/vienna/wien/restaurant/das-loft
head(twostar)
## name year latitude longitude city region
## 1 SENNS.Restaurant 2019 47.83636 13.06389 Salzburg Austria
## 2 Ikarus 2019 47.79536 13.00695 Salzburg Austria
## 3 Mraz & Sohn 2019 48.23129 16.37637 Wien Austria
## 4 Konstantin Filippou 2019 48.21056 16.37996 Wien Austria
## 5 Silvio Nickol Gourmet Restaurant 2019 48.20558 16.37693 Wien Austria
## 6 Steirereck im Stadtpark 2019 48.20229 16.38050 Wien Austria
## zipCode cuisine price
## 1 5020 Creative $$$$$
## 2 5020 Creative $$$$$
## 3 1200 Creative $$$$$
## 4 1010 Modern cuisine $$$$$
## 5 1010 Modern cuisine $$$$$
## 6 1030 Creative $$$$$
## url
## 1 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/senns-restaurant
## 2 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/ikarus
## 3 https://guide.michelin.com/at/en/vienna/wien/restaurant/mraz-sohn
## 4 https://guide.michelin.com/at/en/vienna/wien/restaurant/konstantin-filippou
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/silvio-nickol-gourmet-restaurant
## 6 https://guide.michelin.com/at/en/vienna/wien/restaurant/steirereck-im-stadtpark
head(threestar)
## name year latitude longitude city region
## 1 Amador 2019 48.25406 16.35915 Wien Austria
## 2 Manresa 2019 37.22761 -121.98071 South San Francisco California
## 3 Benu 2019 37.78521 -122.39876 San Francisco California
## 4 Quince 2019 37.79762 -122.40337 San Francisco California
## 5 Atelier Crenn 2019 37.79835 -122.43586 San Francisco California
## 6 The French Laundry 2019 38.40443 -122.36474 San Francisco California
## zipCode cuisine price
## 1 1190 Creative $$$$$
## 2 95030 Contemporary $$$$
## 3 94105 Asian $$$$
## 4 94133 Contemporary $$$$
## 5 94123 Contemporary $$$$
## 6 94599 Contemporary $$$$
## url
## 1 https://guide.michelin.com/at/en/vienna/wien/restaurant/amador
## 2 https://guide.michelin.com/us/en/california/south-san-francisco/restaurant/manresa
## 3 https://guide.michelin.com/us/en/california/san-francisco/restaurant/benu
## 4 https://guide.michelin.com/us/en/california/san-francisco/restaurant/quince
## 5 https://guide.michelin.com/us/en/california/san-francisco/restaurant/atelier-crenn
## 6 https://guide.michelin.com/us/en/california/san-francisco/restaurant/the-french-laundry
Looks good, but these are three separate dataframes. We want to combine them into one. But first, we want to add an extra column to each one to indicate the number of stars.
onestar$star <- 1
twostar$star <- 2
threestar$star <- 3
Let’s see what they look like now.
head(onestar)
## name year latitude longitude city region zipCode
## 1 Kilian Stuba 2019 47.34858 10.17114 Kleinwalsertal Austria 87568
## 2 Pfefferschiff 2019 47.83787 13.07917 Hallwang Austria 5300
## 3 Esszimmer 2019 47.80685 13.03409 Salzburg Austria 5020
## 4 Carpe Diem 2019 47.80001 13.04006 Salzburg Austria 5020
## 5 Edvard 2019 48.21650 16.36852 Wien Austria 1010
## 6 Das Loft 2019 48.21272 16.37931 Wien Austria 1020
## cuisine price
## 1 Creative $$$$$
## 2 Classic cuisine $$$$$
## 3 Creative $$$$$
## 4 Market cuisine $$$$$
## 5 Modern cuisine $$$$
## 6 Modern cuisine $$$$$
## url
## 1 https://guide.michelin.com/at/en/vorarlberg/kleinwalsertal/restaurant/kilian-stuba
## 2 https://guide.michelin.com/at/en/salzburg-region/hallwang/restaurant/pfefferschiff
## 3 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/esszimmer
## 4 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/carpe-diem
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/edvard
## 6 https://guide.michelin.com/at/en/vienna/wien/restaurant/das-loft
## star
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
head(twostar)
## name year latitude longitude city region
## 1 SENNS.Restaurant 2019 47.83636 13.06389 Salzburg Austria
## 2 Ikarus 2019 47.79536 13.00695 Salzburg Austria
## 3 Mraz & Sohn 2019 48.23129 16.37637 Wien Austria
## 4 Konstantin Filippou 2019 48.21056 16.37996 Wien Austria
## 5 Silvio Nickol Gourmet Restaurant 2019 48.20558 16.37693 Wien Austria
## 6 Steirereck im Stadtpark 2019 48.20229 16.38050 Wien Austria
## zipCode cuisine price
## 1 5020 Creative $$$$$
## 2 5020 Creative $$$$$
## 3 1200 Creative $$$$$
## 4 1010 Modern cuisine $$$$$
## 5 1010 Modern cuisine $$$$$
## 6 1030 Creative $$$$$
## url
## 1 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/senns-restaurant
## 2 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/ikarus
## 3 https://guide.michelin.com/at/en/vienna/wien/restaurant/mraz-sohn
## 4 https://guide.michelin.com/at/en/vienna/wien/restaurant/konstantin-filippou
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/silvio-nickol-gourmet-restaurant
## 6 https://guide.michelin.com/at/en/vienna/wien/restaurant/steirereck-im-stadtpark
## star
## 1 2
## 2 2
## 3 2
## 4 2
## 5 2
## 6 2
head(threestar)
## name year latitude longitude city region
## 1 Amador 2019 48.25406 16.35915 Wien Austria
## 2 Manresa 2019 37.22761 -121.98071 South San Francisco California
## 3 Benu 2019 37.78521 -122.39876 San Francisco California
## 4 Quince 2019 37.79762 -122.40337 San Francisco California
## 5 Atelier Crenn 2019 37.79835 -122.43586 San Francisco California
## 6 The French Laundry 2019 38.40443 -122.36474 San Francisco California
## zipCode cuisine price
## 1 1190 Creative $$$$$
## 2 95030 Contemporary $$$$
## 3 94105 Asian $$$$
## 4 94133 Contemporary $$$$
## 5 94123 Contemporary $$$$
## 6 94599 Contemporary $$$$
## url
## 1 https://guide.michelin.com/at/en/vienna/wien/restaurant/amador
## 2 https://guide.michelin.com/us/en/california/south-san-francisco/restaurant/manresa
## 3 https://guide.michelin.com/us/en/california/san-francisco/restaurant/benu
## 4 https://guide.michelin.com/us/en/california/san-francisco/restaurant/quince
## 5 https://guide.michelin.com/us/en/california/san-francisco/restaurant/atelier-crenn
## 6 https://guide.michelin.com/us/en/california/san-francisco/restaurant/the-french-laundry
## star
## 1 3
## 2 3
## 3 3
## 4 3
## 5 3
## 6 3
Now it’s time to combine the datasets.
allstar <- rbind(onestar, twostar, threestar)
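The two steps above (tagging each data frame with its star count, then stacking them) can also be done in one call. The following is only a sketch on made-up toy rows, not the real CSVs; `one_toy`, `two_toy`, `three_toy`, and `all_toy` are hypothetical names introduced for illustration.

```r
library(dplyr)

# Toy stand-ins for the three CSVs (made-up rows, not the real data).
one_toy   <- data.frame(name = c("A", "B"), region = c("Austria", "California"))
two_toy   <- data.frame(name = "C", region = "Austria")
three_toy <- data.frame(name = "D", region = "California")

# bind_rows() stacks the frames; .id stores each list element's name in a new column.
all_toy <- bind_rows(list(`1` = one_toy, `2` = two_toy, `3` = three_toy), .id = "star") %>%
  mutate(star = as.integer(star))  # the .id column is character, so convert it

all_toy
```

Unlike `rbind()`, `bind_rows()` fills missing columns with `NA` instead of stopping, which may or may not be what you want if the files ever differ in layout.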
Unsurprisingly, we can see that there are far more one-star restaurants than two- and three-star restaurants.
nrow(allstar)
## [1] 695
nrow(subset(allstar, star==1))
## [1] 549
nrow(subset(allstar, star==2))
## [1] 110
nrow(subset(allstar, star==3))
## [1] 36
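The four `nrow()` calls above can be collapsed into a single tabulation. This is a small sketch on a made-up vector; with the real data, the equivalent call would presumably be `table(allstar$star)`.

```r
# Toy star column with made-up values (not the real counts).
star_toy <- c(1, 1, 1, 2, 2, 3)

# Count of restaurants per star level.
star_counts <- table(star_toy)
star_counts

# Share of each star level, rounded to two decimals.
star_share <- round(prop.table(star_counts), 2)
star_share
```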
How does the spread of one-, two-, and three-star restaurants vary between countries? Let’s find out.
First, we want to sort our data by country. However, our data is sorted into “regions”, and in the US, those “regions” are states and sometimes even cities. We want all of those to be just the US. In addition, some big foreign cities are also listed as regions, and we want to replace each of those with its country’s name.
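# head() does not filter rows, so a stray `region` argument has no effect; this shows the first six rows of the combined data.
head(allstar)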
## name year latitude longitude city region zipCode
## 1 Kilian Stuba 2019 47.34858 10.17114 Kleinwalsertal Austria 87568
## 2 Pfefferschiff 2019 47.83787 13.07917 Hallwang Austria 5300
## 3 Esszimmer 2019 47.80685 13.03409 Salzburg Austria 5020
## 4 Carpe Diem 2019 47.80001 13.04006 Salzburg Austria 5020
## 5 Edvard 2019 48.21650 16.36852 Wien Austria 1010
## 6 Das Loft 2019 48.21272 16.37931 Wien Austria 1020
## cuisine price
## 1 Creative $$$$$
## 2 Classic cuisine $$$$$
## 3 Creative $$$$$
## 4 Market cuisine $$$$$
## 5 Modern cuisine $$$$
## 6 Modern cuisine $$$$$
## url
## 1 https://guide.michelin.com/at/en/vorarlberg/kleinwalsertal/restaurant/kilian-stuba
## 2 https://guide.michelin.com/at/en/salzburg-region/hallwang/restaurant/pfefferschiff
## 3 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/esszimmer
## 4 https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/carpe-diem
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/edvard
## 6 https://guide.michelin.com/at/en/vienna/wien/restaurant/das-loft
## star
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
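To actually see which region labels are not countries, it helps to list the distinct values and to filter before calling `head()`. This is a sketch on made-up rows; `regions_toy` is a hypothetical stand-in for `allstar`.

```r
# Toy stand-in for allstar (made-up rows).
regions_toy <- data.frame(
  name   = c("A", "B", "C", "D"),
  region = c("Austria", "California", "New York City", "California")
)

# Distinct region labels, sorted, to spot states and cities.
region_labels <- sort(unique(regions_toy$region))
region_labels

# Rows for a single region: filter first, then take the head.
california_rows <- head(subset(regions_toy, region == "California"))
california_rows
```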
So now, we’re going to create a new column called ‘country’ in which we collapse all of the US states and cities into “USA” and replace the foreign cities with their countries.
allstar <- allstar %>% mutate(country = region)
allstar$country[allstar$country=="Alabama"] <- "USA"
allstar$country[allstar$country=="Alaska"] <- "USA"
allstar$country[allstar$country=="Arizona"] <- "USA"
allstar$country[allstar$country=="Arkansas"] <- "USA"
allstar$country[allstar$country=="California"] <- "USA"
allstar$country[allstar$country=="Colorado"] <- "USA"
allstar$country[allstar$country=="Connecticut"] <- "USA"
allstar$country[allstar$country=="Delaware"] <- "USA"
allstar$country[allstar$country=="Florida"] <- "USA"
allstar$country[allstar$country=="Georgia"] <- "USA"
allstar$country[allstar$country=="Hawaii"] <- "USA"
allstar$country[allstar$country=="Idaho"] <- "USA"
allstar$country[allstar$country=="Illinois"] <- "USA"
allstar$country[allstar$country=="Indiana"] <-"USA"
allstar$country[allstar$country=="Iowa"] <- "USA"
allstar$country[allstar$country=="Kansas"] <- "USA"
allstar$country[allstar$country=="Kentucky"] <- "USA"
allstar$country[allstar$country=="Louisiana"] <- "USA"
allstar$country[allstar$country=="Maine"] <- "USA"
allstar$country[allstar$country=="Maryland"] <- "USA"
allstar$country[allstar$country=="Massachusetts"] <- "USA"
allstar$country[allstar$country=="Michigan"] <- "USA"
allstar$country[allstar$country=="Minnesota"] <- "USA"
allstar$country[allstar$country=="Mississippi"] <- "USA"
allstar$country[allstar$country=="Missouri"] <- "USA"
allstar$country[allstar$country=="Montana"] <- "USA"
allstar$country[allstar$country=="Nebraska"] <- "USA"
allstar$country[allstar$country=="Nevada"] <- "USA"
allstar$country[allstar$country=="New Hampshire"] <- "USA"
allstar$country[allstar$country=="New Jersey"] <- "USA"
allstar$country[allstar$country=="New Mexico"] <- "USA"
allstar$country[allstar$country=="New York"] <- "USA"
allstar$country[allstar$country=="North Carolina"] <- "USA"
allstar$country[allstar$country=="North Dakota"] <- "USA"
allstar$country[allstar$country=="Ohio"] <- "USA"
allstar$country[allstar$country=="Oklahoma"] <- "USA"
allstar$country[allstar$country=="Oregon"] <- "USA"
allstar$country[allstar$country=="Pennsylvania"] <- "USA"
allstar$country[allstar$country=="Rhode Island"] <- "USA"
allstar$country[allstar$country=="South Carolina"] <- "USA"
allstar$country[allstar$country=="South Dakota"] <- "USA"
allstar$country[allstar$country=="Tennessee"] <- "USA"
allstar$country[allstar$country=="Texas"] <- "USA"
allstar$country[allstar$country=="Utah"] <- "USA"
allstar$country[allstar$country=="Vermont"] <- "USA"
allstar$country[allstar$country=="Virginia"] <-"USA"
allstar$country[allstar$country=="Washington"] <- "USA"
allstar$country[allstar$country=="West Virginia"] <- "USA"
allstar$country[allstar$country=="Wisconsin"] <- "USA"
allstar$country[allstar$country=="Wyoming"] <- "USA"
allstar$country[allstar$country=="Washington DC"] <- "USA"
allstar$country[allstar$country=="New York City"] <- "USA"
allstar$country[allstar$country=="Chicago"] <- "USA"
allstar$country[allstar$country=="Taipei"] <- "Taiwan"
allstar$country[allstar$country=="Rio de Janeiro"] <- "Brazil"
allstar$country[allstar$country=="Sao Paulo"] <- "Brazil"
allstar$country[allstar$country=="Macau"] <- "China"
allstar$country[allstar$country=="United Kingdom"] <- "UK"
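The same recoding can be written more compactly with a membership test and a lookup vector. This is a sketch on a made-up vector, assuming the labels match the ones used above exactly; `region_toy`, `us_labels`, and `renames` are hypothetical names. `state.name` is the built-in vector of the 50 US state names that ships with R.

```r
# Toy stand-in for allstar$region (made-up values).
region_toy <- c("California", "New York City", "Austria", "Taipei", "Macau", "United Kingdom")

# US labels: the 50 built-in state names plus the city-level regions used above.
us_labels <- c(state.name, "Washington DC", "New York City", "Chicago")

# Lookup for the non-US renames, written as old name = new name.
renames <- c("Taipei" = "Taiwan", "Rio de Janeiro" = "Brazil", "Sao Paulo" = "Brazil",
             "Macau" = "China", "United Kingdom" = "UK")

country_toy <- region_toy
country_toy[country_toy %in% us_labels] <- "USA"

# Replace only the entries that appear in the lookup; everything else is left as-is.
hit <- country_toy %in% names(renames)
country_toy[hit] <- unname(renames[country_toy[hit]])
country_toy
```

One caveat with either version: a region literally named "Georgia" is treated as the US state, so a dataset that included the country of Georgia would need a different rule.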
Now that we have all of that sorted out, let’s take a look at our question #1.
First, let’s look at which countries have the highest number of Michelin star restaurants.
allstar$country %>%
unique() %>%
length()
## [1] 20
michelin_count <-
allstar %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 20, wt = count)
michelin_count %>% arrange(desc(count))
## # A tibble: 20 x 2
## country count
## <chr> <int>
## 1 USA 202
## 2 UK 162
## 3 Hong Kong 63
## 4 Singapore 39
## 5 Denmark 28
## 6 South Korea 26
## 7 Thailand 26
## 8 Taiwan 24
## 9 Sweden 22
## 10 Austria 19
## 11 China 19
## 12 Brazil 18
## 13 Ireland 14
## 14 Norway 8
## 15 Finland 6
## 16 Hungary 6
## 17 Croatia 5
## 18 Greece 4
## 19 Czech Republic 2
## 20 Poland 2
The US leads with 202 restaurants, followed by the UK with 162; every other country has 63 or fewer. Now, which countries have the highest number of three-star, two-star, and one-star Michelin restaurants?
#three stars
three_star <- allstar %>% filter(star ==3)
three_star$country %>%
unique() %>%
length()
## [1] 10
three_count <-
three_star %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 10, wt = count)
three_count %>% arrange(desc(count))
## # A tibble: 10 x 2
## country count
## <chr> <int>
## 1 USA 14
## 2 Hong Kong 7
## 3 UK 5
## 4 China 3
## 5 South Korea 2
## 6 Austria 1
## 7 Denmark 1
## 8 Norway 1
## 9 Sweden 1
## 10 Taiwan 1
#two stars
two_star <- allstar %>% filter(star ==2)
two_star$country %>%
unique() %>%
length()
## [1] 15
two_count <-
two_star %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 15, wt = count)
two_count %>% arrange(desc(count))
## # A tibble: 15 x 2
## country count
## <chr> <int>
## 1 USA 33
## 2 UK 19
## 3 Hong Kong 12
## 4 Austria 6
## 5 China 5
## 6 Denmark 5
## 7 Singapore 5
## 8 South Korea 5
## 9 Sweden 5
## 10 Taiwan 5
## 11 Thailand 4
## 12 Brazil 3
## 13 Greece 1
## 14 Hungary 1
## 15 Ireland 1
#one star
one_star <- allstar %>% filter(star ==1)
one_star$country %>%
unique() %>%
length()
## [1] 20
one_count <-
one_star %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 20, wt = count)
one_count %>% arrange(desc(count))
## # A tibble: 20 x 2
## country count
## <chr> <int>
## 1 USA 155
## 2 UK 138
## 3 Hong Kong 44
## 4 Singapore 34
## 5 Denmark 22
## 6 Thailand 22
## 7 South Korea 19
## 8 Taiwan 18
## 9 Sweden 16
## 10 Brazil 15
## 11 Ireland 13
## 12 Austria 12
## 13 China 11
## 14 Norway 7
## 15 Finland 6
## 16 Croatia 5
## 17 Hungary 5
## 18 Greece 3
## 19 Czech Republic 2
## 20 Poland 2
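The three per-star tables above can also be produced in one pass by counting on both columns and spreading the star levels into columns. This is a sketch on made-up rows, assuming `dplyr` and `tidyr` (both part of the tidyverse loaded earlier); `toy` and `star_by_country` are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for allstar (made-up rows).
toy <- data.frame(
  country = c("USA", "USA", "USA", "UK", "UK", "Austria"),
  star    = c(1, 1, 3, 1, 2, 2)
)

star_by_country <- toy %>%
  count(country, star) %>%                                   # one row per country/star pair
  pivot_wider(names_from = star, values_from = n,
              names_prefix = "star_", values_fill = 0) %>%   # one column per star level
  mutate(total = star_1 + star_2 + star_3) %>%
  arrange(desc(total))

star_by_country
```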
Now let’s plot it.
world_map = map_data("world")
michelin_map <- world_map %>%
left_join(michelin_count, by = c("region" = "country"))
ggplot() +
geom_polygon(data=michelin_map, aes(x=long, y=lat, group=group, fill= count)) +
scale_fill_distiller(palette = "YlOrBr", direction = 1) +
coord_quickmap() + labs(title = "2019 Michelin Star Restaurants by Country") +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank())
In this dataset, restaurants in only 20 countries were awarded Michelin stars in 2019. The US and the UK take the lead, while the other countries trail far behind.
Next, we want to look at the diversity of cuisines among Michelin Star restaurants in 2019.
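Because the map is built with a join on country name, any label that does not match a map region exactly is silently left unfilled. A quick way to check is `setdiff()`. The sketch below uses made-up vectors: `map_regions_toy` is a hypothetical stand-in for `unique(world_map$region)`, and its contents are not a claim about what the real map data contains.

```r
# Country labels from the count table (a subset, for illustration).
count_names <- c("USA", "UK", "Hong Kong", "Singapore", "South Korea")

# Hypothetical stand-in for unique(world_map$region); check the real vector rather than assuming.
map_regions_toy <- c("USA", "UK", "China", "Singapore", "South Korea")

# Labels that would get no fill because they have no matching map region.
unmatched <- setdiff(count_names, map_regions_toy)
unmatched
```

With the real objects, the equivalent check would be `setdiff(michelin_count$country, unique(world_map$region))`; an empty result means every country joined.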
allstar$cuisine %>%
unique() %>%
length()
## [1] 70
## Merge "Modern cuisine" into "Contemporary" to get rid of some repetition in the data
allstar$cuisine[allstar$cuisine=="Modern cuisine"] <- "Contemporary"
cuisine_data <-
allstar %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 20, wt = count)
cuisine_data %>% arrange(desc(count))
## # A tibble: 22 x 2
## cuisine count
## <chr> <int>
## 1 Contemporary 183
## 2 Japanese 54
## 3 Creative 46
## 4 Cantonese 40
## 5 Modern British 38
## 6 French 29
## 7 Innovative 28
## 8 Italian 21
## 9 French contemporary 19
## 10 Sushi 17
## # … with 12 more rows
#three stars
three_star$cuisine[three_star$cuisine=="Modern cuisine"] <- "Contemporary"
three_star$cuisine %>%
unique() %>%
length()
## [1] 15
three_cuisine <-
three_star %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 16, wt = count)
three_cuisine %>% arrange(desc(count))
## # A tibble: 15 x 2
## cuisine count
## <chr> <int>
## 1 Contemporary 12
## 2 Cantonese 4
## 3 Creative 3
## 4 French 3
## 5 French contemporary 2
## 6 Japanese 2
## 7 Korean 2
## 8 American 1
## 9 Asian 1
## 10 Chinese 1
## 11 Classic French 1
## 12 Innovative 1
## 13 Italian 1
## 14 Seafood 1
## 15 Sushi 1
#two stars
two_star$cuisine[two_star$cuisine=="Modern cuisine"] <- "Contemporary"
two_star$cuisine %>%
unique() %>%
length()
## [1] 27
two_cuisine <-
two_star %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 28, wt = count)
two_cuisine %>% arrange(desc(count))
## # A tibble: 27 x 2
## cuisine count
## <chr> <int>
## 1 Contemporary 27
## 2 Creative 14
## 3 Japanese 8
## 4 French 7
## 5 French contemporary 7
## 6 Innovative 6
## 7 Cantonese 5
## 8 Modern British 4
## 9 Sushi 4
## 10 creative 3
## # … with 17 more rows
#one stars
one_star$cuisine[one_star$cuisine=="Modern cuisine"] <- "Contemporary"
one_star$cuisine %>%
unique() %>%
length()
## [1] 66
one_cuisine <-
one_star %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 67, wt = count)
one_cuisine %>% arrange(desc(count))
## # A tibble: 66 x 2
## cuisine count
## <chr> <int>
## 1 Contemporary 144
## 2 Japanese 44
## 3 Modern British 34
## 4 Cantonese 31
## 5 Creative 29
## 6 Innovative 21
## 7 French 19
## 8 Italian 19
## 9 Classic cuisine 14
## 10 Californian 13
## # … with 56 more rows
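The two-star table above lists both “Creative” and “creative”, so some labels differ only in capitalization. A hedged sketch of one way to normalize them before counting, on a made-up vector; `capitalize_first` is a hypothetical helper, not part of any package.

```r
# Toy cuisine labels (made-up values).
cuisine_toy <- c("Creative", "creative", "Modern cuisine", "Contemporary", "creative")

# Hypothetical helper: trim whitespace and upper-case the first letter only.
capitalize_first <- function(x) {
  x <- trimws(x)
  paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x)))
}

cuisine_clean <- capitalize_first(cuisine_toy)

# Same merge as in the analysis above.
cuisine_clean[cuisine_clean == "Modern cuisine"] <- "Contemporary"

table(cuisine_clean)
```

Applying something like this to `allstar$cuisine` before the `group_by()` would change the counts slightly, so the tables above would need to be regenerated.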
Now let’s plot this data.
ggplot(data=cuisine_data, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="orange") +
labs(title="Most Popular Cuisine in Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") +
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))
“Contemporary” restaurants seem to be most common within Michelin Star restaurants awarded in 2019. How does this differ by star count?
Let’s look at data for three, two, and one star.
##Three stars
ggplot(data=three_cuisine, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="pink") +
labs(title="Most Popular Cuisine in THREE STAR Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") +
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))
##two stars
ggplot(data=two_cuisine, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="light blue") +
labs(title="Most Popular Cuisine in TWO STAR Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") +
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))
##one star
ggplot(data=one_cuisine, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="light green") +
labs(title="Most Popular Cuisine in ONE STAR Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") +
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))
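The three per-star charts could also be drawn as panels of a single figure, which makes them easier to compare side by side. This is only a sketch on made-up counts; `toy_counts` is a hypothetical stand-in for the three count tables stacked together with a `star` label.

```r
library(ggplot2)

# Toy counts (made-up numbers), one row per star level and cuisine.
toy_counts <- data.frame(
  star    = rep(c("1 star", "2 stars", "3 stars"), each = 2),
  cuisine = rep(c("Contemporary", "Japanese"), times = 3),
  count   = c(5, 3, 4, 1, 2, 1)
)

# One panel per star level; free y scales because the groups differ a lot in size.
p <- ggplot(toy_counts, aes(x = cuisine, y = count)) +
  geom_col(fill = "orange") +
  facet_wrap(~ star, scales = "free_y") +
  labs(title = "Cuisine frequency by star count (toy data)",
       x = "Cuisine", y = "Cuisine Frequency") +
  theme(axis.text.x = element_text(angle = 90))

p
```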
We can see here that “Contemporary” restaurants remain the most common at all three star counts. It’s not surprising that the selection for three stars is much smaller, since there are only 36 three-star restaurants compared with 549 one-star restaurants. The two-star selection is more diverse, and one-star restaurants serve the greatest variety of cuisines: Contemporary remains, by far, the most common, but there is a long tail of other cuisines. This could mean that the standards for getting one Michelin star are much less rigorous, and that three-star restaurants occupy a niche in and of itself wherein certain cuisines may be considered more prestigious or “star-worthy”.
But is this standard of “star-worthy” restaurants the same for all Michelin Star restaurants in 2019?
cuisine_country <- allstar %>% group_by(country) %>% count(cuisine) %>% top_n(1)
## Selecting by n
cuisine_country %>% arrange(desc(cuisine))
## # A tibble: 27 x 3
## # Groups: country [20]
## country cuisine n
## <chr> <chr> <int>
## 1 Thailand Thai 10
## 2 Singapore Sushi 5
## 3 Taiwan Sushi 4
## 4 Greece Seafood 1
## 5 Greece Mediterranean 1
## 6 South Korea Korean 8
## 7 Brazil Japanese 6
## 8 Singapore Innovative 5
## 9 Taiwan Innovative 4
## 10 Singapore French contemporary 5
## # … with 17 more rows
cuisine_country$cuisine %>%
unique() %>%
length()
## [1] 12
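Because `top_n(1)` keeps ties, `cuisine_country` has 27 rows for 20 countries, so a country such as Greece or Singapore appears more than once. If exactly one row per country is wanted (for example, before joining to map data), one option is `slice_max()` with `with_ties = FALSE`. This is a sketch on made-up rows, assuming dplyr 1.0.0 or later; which tied cuisine survives depends only on row order, so it is an arbitrary choice.

```r
library(dplyr)

# Toy stand-in for the per-country cuisine counts (made-up rows).
toy <- data.frame(
  country = c("Thailand", "Thailand", "Greece", "Greece"),
  cuisine = c("Thai", "French", "Seafood", "Mediterranean"),
  n       = c(10, 3, 1, 1)
)

# Keep exactly one top cuisine per country, breaking ties by row order.
top_one <- toy %>%
  group_by(country) %>%
  slice_max(order_by = n, n = 1, with_ties = FALSE) %>%
  ungroup()

top_one
```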
cuisine_map <- world_map %>%
left_join(cuisine_country, by = c("region" = "country"))
ggplot() +
geom_polygon(data=cuisine_map, aes(x=long, y=lat, group=group, fill=cuisine)) +
coord_quickmap() + labs(title = "2019 Michelin Star Top Cuisines by Country") +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank())
##Europe
ggplot() +
geom_polygon(data=cuisine_map, aes(x=long, y=lat, group=group, fill=cuisine)) +
# Zoom with coord limits: scale limits would drop polygon vertices that fall outside the window.
coord_quickmap(xlim = c(-25, 50), ylim = c(25, 75)) + labs(title = "2019 Michelin Star Top Cuisines in Europe") +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank())
##Asia
ggplot() +
geom_polygon(data=cuisine_map, aes(x=long, y=lat, group=group, fill=cuisine)) +
# Zoom with coord limits: scale limits would drop polygon vertices that fall outside the window.
coord_quickmap(xlim = c(50, 150), ylim = c(0, 75)) + labs(title = "2019 Michelin Star Top Cuisines in Asia") +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank())
“Contemporary” restaurants seem to dominate when we look at all restaurants as a whole. However, when looking at specific countries, we can see that there is more variety in the cuisine that is most common in each country. For some countries, like Thailand, it’s unsurprising to see that Thai food is the most common. This somewhat undercuts our “star-worthy” restaurants theory.
Now, we want to know whether restaurants with more stars are more expensive. Let’s take a look.
## First, let's do some data cleaning. There is no price data available for restaurants in the UK and Ireland, so for the sake of this plot, let's remove those rows.
allstar_clean <- allstar %>% filter(country != "UK" & country != "Ireland" )
## In addition, prices are denoted in dollar signs, which may get confusing. Let's change them to "price scores" based on the number of dollar signs.
allstar_clean <- allstar_clean %>% mutate(price_score = price)
allstar_clean$price_score[allstar_clean$price_score=="$$$$$"] <- 5
allstar_clean$price_score[allstar_clean$price_score=="$$$$"] <- 4
allstar_clean$price_score[allstar_clean$price_score=="$$$"] <- 3
allstar_clean$price_score[allstar_clean$price_score=="$$"] <- 2
allstar_clean$price_score[allstar_clean$price_score=="$"] <- 1
allstar_clean$price_score <- as.numeric(allstar_clean$price_score)
allstar_clean$star <- as.factor(allstar_clean$star)
ggplot(data = allstar_clean, aes( x = star, y = price_score)) +
geom_violin() +
geom_point(size = 2, position="jitter") +
ggtitle("Price Scores vs. Star Count for 2019 Michelin Star Restaurants")
Very interesting. There is a wide spread of price points for one-star restaurants, whereas two-star and three-star restaurants tend to be more expensive on average, with much less spread.
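The price-score conversion used for the plot above can also be written as a character count, since the score is just the number of dollar signs. This is a sketch on a made-up vector; the `"N/A"` and empty entries are assumptions about what a missing price might look like, not values confirmed from the data.

```r
# Toy price strings (made-up values); "N/A" and "" stand in for missing prices.
price_toy <- c("$$$$$", "$$$$", "$$", "N/A", "")

# Only strings made up entirely of "$" get a score; anything else becomes NA.
price_score_toy <- ifelse(grepl("^\\$+$", price_toy), nchar(price_toy), NA_integer_)
price_score_toy
```

This avoids the intermediate step where the column holds a mix of numbers stored as text, and it does not rely on `as.numeric()` warnings to flag unexpected values.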
I must note again that the conclusions I draw are for this dataset only. I have no way of knowing how this data is related to past data on Michelin Star restaurants or data on them as a whole. Originally, my project was meant to look at Michelin Star restaurants as a whole. However, I later discovered that the dataset I was looking at only covered restaurants that were awarded in 2019. I could not locate other data on past years.
From this dataset, we see that the US and the UK dominate in Michelin Star restaurants awarded in 2019, and this may have colored our perception of what cuisines are most preferred globally. Because we saw that “Contemporary” restaurants were, by far, the most popular in 2019, it was surprising to see that there was a great variety of cuisines preferred by each country. This is less surprising when you consider that the US leads in Michelin Stars awarded, and that in the US, “Contemporary” restaurants were the most popular. Damian had suggested that I look at what cuisines were preferred by each region, and I’m really glad he did. Without looking at each individual country, we would’ve had the perception that “Contemporary” restaurants were preferred generally in 2019. However, looking at specific countries, we see that this is not necessarily the case, and if we were to study the cuisines of Michelin Star restaurants further, we should do so on a country-by-country basis.
Lastly, when we look at price, we see that, in general, two- and three-star restaurants tended to be more expensive in 2019, while there is a great variety in prices for one-star restaurants. I chose to do violin plots for this data instead of the proposed line of best fit because, upon seeing the data, I felt that it was more important to represent the spread of prices.
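The visual impression from the violin plots (higher average price and less spread at two and three stars) could be backed up with a small summary table. This is a sketch on made-up numbers, not results from the dataset; `toy` and `price_summary` are hypothetical names, and with the real data the input would be `allstar_clean`.

```r
library(dplyr)

# Toy stand-in for allstar_clean (made-up rows, not real results).
toy <- data.frame(
  star        = factor(c(1, 1, 1, 1, 2, 2, 3, 3)),
  price_score = c(1, 3, 4, 5, 4, 5, 5, 5)
)

# Mean and standard deviation of the price score at each star level.
price_summary <- toy %>%
  group_by(star) %>%
  summarize(n = n(),
            mean_price = mean(price_score),
            sd_price   = sd(price_score))

price_summary
```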