Introduction

For this project, I am looking at a dataset I found on Kaggle (https://www.kaggle.com/jackywang529/michelin-restaurants). It lists restaurants that were awarded Michelin stars in 2019. I wanted to see what insights could be drawn from this data. Specifically: how diverse are the cuisines of these restaurants? In which countries did the most restaurants receive stars that year? Which restaurants are the most expensive, and how is price related to the number of stars? To carry out my analysis, I will ask three questions:

  1. What are the hottest Michelin Star countries in 2019?
  2. What are the most common cuisines in one-, two-, and three-star Michelin restaurants in 2019? What are the most common cuisines by country?
  3. Is there a positive correlation between stars and price?

A disclaimer: the dataset only contains restaurants that were awarded stars in 2019, so I cannot compare them with restaurants awarded in earlier years, and restaurants from past years do not appear in the data at all. Because of this narrow scope, the insight I can derive is limited, and I will not draw broad conclusions about all Michelin Star restaurants, only about the ones awarded in 2019.

Data Analysis

Let’s start by loading the necessary packages.

library(tidyverse)
library(ggplot2)
library(dplyr)
library(maps)

Now let’s load the data.

 onestar <- read.csv("/Users/binhnguyen/Desktop/Stats32/one-star-michelin-restaurants.csv")
 twostar <- read.csv("/Users/binhnguyen/Desktop/Stats32/two-stars-michelin-restaurants.csv")
 threestar <- read.csv("/Users/binhnguyen/Desktop/Stats32/three-stars-michelin-restaurants.csv")

Let’s look over our data.

head(onestar)
##            name year latitude longitude           city  region zipCode
## 1  Kilian Stuba 2019 47.34858  10.17114 Kleinwalsertal Austria   87568
## 2 Pfefferschiff 2019 47.83787  13.07917       Hallwang Austria    5300
## 3     Esszimmer 2019 47.80685  13.03409       Salzburg Austria    5020
## 4    Carpe Diem 2019 47.80001  13.04006       Salzburg Austria    5020
## 5        Edvard 2019 48.21650  16.36852           Wien Austria    1010
## 6      Das Loft 2019 48.21272  16.37931           Wien Austria    1020
##           cuisine price
## 1        Creative $$$$$
## 2 Classic cuisine $$$$$
## 3        Creative $$$$$
## 4  Market cuisine $$$$$
## 5  Modern cuisine  $$$$
## 6  Modern cuisine $$$$$
##                                                                                  url
## 1 https://guide.michelin.com/at/en/vorarlberg/kleinwalsertal/restaurant/kilian-stuba
## 2 https://guide.michelin.com/at/en/salzburg-region/hallwang/restaurant/pfefferschiff
## 3     https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/esszimmer
## 4    https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/carpe-diem
## 5                     https://guide.michelin.com/at/en/vienna/wien/restaurant/edvard
## 6                   https://guide.michelin.com/at/en/vienna/wien/restaurant/das-loft
head(twostar)
##                               name year latitude longitude     city  region
## 1                 SENNS.Restaurant 2019 47.83636  13.06389 Salzburg Austria
## 2                           Ikarus 2019 47.79536  13.00695 Salzburg Austria
## 3                      Mraz & Sohn 2019 48.23129  16.37637     Wien Austria
## 4              Konstantin Filippou 2019 48.21056  16.37996     Wien Austria
## 5 Silvio Nickol Gourmet Restaurant 2019 48.20558  16.37693     Wien Austria
## 6          Steirereck im Stadtpark 2019 48.20229  16.38050     Wien Austria
##   zipCode        cuisine price
## 1    5020       Creative $$$$$
## 2    5020       Creative $$$$$
## 3    1200       Creative $$$$$
## 4    1010 Modern cuisine $$$$$
## 5    1010 Modern cuisine $$$$$
## 6    1030       Creative $$$$$
##                                                                                        url
## 1    https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/senns-restaurant
## 2              https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/ikarus
## 3                        https://guide.michelin.com/at/en/vienna/wien/restaurant/mraz-sohn
## 4              https://guide.michelin.com/at/en/vienna/wien/restaurant/konstantin-filippou
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/silvio-nickol-gourmet-restaurant
## 6          https://guide.michelin.com/at/en/vienna/wien/restaurant/steirereck-im-stadtpark
head(threestar)
##                 name year latitude  longitude                city     region
## 1             Amador 2019 48.25406   16.35915                Wien    Austria
## 2            Manresa 2019 37.22761 -121.98071 South San Francisco California
## 3               Benu 2019 37.78521 -122.39876       San Francisco California
## 4             Quince 2019 37.79762 -122.40337       San Francisco California
## 5      Atelier Crenn 2019 37.79835 -122.43586       San Francisco California
## 6 The French Laundry 2019 38.40443 -122.36474       San Francisco California
##   zipCode      cuisine price
## 1    1190     Creative $$$$$
## 2   95030 Contemporary  $$$$
## 3   94105        Asian  $$$$
## 4   94133 Contemporary  $$$$
## 5   94123 Contemporary  $$$$
## 6   94599 Contemporary  $$$$
##                                                                                       url
## 1                          https://guide.michelin.com/at/en/vienna/wien/restaurant/amador
## 2      https://guide.michelin.com/us/en/california/south-san-francisco/restaurant/manresa
## 3               https://guide.michelin.com/us/en/california/san-francisco/restaurant/benu
## 4             https://guide.michelin.com/us/en/california/san-francisco/restaurant/quince
## 5      https://guide.michelin.com/us/en/california/san-francisco/restaurant/atelier-crenn
## 6 https://guide.michelin.com/us/en/california/san-francisco/restaurant/the-french-laundry

Looks good, but these are three separate dataframes. We want to combine them into one. But first, we want to add an extra column to each one to indicate the number of stars.

onestar$star <- 1
twostar$star <- 2
threestar$star <- 3

Let’s see what they look like now.

head(onestar)
##            name year latitude longitude           city  region zipCode
## 1  Kilian Stuba 2019 47.34858  10.17114 Kleinwalsertal Austria   87568
## 2 Pfefferschiff 2019 47.83787  13.07917       Hallwang Austria    5300
## 3     Esszimmer 2019 47.80685  13.03409       Salzburg Austria    5020
## 4    Carpe Diem 2019 47.80001  13.04006       Salzburg Austria    5020
## 5        Edvard 2019 48.21650  16.36852           Wien Austria    1010
## 6      Das Loft 2019 48.21272  16.37931           Wien Austria    1020
##           cuisine price
## 1        Creative $$$$$
## 2 Classic cuisine $$$$$
## 3        Creative $$$$$
## 4  Market cuisine $$$$$
## 5  Modern cuisine  $$$$
## 6  Modern cuisine $$$$$
##                                                                                  url
## 1 https://guide.michelin.com/at/en/vorarlberg/kleinwalsertal/restaurant/kilian-stuba
## 2 https://guide.michelin.com/at/en/salzburg-region/hallwang/restaurant/pfefferschiff
## 3     https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/esszimmer
## 4    https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/carpe-diem
## 5                     https://guide.michelin.com/at/en/vienna/wien/restaurant/edvard
## 6                   https://guide.michelin.com/at/en/vienna/wien/restaurant/das-loft
##   star
## 1    1
## 2    1
## 3    1
## 4    1
## 5    1
## 6    1
head(twostar)
##                               name year latitude longitude     city  region
## 1                 SENNS.Restaurant 2019 47.83636  13.06389 Salzburg Austria
## 2                           Ikarus 2019 47.79536  13.00695 Salzburg Austria
## 3                      Mraz & Sohn 2019 48.23129  16.37637     Wien Austria
## 4              Konstantin Filippou 2019 48.21056  16.37996     Wien Austria
## 5 Silvio Nickol Gourmet Restaurant 2019 48.20558  16.37693     Wien Austria
## 6          Steirereck im Stadtpark 2019 48.20229  16.38050     Wien Austria
##   zipCode        cuisine price
## 1    5020       Creative $$$$$
## 2    5020       Creative $$$$$
## 3    1200       Creative $$$$$
## 4    1010 Modern cuisine $$$$$
## 5    1010 Modern cuisine $$$$$
## 6    1030       Creative $$$$$
##                                                                                        url
## 1    https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/senns-restaurant
## 2              https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/ikarus
## 3                        https://guide.michelin.com/at/en/vienna/wien/restaurant/mraz-sohn
## 4              https://guide.michelin.com/at/en/vienna/wien/restaurant/konstantin-filippou
## 5 https://guide.michelin.com/at/en/vienna/wien/restaurant/silvio-nickol-gourmet-restaurant
## 6          https://guide.michelin.com/at/en/vienna/wien/restaurant/steirereck-im-stadtpark
##   star
## 1    2
## 2    2
## 3    2
## 4    2
## 5    2
## 6    2
head(threestar)
##                 name year latitude  longitude                city     region
## 1             Amador 2019 48.25406   16.35915                Wien    Austria
## 2            Manresa 2019 37.22761 -121.98071 South San Francisco California
## 3               Benu 2019 37.78521 -122.39876       San Francisco California
## 4             Quince 2019 37.79762 -122.40337       San Francisco California
## 5      Atelier Crenn 2019 37.79835 -122.43586       San Francisco California
## 6 The French Laundry 2019 38.40443 -122.36474       San Francisco California
##   zipCode      cuisine price
## 1    1190     Creative $$$$$
## 2   95030 Contemporary  $$$$
## 3   94105        Asian  $$$$
## 4   94133 Contemporary  $$$$
## 5   94123 Contemporary  $$$$
## 6   94599 Contemporary  $$$$
##                                                                                       url
## 1                          https://guide.michelin.com/at/en/vienna/wien/restaurant/amador
## 2      https://guide.michelin.com/us/en/california/south-san-francisco/restaurant/manresa
## 3               https://guide.michelin.com/us/en/california/san-francisco/restaurant/benu
## 4             https://guide.michelin.com/us/en/california/san-francisco/restaurant/quince
## 5      https://guide.michelin.com/us/en/california/san-francisco/restaurant/atelier-crenn
## 6 https://guide.michelin.com/us/en/california/san-francisco/restaurant/the-french-laundry
##   star
## 1    3
## 2    3
## 3    3
## 4    3
## 5    3
## 6    3

Now it’s time to combine the datasets.

allstar <- rbind(onestar, twostar, threestar)
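
The label-then-stack steps above can also be folded into a single pass. The following is only a sketch on made-up toy tables (`one`, `two`, `three` and their values are assumptions, not the real CSV contents):

```r
# Toy stand-ins for the three per-star tables (values are made up).
one   <- data.frame(name = c("A", "B"), price = c("$$", "$$$"))
two   <- data.frame(name = "C", price = "$$$$")
three <- data.frame(name = "D", price = "$$$$$")

# Name each table by its star count, then tag and stack them in one pass.
tables <- list("1" = one, "2" = two, "3" = three)
tagged <- Map(function(df, stars) {
  df$star <- as.numeric(stars)  # same role as onestar$star <- 1, and so on
  df
}, tables, names(tables))

combined <- do.call(rbind, tagged)
rownames(combined) <- NULL      # drop the "1.1", "1.2", ... row names

combined$star
```

The design choice here is simply to keep the star count next to the table it belongs to, so a table cannot be tagged with the wrong number by a copy-paste slip.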

Unsurprisingly, there are far more one-star restaurants than two- or three-star restaurants.

nrow(allstar)
## [1] 695
nrow(subset(allstar, star==1))
## [1] 549
nrow(subset(allstar, star==2))
## [1] 110
nrow(subset(allstar, star==3))
## [1] 36
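
To put those counts in proportion, here is a small worked calculation using only the numbers printed above (the vector name `star_counts` is just an illustrative choice):

```r
# Counts reported by the nrow() calls above.
star_counts <- c(one = 549, two = 110, three = 36)

# The three groups should add up to the full table.
stopifnot(sum(star_counts) == 695)

# Share of each star level, in percent.
star_share <- round(100 * star_counts / sum(star_counts), 1)
star_share
# one: 79.0, two: 15.8, three: 5.2
```

On the real data, `table(allstar$star)` should give the same three counts in one call.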

How does the spread of one-, two-, and three-star restaurants vary between countries? Let’s find out.

First, we want to group our data by country. However, the data is organized into “regions”, and in the US those “regions” are states and sometimes even cities. We want all of them to be just the US. In addition, some large cities outside the US are also listed as regions; we want to replace those with their country’s name.

head(allstar)
##            name year latitude longitude           city  region zipCode
## 1  Kilian Stuba 2019 47.34858  10.17114 Kleinwalsertal Austria   87568
## 2 Pfefferschiff 2019 47.83787  13.07917       Hallwang Austria    5300
## 3     Esszimmer 2019 47.80685  13.03409       Salzburg Austria    5020
## 4    Carpe Diem 2019 47.80001  13.04006       Salzburg Austria    5020
## 5        Edvard 2019 48.21650  16.36852           Wien Austria    1010
## 6      Das Loft 2019 48.21272  16.37931           Wien Austria    1020
##           cuisine price
## 1        Creative $$$$$
## 2 Classic cuisine $$$$$
## 3        Creative $$$$$
## 4  Market cuisine $$$$$
## 5  Modern cuisine  $$$$
## 6  Modern cuisine $$$$$
##                                                                                  url
## 1 https://guide.michelin.com/at/en/vorarlberg/kleinwalsertal/restaurant/kilian-stuba
## 2 https://guide.michelin.com/at/en/salzburg-region/hallwang/restaurant/pfefferschiff
## 3     https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/esszimmer
## 4    https://guide.michelin.com/at/en/salzburg-region/salzburg/restaurant/carpe-diem
## 5                     https://guide.michelin.com/at/en/vienna/wien/restaurant/edvard
## 6                   https://guide.michelin.com/at/en/vienna/wien/restaurant/das-loft
##   star
## 1    1
## 2    1
## 3    1
## 4    1
## 5    1
## 6    1
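
Note that `head()` only returns the first rows and does not filter, which is why the preview above shows Austrian restaurants. To list the values that `region` takes, or to preview a single region, something like the following sketch could be used. The toy data frame is made up for illustration:

```r
# Made-up rows standing in for allstar.
toy <- data.frame(
  name   = c("r1", "r2", "r3", "r4"),
  region = c("Austria", "California", "California", "New York City"),
  stringsAsFactors = FALSE
)

# Which region labels occur at all?
regions <- sort(unique(toy$region))
regions

# Preview only the rows for one region.
california <- subset(toy, region == "California")
head(california)
```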

So now, we’re going to create a new column called ‘country’ in which we collapse all of the US states and cities into one value, and map the remaining city-level regions to their countries.

allstar <- allstar %>% mutate(country = region)
##every US state name (base R's built-in state.name vector) plus the city-level US regions becomes "USA"
us_regions <- c(state.name, "Washington DC", "New York City", "Chicago")
allstar$country[allstar$country %in% us_regions] <- "USA"
allstar$country[allstar$country=="Taipei"] <- "Taiwan"
allstar$country[allstar$country=="Rio de Janeiro"] <- "Brazil"
allstar$country[allstar$country=="Sao Paulo"] <- "Brazil"
allstar$country[allstar$country=="Macau"] <- "China"
allstar$country[allstar$country=="United Kingdom"] <- "UK"

Now that we have all of that sorted out, let’s take a look at question #1.

Question 1: What are the Hottest Michelin Countries?

First, let’s look at which countries have the highest number of Michelin star restaurants.

allstar$country %>%
  unique() %>% 
  length()
## [1] 20
michelin_count <-
  allstar %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 20, wt = count)
michelin_count %>% arrange(desc(count))
## # A tibble: 20 x 2
##    country        count
##    <chr>          <int>
##  1 USA              202
##  2 UK               162
##  3 Hong Kong         63
##  4 Singapore         39
##  5 Denmark           28
##  6 South Korea       26
##  7 Thailand          26
##  8 Taiwan            24
##  9 Sweden            22
## 10 Austria           19
## 11 China             19
## 12 Brazil            18
## 13 Ireland           14
## 14 Norway             8
## 15 Finland            6
## 16 Hungary            6
## 17 Croatia            5
## 18 Greece             4
## 19 Czech Republic     2
## 20 Poland             2

The US leads with 202 starred restaurants, followed by the UK with 162. Now, which countries have the highest number of three-star, two-star, and one-star Michelin restaurants?

#three stars
three_star <- allstar %>% filter(star ==3)

three_star$country %>%
  unique() %>% 
  length()
## [1] 10
three_count <-
 three_star  %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 10, wt = count)
three_count %>% arrange(desc(count))
## # A tibble: 10 x 2
##    country     count
##    <chr>       <int>
##  1 USA            14
##  2 Hong Kong       7
##  3 UK              5
##  4 China           3
##  5 South Korea     2
##  6 Austria         1
##  7 Denmark         1
##  8 Norway          1
##  9 Sweden          1
## 10 Taiwan          1
#two stars
two_star <- allstar %>% filter(star ==2)

two_star$country %>%
  unique() %>% 
  length()
## [1] 15
two_count <-
 two_star  %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 15, wt = count)
two_count %>% arrange(desc(count))
## # A tibble: 15 x 2
##    country     count
##    <chr>       <int>
##  1 USA            33
##  2 UK             19
##  3 Hong Kong      12
##  4 Austria         6
##  5 China           5
##  6 Denmark         5
##  7 Singapore       5
##  8 South Korea     5
##  9 Sweden          5
## 10 Taiwan          5
## 11 Thailand        4
## 12 Brazil          3
## 13 Greece          1
## 14 Hungary         1
## 15 Ireland         1
#one stars
one_star <- allstar %>% filter(star ==1)

one_star$country %>%
  unique() %>% 
  length()
## [1] 20
one_count <-
 one_star  %>% group_by(country) %>% summarize(count = n()) %>% top_n(n = 20, wt = count)
one_count %>% arrange(desc(count))
## # A tibble: 20 x 2
##    country        count
##    <chr>          <int>
##  1 USA              155
##  2 UK               138
##  3 Hong Kong         44
##  4 Singapore         34
##  5 Denmark           22
##  6 Thailand          22
##  7 South Korea       19
##  8 Taiwan            18
##  9 Sweden            16
## 10 Brazil            15
## 11 Ireland           13
## 12 Austria           12
## 13 China             11
## 14 Norway             7
## 15 Finland            6
## 16 Croatia            5
## 17 Hungary            5
## 18 Greece             3
## 19 Czech Republic     2
## 20 Poland             2
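
The three per-star tables above can also be viewed side by side as one cross-tabulation. A minimal sketch on made-up rows (the toy values are assumptions; on the real data the call would be `table(allstar$country, allstar$star)`):

```r
# Made-up rows standing in for allstar.
toy <- data.frame(
  country = c("USA", "USA", "USA", "UK", "UK", "Norway"),
  star    = c(1, 1, 3, 1, 2, 3)
)

# Rows are countries, columns are star levels, cells are restaurant counts.
by_star <- table(toy$country, toy$star)

# Put the countries with the most starred restaurants first.
by_star <- by_star[order(-rowSums(by_star)), ]
by_star
```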

Now let’s plot it.

world_map <- map_data("world")

michelin_map <- world_map %>%
    left_join(michelin_count, by = c("region" = "country"))

ggplot() +
  geom_polygon(data=michelin_map, aes(x=long, y=lat, group=group, fill= count)) +
  scale_fill_distiller(palette = "YlOrBr", direction = 1) + 
  coord_quickmap() + labs(title = "2019 Michelin Star Restaurants by Country") +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank())
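
Because `left_join()` matches on exact strings, any country whose name does not appear as a `region` in the map data never reaches the plot. A quick way to check for that is sketched below; the helper `unmatched_regions` and both name vectors are hypothetical and only illustrate the idea:

```r
# Return the country names that have no exact match among the map regions.
unmatched_regions <- function(countries, map_regions) {
  sort(setdiff(unique(countries), unique(map_regions)))
}

# Made-up example: "Atlantis" has no counterpart in the map names.
countries   <- c("USA", "UK", "Atlantis", "USA")
map_regions <- c("USA", "UK", "France")

missing_from_map <- unmatched_regions(countries, map_regions)
missing_from_map

# With the objects from this analysis the call would be:
# unmatched_regions(michelin_count$country, world_map$region)
```

An empty result would mean every country in the count table can be drawn.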

In this dataset, starred restaurants appear in only 20 countries in 2019. The US and the UK take the lead, while the other countries trail far behind.

Question 2: What are the most common cuisines in one-, two-, and three-star Michelin restaurants? What are the most common cuisines by country?

Next, we want to look at the diversity of cuisines in Michelin Star restaurants in 2019.

allstar$cuisine %>%
  unique() %>% 
  length()
## [1] 70
##just to get rid of some repetition in the data
allstar$cuisine[allstar$cuisine=="Modern cuisine"] <- "Contemporary"

cuisine_data <-
  allstar %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 20, wt = count)
cuisine_data %>% arrange(desc(count))
## # A tibble: 22 x 2
##    cuisine             count
##    <chr>               <int>
##  1 Contemporary          183
##  2 Japanese               54
##  3 Creative               46
##  4 Cantonese              40
##  5 Modern British         38
##  6 French                 29
##  7 Innovative             28
##  8 Italian                21
##  9 French contemporary    19
## 10 Sushi                  17
## # … with 12 more rows
#three stars
three_star$cuisine[three_star$cuisine=="Modern cuisine"] <- "Contemporary"
three_star$cuisine %>%
  unique() %>% 
  length()
## [1] 15
three_cuisine <-
 three_star  %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 16, wt = count)
three_cuisine %>% arrange(desc(count))
## # A tibble: 15 x 2
##    cuisine             count
##    <chr>               <int>
##  1 Contemporary           12
##  2 Cantonese               4
##  3 Creative                3
##  4 French                  3
##  5 French contemporary     2
##  6 Japanese                2
##  7 Korean                  2
##  8 American                1
##  9 Asian                   1
## 10 Chinese                 1
## 11 Classic French          1
## 12 Innovative              1
## 13 Italian                 1
## 14 Seafood                 1
## 15 Sushi                   1
#two stars
two_star$cuisine[two_star$cuisine=="Modern cuisine"] <- "Contemporary"
two_star$cuisine %>%
  unique() %>% 
  length()
## [1] 27
two_cuisine <-
 two_star  %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 28, wt = count)
two_cuisine %>% arrange(desc(count))
## # A tibble: 27 x 2
##    cuisine             count
##    <chr>               <int>
##  1 Contemporary           27
##  2 Creative               14
##  3 Japanese                8
##  4 French                  7
##  5 French contemporary     7
##  6 Innovative              6
##  7 Cantonese               5
##  8 Modern British          4
##  9 Sushi                   4
## 10 creative                3
## # … with 17 more rows
#one stars
one_star$cuisine[one_star$cuisine=="Modern cuisine"] <- "Contemporary"
one_star$cuisine %>%
  unique() %>% 
  length()
## [1] 66
one_cuisine <-
 one_star  %>% group_by(cuisine) %>% summarize(count = n()) %>% top_n(n = 67, wt = count)
one_cuisine %>% arrange(desc(count))
## # A tibble: 66 x 2
##    cuisine         count
##    <chr>           <int>
##  1 Contemporary      144
##  2 Japanese           44
##  3 Modern British     34
##  4 Cantonese          31
##  5 Creative           29
##  6 Innovative         21
##  7 French             19
##  8 Italian            19
##  9 Classic cuisine    14
## 10 Californian        13
## # … with 56 more rows
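
The two-star table above lists both “Creative” and “creative”, so some cuisine labels differ only in capitalization and are counted separately. One possible cleanup is sketched here; the helper `normalize_label` is hypothetical and the labels are a made-up sample:

```r
# Trim stray whitespace and capitalize the first letter of each label.
normalize_label <- function(x) {
  x <- trimws(x)
  paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x)))
}

labels  <- c("Creative", "creative", " Creative ", "Modern British")
cleaned <- normalize_label(labels)

# All three spellings of "Creative" now fall into one group.
table(cleaned)
```

Only the first letter is changed, so multi-word labels such as “French contemporary” keep their original form.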

Now let’s plot this data.

ggplot(data=cuisine_data, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="orange") + 
labs(title="Most Popular Cuisine in Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") + 
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))

“Contemporary” restaurants seem to be most common within Michelin Star restaurants awarded in 2019. How does this differ by star count?

Let’s look at the data for three, two, and one star.

##Three stars
ggplot(data=three_cuisine, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="pink") + 
labs(title="Most Popular Cuisine in THREE STAR Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") + 
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))

##two stars
ggplot(data=two_cuisine, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="light blue") + 
labs(title="Most Popular Cuisine in TWO STAR Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") + 
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))

##one star
ggplot(data=one_cuisine, aes(x=reorder(cuisine, -count), y=count)) + geom_bar(stat="identity", fill="light green") + 
labs(title="Most Popular Cuisine in ONE STAR Michelin Star Restaurants awarded in 2019", x="Cuisine", y="Cuisine Frequency") + 
theme(axis.text.x = element_text(angle=90), text = element_text(size = 10))

We can see here that “Contemporary” restaurants remain the most common at all three star counts. It’s not surprising that the selection for three stars is much smaller, since there are far fewer three-star restaurants. The two-star selection is more diverse, and the one-star restaurants serve the greatest variety of cuisines: Contemporary remains, by far, the most common, but there is a long tail of other cuisines. This could mean that the standards for getting one Michelin star are less rigorous, and that three-star restaurants occupy a niche in and of itself, wherein certain cuisines may be considered more prestigious or “star-worthy”.
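
Since the three groups differ so much in size, it helps to compare them as rates rather than raw counts. The sketch below uses only numbers printed earlier (restaurants per star level, “Contemporary” counts, and distinct cuisines); the column names are illustrative choices:

```r
# Figures taken from the tables above.
stars        <- c("one", "two", "three")
restaurants  <- c(549, 110, 36)   # restaurants per star level
contemporary <- c(144, 27, 12)    # "Contemporary" restaurants per level
cuisines     <- c(66, 27, 15)     # distinct cuisine labels per level

summary_tbl <- data.frame(
  stars,
  contemporary_pct = round(100 * contemporary / restaurants, 1),
  cuisines_per_100 = round(100 * cuisines / restaurants, 1)
)
summary_tbl
# contemporary_pct: 26.2, 24.5, 33.3
# cuisines_per_100: 12.0, 24.5, 41.7
```

Read this way, “Contemporary” makes up roughly a quarter to a third of every group, and the shorter cuisine list at three stars may partly reflect that there are only 36 such restaurants rather than a narrower standard.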

But is this standard of “star-worthy” restaurants the same for all Michelin Star restaurants in 2019?

cuisine_country <- allstar %>% group_by(country) %>% count(cuisine) %>% top_n(1)  
## Selecting by n
cuisine_country %>% arrange(desc(cuisine)) 
## # A tibble: 27 x 3
## # Groups:   country [20]
##    country     cuisine                 n
##    <chr>       <chr>               <int>
##  1 Thailand    Thai                   10
##  2 Singapore   Sushi                   5
##  3 Taiwan      Sushi                   4
##  4 Greece      Seafood                 1
##  5 Greece      Mediterranean           1
##  6 South Korea Korean                  8
##  7 Brazil      Japanese                6
##  8 Singapore   Innovative              5
##  9 Taiwan      Innovative              4
## 10 Singapore   French contemporary     5
## # … with 17 more rows
cuisine_country$cuisine %>%
  unique() %>% 
  length()
## [1] 12
cuisine_map <- world_map %>%
    left_join(cuisine_country, by = c("region" = "country"))

ggplot() +
  geom_polygon(data=cuisine_map, aes(x=long, y=lat, group=group, fill=cuisine)) +
  coord_quickmap() + labs(title = "2019 Michelin Star Top Cuisines by Country") +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank()) 

##Europe
ggplot() +
  geom_polygon(data=cuisine_map, aes(x=long, y=lat, group=group, fill=cuisine)) +
  coord_quickmap() + labs(title = "2019 Michelin Star Top Cuisines in Europe") +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank()) + scale_x_continuous(limits = c(-25, 50)) +
scale_y_continuous(limits = c(25, 75))

##Asia
ggplot() +
  geom_polygon(data=cuisine_map, aes(x=long, y=lat, group=group, fill=cuisine)) +
  coord_quickmap() + labs(title = "2019 Michelin Star Top Cuisines in Asia") +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), rect = element_blank()) + scale_x_continuous(limits = c(50, 150)) +
scale_y_continuous(limits = c(0, 75))

“Contemporary” restaurants dominate when we look at all restaurants as a whole. However, when looking at specific countries, we can see that there is more variety in the cuisines that each country favors. For some countries, like Thailand, it’s unsurprising to see that the local cuisine (Thai) is the most common. This somewhat undercuts our “star-worthy” cuisine theory.
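
One caveat about the per-country table: it has 27 rows for 20 countries because `top_n(1)` keeps ties (Greece, for example, appears twice with n = 1), so a tied country is joined to the map more than once. If exactly one top cuisine per country is wanted, a sketch along these lines could work. It assumes dplyr 1.0.0 or later for `slice_max()`, uses made-up rows, and breaks ties arbitrarily by row order:

```r
library(dplyr)

# Made-up rows: Greece has a tie, Thailand has a clear winner.
toy <- data.frame(
  country = c("Greece", "Greece", "Thailand", "Thailand", "Thailand"),
  cuisine = c("Seafood", "Mediterranean", "Thai", "Thai", "French")
)

top_cuisine <- toy %>%
  count(country, cuisine) %>%                 # restaurants per country and cuisine
  group_by(country) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%  # keep a single row per country
  ungroup()

top_cuisine
```

Dropping ties hides information, so an alternative would be to keep the tied rows and label such countries as “tied” on the map.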

Question 3: Is there a positive correlation between stars and price?

Now, we want to know whether more stars make a restaurant more expensive. Let’s take a look.

##first, let's do some data cleaning. There is no price data available for restaurants in the UK and Ireland, so for the sake of this plot, let's remove those rows.
allstar_clean <- allstar %>% filter(country != "UK" & country != "Ireland" )

##In addition, prices are denoted in dollar signs, which may get confusing. Let's change them to "price scores" based on the number of dollar signs
allstar_clean <- allstar_clean %>% mutate(price_score = price)
allstar_clean$price_score[allstar_clean$price_score=="$$$$$"] <- 5
allstar_clean$price_score[allstar_clean$price_score=="$$$$"] <- 4
allstar_clean$price_score[allstar_clean$price_score=="$$$"] <- 3
allstar_clean$price_score[allstar_clean$price_score=="$$"] <- 2
allstar_clean$price_score[allstar_clean$price_score=="$"] <- 1

allstar_clean$price_score <- as.numeric(allstar_clean$price_score)
allstar_clean$star <- as.factor(allstar_clean$star)

ggplot(data = allstar_clean, aes( x = star, y = price_score)) + 
    geom_violin() + 
    geom_point(size = 2, position="jitter") + 
    ggtitle("Price Scores vs. Star Count for 2019 Michelin Star Restaurants") 

Very interesting. There is a wide spread of price points for one-star restaurants, whereas two-star and three-star restaurants tend to be more expensive on average, with much less spread.
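
The violin plot shows the spread; to put a single number on the question of correlation, a rank correlation is one option, since both stars and price scores are ordered categories. This is only a sketch on made-up rows, not a result from the Michelin data. It also assumes the price score can be taken as the number of characters in the price string, and it needs `star` as a number rather than the factor used for plotting:

```r
# Made-up rows standing in for allstar_clean.
toy <- data.frame(
  star  = c(1, 1, 1, 1, 2, 2, 3, 3),
  price = c("$$", "$$$", "$$$$", "$$$$$", "$$$$", "$$$$$", "$$$$$", "$$$$$"),
  stringsAsFactors = FALSE
)

# "$$$$" has four characters, so nchar() gives the price score directly.
toy$price_score <- nchar(as.character(toy$price))

# Spearman's rho works on ranks, which suits ordered categories with ties.
rho <- cor(toy$star, toy$price_score, method = "spearman")
rho
```

A value near 1 would indicate that higher star counts go with higher price scores; a value near 0 would indicate little monotonic relationship.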

Conclusion

I must note again that the conclusions I draw apply to this dataset only. I have no way of knowing how this data relates to past data on Michelin Star restaurants or to Michelin Star restaurants as a whole. Originally, my project was meant to look at Michelin Star restaurants in general. However, I later discovered that the dataset only covers restaurants that were awarded stars in 2019, and I could not locate data on past years.

From this dataset, we see that the US and the UK dominate the Michelin Star restaurants awarded in 2019, and this may have colored our perception of which cuisines are most preferred globally. Because “Contemporary” restaurants were, by far, the most common in 2019, it was surprising to see the great variety of cuisines preferred in each country. This is less surprising when you consider that the US leads in Michelin Stars awarded, and that in the US, “Contemporary” restaurants were the most popular. Damian had suggested that I look at which cuisines were preferred in each region, and I’m really glad he did. Without looking at each individual country, we would have had the impression that “Contemporary” restaurants were preferred everywhere in 2019. Looking at specific countries, we see that this is not necessarily the case, and if we were to study the cuisines of Michelin Star restaurants further, we should do so on a country-by-country basis.

Lastly, when we look at price, we see that, in general, two- and three-star restaurants tended to be more expensive in 2019, while one-star restaurants vary widely in price. I chose violin plots for this data instead of the proposed line of best fit because, upon seeing the data, I felt it was more important to represent the spread of prices.