Original


Source: (Pereira, 2020)


Objective

Ramen (noodles) is one of the most popular foods in the world, and it is known for the variety of dishes and styles found in different regions. Given the audience's love for ramen, the visualisation shows which countries have the highest number of 5-star rated ramen restaurants, across 23 countries.

The data is global in scope, which encourages healthy competition within each country's hospitality sector to be included among the 5-star rated restaurants. It also appeals to food bloggers and to travellers who would like to taste the different styles and varieties when they go overseas.

The visualisation chosen had the following three main issues:

*PIE CHART

The first and biggest problem with the original visualisation is the choice of a pie chart: it has so many slices that it becomes ineffective. A pie chart uses angles to represent proportions, so the audience is forced to compare slices against each other constantly, which is tedious. The large number of slices also makes it difficult to extract the key information and gives low accuracy for numeric data, meaning the exact values cannot be read easily.

For example: after the UK, the remaining countries are so crowded on the chart that their values cannot be extracted or compared. As the number of categories increases, it becomes harder to find a trend in the pie chart.

This shows that a pie chart is not the right choice for this data: with so many categories, most slices look similar and exact values cannot be extracted.
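To make the angle problem concrete, here is a minimal sketch in base R. It assumes only the ten country counts that are printed in the Code section of this report; the other 13 countries are not shown there, so they are left out. The name `five_subset` is introduced here for illustration only.

```r
# Counts for the ten countries printed in the Code section
# (the remaining 13 countries are not shown there, so they are omitted).
five_subset <- data.frame(
  Country = c("Australia", "Brazil", "Cambodia", "Canada", "China",
              "Germany", "Hong Kong", "India", "Indonesia", "Japan"),
  count   = c(1, 1, 2, 2, 12, 1, 22, 2, 23, 74)
)

# Angle (in degrees) each slice would occupy in a pie chart of this subset
five_subset$angle <- 360 * five_subset$count / sum(five_subset$count)

# Sort by angle to see how close neighbouring slices are
five_subset[order(five_subset$angle), ]
```

Even in this subset, Hong Kong (22) and Indonesia (23) differ by one rating, which is 360 / 140, or roughly 2.6 degrees. With all 23 countries in the chart the total is larger, so the same difference would span an even smaller angle.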

*AREA AND SIZE ISSUE

The pie chart has multiple slices that look similar in area, which leaves the audience struggling to identify the main trend or pattern. Comparing multiple countries is difficult because no clear differences stand out. It also has a couple of very small slices that cannot even be told apart.

For example: Hong Kong and Thailand appear to have almost the same area, and it is very difficult to distinguish between them. Finding a trend among countries is also difficult: because the exact values are unknown, we cannot work out the ratio for Japan and USA or for Singapore and Taiwan.

This leads to poor outcomes, as we cannot find a trend, pattern or ratio to support any predictions.

*COLOR ISSUE

The pie chart has so many categories that many colors were needed to cover all the countries. However, this resulted in poor aesthetics for the pie chart. With so many colors and so much data, the chart is difficult to understand easily. As a result, the audience cannot focus on the main part and is distracted from the intended message.

For example: with 23 countries around one pie chart, it becomes difficult to keep track of them all, which can mislead the reader, and the excess of colors makes the chart difficult to read.

Because there are so many pieces of data, the pie chart bombards the viewer visually, which can convey incorrect information, and the excessive use of color gives it an unattractive appearance.

Reference

*Pereira, I. (2020). Data Visualization | Purkinje Delirium. Retrieved 21 September 2020, from https://inespereira.com/post/data-visualization/

Code

The following code was used to fix the issues identified in the original.

#packages
library(readr)
library(ggplot2)
library(dplyr)
library(tidyverse)

#Importing dataset
ramen <- read_csv("ramen-ratings.csv")
ramen$Stars <- as.numeric(ramen$Stars)

# Creating a new variable with the required data:
# the number of 5-star ratings per country
five <- ramen %>%
  select(Country, Stars) %>%
  filter(Stars == 5) %>%
  group_by(Country) %>%
  summarise(count = n())
five
## # A tibble: 23 x 2
##    Country   count
##    <chr>     <int>
##  1 Australia     1
##  2 Brazil        1
##  3 Cambodia      2
##  4 Canada        2
##  5 China        12
##  6 Germany       1
##  7 Hong Kong    22
##  8 India         2
##  9 Indonesia    23
## 10 Japan        74
## # ... with 13 more rows
# Ordering the countries by count so that the bars are sorted in the plot
five$Country <- five$Country %>% factor(levels = five$Country[order(five$count)])

p1 <- ggplot(five, aes(x = Country, y = count)) +
  geom_bar(stat = "identity", fill = "#FFAAAA", alpha = 0.6, width = 0.9) +
  coord_flip() +
  theme_bw() +
  labs(title = "Popularity of ramen in different countries",
       subtitle = "(5-star rating)",
       y = "Number of 5-star rating restaurants",
       x = "Countries") +
  geom_text(aes(label = count), hjust = -0.5, size = 3) +
  theme(plot.title = element_text(size = 14, face = "bold"),
        axis.title = element_text(size = 10, face = "bold"),
        axis.text = element_text(size = 8, face = "bold"))
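
As a possible variation on the plot code above, the sketch below orders the bars with `reorder()` inside `aes()` instead of a separate `factor()` step, and adds some headroom on the count axis so that the value label next to the longest bar is less likely to be clipped. It is an illustration only, not the code used for the reconstruction: `five_subset` and `p2` are names introduced here, and the data is limited to the ten countries printed in the tibble output above.

```r
library(ggplot2)

# Illustrative subset: the ten countries printed in the tibble output above
five_subset <- data.frame(
  Country = c("Australia", "Brazil", "Cambodia", "Canada", "China",
              "Germany", "Hong Kong", "India", "Indonesia", "Japan"),
  count   = c(1, 1, 2, 2, 12, 1, 22, 2, 23, 74)
)

# reorder() sorts the bars by count without a separate factor() step
p2 <- ggplot(five_subset, aes(x = reorder(Country, count), y = count)) +
  # geom_col() is equivalent to geom_bar(stat = "identity")
  geom_col(fill = "#FFAAAA", alpha = 0.6, width = 0.9) +
  geom_text(aes(label = count), hjust = -0.5, size = 3) +
  # Add 10% headroom past the longest bar for its label
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  coord_flip() +
  theme_bw() +
  labs(x = "Countries", y = "Number of 5-star rating restaurants")

p2
```

The `expansion()` helper requires ggplot2 3.3.0 or later; the 10% value is an arbitrary choice and may need adjusting for the final figure size.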

Data Reference

Reconstruction

The following plot fixes the main issues in the original.

In the reconstruction, a bar graph is used to display the large amount of data in a clear and systematic manner that reveals the values for comparison. The exact values give a clear idea of the comparison and pattern, which can be used for analysis. Lastly, the bar graph reduces the visual bombardment, and the result looks cleaner and more informative.