ggplot2 is a package in the tidyverse that lets us visualize our data and present our findings more clearly. In this document I show how to use ggplot2 features beyond bar plots and line charts. The data I use to demo these features is a set of ramen ratings, which lets me show how ggplot2 graphs both discrete and continuous variables.
library(tidyverse)  # loads ggplot2, dplyr and tidyr, which the code below relies on
ramen_ratings <- read.csv("https://raw.githubusercontent.com/moiyajosephs/Data607-Project2/main/ramen-ratings.csv")
When plotting two discrete variables, geom_count is recommended. geom_count is a variation of geom_point that maps the frequency of each observation to the size of its point. The legend shows the dot sizes: the larger the dot, the more often that combination of values occurs in the data.
ggplot(ramen_ratings, aes(Brand, Style, color = Style)) + geom_count() + theme(axis.text.x = element_blank() )
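To make the size mapping concrete, here is a minimal sketch on a made-up toy data frame. The values and the names toy, p and computed are assumptions for illustration, not part of the ramen data. It uses ggplot2's layer_data() to inspect the counts that geom_count computes for each Brand and Style combination.

```r
library(ggplot2)

# Toy data (hypothetical values, not taken from the ramen dataset)
toy <- data.frame(
  Brand = c("A", "A", "A", "B", "B", "C"),
  Style = c("Cup", "Cup", "Pack", "Pack", "Pack", "Cup")
)

# One point per distinct (Brand, Style) pair; size encodes how often it occurs
p <- ggplot(toy, aes(Brand, Style)) +
  geom_count()

# layer_data() returns the computed values; column `n` is the count per point
computed <- layer_data(p)
computed[, c("x", "y", "n")]
```

Each row of computed corresponds to one dot in the plot, so duplicated observations collapse into a single, larger point instead of overplotting.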
geom_col is similar to geom_bar in that both draw bar charts. The difference is that geom_col takes the height of each bar from a value in the data, whereas geom_bar sets the height to the number of times each x value occurs.
Below I plot each ramen brand against the ratings it received.
ramen_ratings$Stars <- as.numeric(ramen_ratings$Stars)
## Warning: NAs introduced by coercion
ggplot(head(ramen_ratings,10), aes(Brand, Stars, fill = Style) ) + geom_col() + theme(axis.text.x = element_text(angle = -30, vjust = 1, hjust = 0))
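The difference between the two geoms can be checked on a tiny example. This is a sketch with hypothetical values (the toy data frame and the names p_bar, p_col, bar_heights and col_data are assumptions), again using layer_data() to read back the bar heights that ggplot2 computes.

```r
library(ggplot2)

# Toy data (hypothetical): brand A appears twice, brand B once
toy <- data.frame(
  Brand = c("A", "A", "B"),
  Stars = c(4.0, 3.5, 5.0)
)

# geom_bar(): bar height is the number of rows per Brand
p_bar <- ggplot(toy, aes(Brand)) + geom_bar()

# geom_col(): bar height comes from Stars; rows that share a Brand are
# stacked by default, so the bar for A reaches 4.0 + 3.5
p_col <- ggplot(toy, aes(Brand, Stars)) + geom_col()

bar_heights <- layer_data(p_bar)$count  # counts per Brand
col_data    <- layer_data(p_col)        # ymin/ymax of each stacked segment
max(col_data$ymax)                      # top of the tallest stacked bar
```

Because geom_col stacks rows that share an x value, a bar in the ramen plot represents the sum of the Stars values for that brand, split by Style through the fill aesthetic.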
Mapping is the most interesting feature: if the data has regions specified, you can plot where each point is located. The function map_data returns the longitude and latitude of a specified region, such as a state or a country. The ramen data is international; luckily, we can pass "world" to map_data so that it returns the coordinates of every country in the world.
map <- map_data("world")
This data is very large, however, and we do not need every coordinate for the same country. So I used the distinct function to keep one row per country from the map_data result and called it map.regions.set.
map.regions.set <- distinct(map, region, .keep_all = TRUE)
Now that I have the unique regions, I can left join the ramen data to them where Country equals region. That way we have the ramen information from the original dataset joined with the coordinates for the region of each point.
ramen.map <- left_join(ramen_ratings, map.regions.set, by = c("Country"="region"))
ggplot() +
  geom_map(
    data = map, map = map,
    aes(long, lat, map_id = region),
    color = "white", fill = "lightgray", size = 0.1
  ) +
  geom_point(
    data = ramen.map,
    aes(long, lat, color = Style),
    alpha = 0.7
  )
## Warning: Ignoring unknown aesthetics: x, y
## Warning: Removed 148 rows containing missing values (geom_point).
The map above shows where each style of ramen is found around the world. At a glance, one can see that Pack is a popular ramen style.
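The distinct-then-join step used to build ramen.map can be sketched on toy tables. Everything here is hypothetical: the coordinates are made up, and toy_map, toy_ramen, toy_regions and toy_joined are illustrative names standing in for the real map and ramen data.

```r
library(dplyr)

# Toy stand-in for map_data("world"): several boundary points per region
# (coordinates are made up for illustration)
toy_map <- data.frame(
  long   = c(139.7, 140.1, 121.5, 121.0),
  lat    = c(35.7, 36.0, 25.0, 24.5),
  region = c("Japan", "Japan", "Taiwan", "Taiwan")
)

# Toy stand-in for ramen_ratings; "Atlantis" has no matching region
toy_ramen <- data.frame(
  Brand   = c("X", "Y", "Z"),
  Country = c("Japan", "Taiwan", "Atlantis")
)

# Keep the first row per region, along with its other columns
toy_regions <- distinct(toy_map, region, .keep_all = TRUE)

# Every ramen row is kept; countries without a match get NA coordinates
toy_joined <- left_join(toy_ramen, toy_regions, by = c("Country" = "region"))
toy_joined
```

Two things follow from this sketch. The coordinate kept for each country is simply the first boundary point in the map data, not a center of the country. And rows whose Country has no matching region end up with NA coordinates, which is likely the source of the "Removed 148 rows containing missing values" warning above.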
ggplot2 is a very powerful library within the tidyverse that lets you make a wide range of visualizations from your data. With visualizations, data scientists can present key findings in an easy-to-understand way.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., & Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686
https://www.kaggle.com/datasets/residentmario/ramen-ratings?resource=download
https://datavizpyr.com/how-to-make-world-map-with-ggplot2-in-r/
head(ramen_ratings)
## Review.. Brand
## 1 2580 New Touch
## 2 2579 Just Way
## 3 2578 Nissin
## 4 2577 Wei Lih
## 5 2576 Ching's Secret
## 6 2575 Samyang Foods
## Variety Style Country
## 1 T's Restaurant Tantanmen Cup Japan
## 2 Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles Pack Taiwan
## 3 Cup Noodles Chicken Vegetable Cup USA
## 4 GGE Ramen Snack Tomato Flavor Pack Taiwan
## 5 Singapore Curry Pack India
## 6 Kimchi song Song Ramen Pack South Korea
## Stars Top.Ten
## 1 3.75
## 2 1.00
## 3 2.25
## 4 2.75
## 5 3.75
## 6 4.75
ramen_ratings <- ramen_ratings %>%
  rename(Reviews = Review..,
         Ratings = Stars,
         Top_Ten = Top.Ten)
head(ramen_ratings)
## Reviews Brand
## 1 2580 New Touch
## 2 2579 Just Way
## 3 2578 Nissin
## 4 2577 Wei Lih
## 5 2576 Ching's Secret
## 6 2575 Samyang Foods
## Variety Style Country
## 1 T's Restaurant Tantanmen Cup Japan
## 2 Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles Pack Taiwan
## 3 Cup Noodles Chicken Vegetable Cup USA
## 4 GGE Ramen Snack Tomato Flavor Pack Taiwan
## 5 Singapore Curry Pack India
## 6 Kimchi song Song Ramen Pack South Korea
## Ratings Top_Ten
## 1 3.75
## 2 1.00
## 3 2.25
## 4 2.75
## 5 3.75
## 6 4.75
summary <- ramen_ratings %>%
  group_by(Brand) %>%
  summarise(Ratings = as.integer(mean(Ratings, na.rm = TRUE))) %>%
  arrange(desc(Ratings)) %>%
  filter(Ratings == 5)
summary
## # A tibble: 24 x 2
## Brand Ratings
## <chr> <int>
## 1 ChoripDong 5
## 2 Daddy 5
## 3 Daifuku 5
## 4 Foodmon 5
## 5 Higashi 5
## 6 Jackpot Teriyaki 5
## 7 Kiki Noodle 5
## 8 Kimura 5
## 9 Komforte Chockolates 5
## 10 MyOri 5
## # ... with 14 more rows
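One detail of the summary above is worth a small check: as.integer() truncates rather than rounds, so only brands whose mean rating is exactly 5 pass the Ratings == 5 filter. The sketch below uses hypothetical ratings; toy, toy_summary and the column names mean_rating, truncated and rounded are illustrative assumptions.

```r
library(dplyr)

# Toy ratings (hypothetical): brand A averages 5, brand B averages 4.875
toy <- data.frame(
  Brand   = c("A", "A", "B", "B"),
  Ratings = c(5, 5, 5, 4.75)
)

toy_summary <- toy %>%
  group_by(Brand) %>%
  summarise(
    mean_rating = mean(Ratings, na.rm = TRUE),
    truncated   = as.integer(mean_rating),  # drops the fractional part
    rounded     = round(mean_rating)        # rounds to the nearest whole number
  )
toy_summary
```

Here brand B would be excluded by the truncating version even though its mean is close to 5. Whether truncation or rounding is the better choice depends on what the chart is meant to show.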
Package ggplot2 - for Visualization
ggplot(summary, aes(x = Ratings, y = Brand, fill = Ratings)) +
  geom_col(position = "dodge")
# glimpse(ramen_ratings)
top_ten_df <- filter(ramen_ratings, Top_Ten != "")
head(top_ten_df)
## Reviews Brand Variety Style
## 1 1964 MAMA Instant Noodles Coconut Milk Flavour Pack
## 2 1947 Prima Taste Singapore Laksa Wholegrain La Mian Pack
## 3 1925 Prima Juzz's Mee Creamy Chicken Flavour Pack
## 4 1907 Prima Taste Singapore Curry Wholegrain La Mian Pack
## 5 1828 Tseng Noodles Scallion With Sichuan Pepper Flavor Pack
## 6 1689 Wugudaochang Tomato Beef Brisket Flavor Purple Potato Noodle Pack
## Country Ratings Top_Ten
## 1 Myanmar 5 2016 #10
## 2 Singapore 5 2016 #1
## 3 Singapore 5 2016 #8
## 4 Singapore 5 2016 #5
## 5 Taiwan 5 2016 #9
## 6 China 5 2016 #7
top_ten_df <- top_ten_df %>% separate(Top_Ten, c("Year", "Ranking"))
head(top_ten_df)
## Reviews Brand Variety Style
## 1 1964 MAMA Instant Noodles Coconut Milk Flavour Pack
## 2 1947 Prima Taste Singapore Laksa Wholegrain La Mian Pack
## 3 1925 Prima Juzz's Mee Creamy Chicken Flavour Pack
## 4 1907 Prima Taste Singapore Curry Wholegrain La Mian Pack
## 5 1828 Tseng Noodles Scallion With Sichuan Pepper Flavor Pack
## 6 1689 Wugudaochang Tomato Beef Brisket Flavor Purple Potato Noodle Pack
## Country Ratings Year Ranking
## 1 Myanmar 5 2016 10
## 2 Singapore 5 2016 1
## 3 Singapore 5 2016 8
## 4 Singapore 5 2016 5
## 5 Taiwan 5 2016 9
## 6 China 5 2016 7
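How separate() splits values such as "2016 #10" can be seen on a toy column. This is a minimal sketch with hypothetical rows; toy, as_text and as_int are illustrative names. The optional convert = TRUE argument is an addition that the code above does not use.

```r
library(tidyr)

# Toy column in the same "Year #Rank" shape as Top_Ten (hypothetical rows)
toy <- data.frame(Top_Ten = c("2016 #10", "2016 #1", "2012 #10"))

# By default separate() splits on runs of non-alphanumeric characters,
# so the space and the "#" together act as one separator
as_text <- separate(toy, Top_Ten, c("Year", "Ranking"))

# convert = TRUE asks tidyr to type-convert the new columns
as_int <- separate(toy, Top_Ten, c("Year", "Ranking"), convert = TRUE)

str(as_text)  # Year and Ranking are character
str(as_int)   # Year and Ranking are integer
```

Without conversion, Ranking stays a character column, so a discrete axis orders it alphabetically and "10" sorts before "2". With integer values the rankings sort numerically.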
filter(top_ten_df, Ranking == 10)
## Reviews Brand
## 1 1964 MAMA
## 2 1638 A-Sha Dry Noodle
## 3 1471 Mama
## 4 1302 Mama
## 5 608 Koka
## Variety Style Country
## 1 Instant Noodles Coconut Milk Flavour Pack Myanmar
## 2 Veggie Noodle Tomato Noodle With Vine Ripened Tomato Sauce Pack Taiwan
## 3 Instant Noodles Shrimp Creamy Tom Yum Flavour Jumbo Pack Pack Thailand
## 4 Instant Noodles Yentafo Tom Yum Mohfai Flavour Pack Thailand
## 5 Spicy Black Pepper Pack Singapore
## Ratings Year Ranking
## 1 5 2016 10
## 2 5 2015 10
## 3 5 2013 10
## 4 5 2014 10
## 5 5 2012 10
ggplot(top_ten_df, aes(x = Ranking, y = Brand, fill = Ranking)) +
  geom_col(position = "dodge")