The first step in building the chart of alcohol types consumed in the USA, Seychelles, Iceland, and Greece is to import the raw data set from FiveThirtyEight's GitHub repository. I rename the three servings columns to beer, spirit, and wine during the import, then call head() to view the first few rows and confirm that the import worked.
library(knitr)
library(tidyverse)
library(tidyr)
library(dplyr)
library(ggplot2)
Drinks <- read_csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv") %>%
  rename(beer = beer_servings, spirit = spirit_servings, wine = wine_servings)
head(Drinks)
## # A tibble: 6 x 5
## country beer spirit wine total_litres_of_pure_alcohol
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Afghanistan 0 0 0 0
## 2 Albania 89 132 54 4.9
## 3 Algeria 25 0 14 0.7
## 4 Andorra 245 138 312 12.4
## 5 Angola 217 57 45 5.9
## 6 Antigua & Barbuda 102 128 45 4.9
I then use the gather() function to convert the data frame from wide to long format. This stacks the beer, spirit, and wine columns into a single column called "type", with the matching values in a new "servings" column.
# Gather only the three drink columns; country and
# total_litres_of_pure_alcohol are left untouched as identifier columns
Drinks2 <- gather(Drinks, key = "type", value = "servings", beer, spirit, wine)
head(Drinks2)
## # A tibble: 6 x 4
## country total_litres_of_pure_alcohol type servings
## <chr> <dbl> <chr> <dbl>
## 1 Afghanistan 0 beer 0
## 2 Albania 4.9 beer 89
## 3 Algeria 0.7 beer 25
## 4 Andorra 12.4 beer 245
## 5 Angola 5.9 beer 217
## 6 Antigua & Barbuda 4.9 beer 102
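In current versions of tidyr, gather() is marked as superseded in favor of pivot_longer(). As a minimal sketch of the same reshaping, the block below runs pivot_longer() on a small hand-typed sample (the name drinks_sample is hypothetical, and its three rows are copied from the head() output above rather than read from the full file). One difference to expect: pivot_longer() returns the rows grouped by country, whereas gather() stacks all beer rows first.

```r
library(tidyr)
library(tibble)

# Hypothetical three-row sample, copied from the head() output above
drinks_sample <- tibble(
  country = c("Albania", "Algeria", "Andorra"),
  beer = c(89, 25, 245),
  spirit = c(132, 0, 138),
  wine = c(54, 14, 312),
  total_litres_of_pure_alcohol = c(4.9, 0.7, 12.4)
)

# Stack the three drink columns into a type/servings pair
drinks_long <- pivot_longer(
  drinks_sample,
  cols = c(beer, spirit, wine),
  names_to = "type",
  values_to = "servings"
)

drinks_long
```

Each country contributes three rows (one per drink type), so the three-row sample becomes a nine-row long table.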
I use the ggplot() function to replicate the chart of alcohol servings by country. The filter() function selects the countries of interest (USA, Seychelles, Iceland, and Greece). In geom_bar(), stat = "identity" plots the servings values as they are instead of counting rows, and position = "dodge" places the bars side by side rather than stacking them.
ggplot(
  data = Drinks2 %>%
    filter(country %in% c("USA", "Seychelles", "Iceland", "Greece")),
  aes(x = servings, y = country, fill = type)
) +
  geom_bar(stat = "identity", position = "dodge")
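As an aside, geom_col() is ggplot2's shorthand for geom_bar(stat = "identity"), so the same kind of dodged chart can be written a little more compactly. The sketch below is only illustrative: drinks_long is a hypothetical hand-typed long table built from the three sample rows shown in the earlier head() output, not the four countries in the chart, and horizontal bars from x = servings assume ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical long-format sample, values copied from the head() output above
drinks_long <- data.frame(
  country = rep(c("Albania", "Algeria", "Andorra"), times = 3),
  type = rep(c("beer", "spirit", "wine"), each = 3),
  servings = c(89, 25, 245, 132, 0, 138, 54, 14, 312)
)

# geom_col() uses the servings values directly; "dodge" sets bars side by side
p <- ggplot(drinks_long, aes(x = servings, y = country, fill = type)) +
  geom_col(position = "dodge")

p
```

Swapping geom_col() into the chart above should not change its appearance; it only removes the need to spell out stat = "identity".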