Introduction

In the article “Dear Mona Followup: Where Do People Drink The Most Beer, Wine And Spirits?”, Mona Chalabi sets out to understand which countries drink the most alcohol and which type of alcohol each one favors. She divides alcohol into three kinds (beer, spirits, and wine) and analyzes consumption of each, measured in servings, for every country.

Article from FiveThirtyEight: http://fivethirtyeight.com/datalab/dear-mona-followup-where-do-people-drink-the-most-beer-wine-and-spirits/

Dataset: https://github.com/fivethirtyeight/data/blob/master/alcohol-consumption/drinks.csv

alcohol_DF <- read.csv(file = "https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv", header = TRUE, sep = ",")

summary(alcohol_DF)
##    country          beer_servings   spirit_servings  wine_servings   
##  Length:193         Min.   :  0.0   Min.   :  0.00   Min.   :  0.00  
##  Class :character   1st Qu.: 20.0   1st Qu.:  4.00   1st Qu.:  1.00  
##  Mode  :character   Median : 76.0   Median : 56.00   Median :  8.00  
##                     Mean   :106.2   Mean   : 80.99   Mean   : 49.45  
##                     3rd Qu.:188.0   3rd Qu.:128.00   3rd Qu.: 59.00  
##                     Max.   :376.0   Max.   :438.00   Max.   :370.00  
##  total_litres_of_pure_alcohol
##  Min.   : 0.000              
##  1st Qu.: 1.300              
##  Median : 4.200              
##  Mean   : 4.717              
##  3rd Qu.: 7.200              
##  Max.   :14.400
# Add a new column that categorizes each country as low, moderate, or high alcohol consumption, using the 1st quartile (1.3) and 3rd quartile (7.2) of total_litres_of_pure_alcohol as cutoffs.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ stringr 1.4.0
## ✓ tidyr   1.1.4     ✓ forcats 0.5.1
## ✓ readr   2.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
New_alcohol_DF <- alcohol_DF %>%
  mutate(
    level_of_comsuption_alcohol = case_when(
      total_litres_of_pure_alcohol >= 7.2 ~ "High",
      total_litres_of_pure_alcohol >= 1.3 ~ "Moderate",
      TRUE ~ "Low"
    )
  )
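
The cutoffs 7.2 and 1.3 above are typed in from the summary() output. The following is a minimal sketch of one alternative, assuming base R's quantile() and cut() are acceptable stand-ins for the hardcoded case_when(). The helper name add_consumption_level and the column name level are hypothetical, and the five sample rows are copied from the printed output rather than read from the full dataset.

```r
# Five rows copied from the printed output; the full dataset has 193 countries.
sample_df <- data.frame(
  country = c("Afghanistan", "Albania", "Algeria", "Andorra", "Angola"),
  total_litres_of_pure_alcohol = c(0.0, 4.9, 0.7, 12.4, 5.9)
)

# Hypothetical helper (not part of the original analysis).
# By default the cutoffs are the 1st and 3rd quartiles of the data passed in.
add_consumption_level <- function(df,
                                  cuts = quantile(df$total_litres_of_pure_alcohol,
                                                  probs = c(0.25, 0.75),
                                                  names = FALSE)) {
  # right = FALSE closes each interval on the left, matching the >= tests
  df$level <- cut(df$total_litres_of_pure_alcohol,
                  breaks = c(-Inf, cuts[1], cuts[2], Inf),
                  labels = c("Low", "Moderate", "High"),
                  right = FALSE)
  df
}

# Pass the quartiles reported by summary() for the full dataset,
# because the quartiles of a five-row sample would differ.
labelled <- add_consumption_level(sample_df, cuts = c(1.3, 7.2))
print(labelled)
```

On the full data, calling add_consumption_level(alcohol_DF) without the cuts argument would compute the cutoffs from the data itself. cut() returns a factor whose levels are in the order Low, Moderate, High, so tables and plots list the groups in that order rather than alphabetically.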

New_alcohol_DF %>%
  slice(1:10)
##              country beer_servings spirit_servings wine_servings
## 1        Afghanistan             0               0             0
## 2            Albania            89             132            54
## 3            Algeria            25               0            14
## 4            Andorra           245             138           312
## 5             Angola           217              57            45
## 6  Antigua & Barbuda           102             128            45
## 7          Argentina           193              25           221
## 8            Armenia            21             179            11
## 9          Australia           261              72           212
## 10           Austria           279              75           191
##    total_litres_of_pure_alcohol level_of_comsuption_alcohol
## 1                           0.0                         Low
## 2                           4.9                    Moderate
## 3                           0.7                         Low
## 4                          12.4                        High
## 5                           5.9                    Moderate
## 6                           4.9                    Moderate
## 7                           8.3                        High
## 8                           3.8                    Moderate
## 9                          10.4                        High
## 10                          9.7                        High
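
The introduction also asks which type of alcohol each country favors, which the table above does not show directly. As a rough sketch, assuming "favorite" simply means the beverage with the most servings, the idea can be tried on five rows copied from the output above. The column name favorite is hypothetical, and a country with zero servings of everything is left as NA.

```r
# Five rows copied from the printed output above.
sample_df <- data.frame(
  country         = c("Afghanistan", "Albania", "Algeria", "Andorra", "Angola"),
  beer_servings   = c(0, 89, 25, 245, 217),
  spirit_servings = c(0, 132, 0, 138, 57),
  wine_servings   = c(0, 54, 14, 312, 45)
)

# Work on the three serving columns as a numeric matrix.
servings <- as.matrix(
  sample_df[, c("beer_servings", "spirit_servings", "wine_servings")]
)

# which.max() gives the position of the largest value in each row
# (the first one if there is a tie).
top_column <- colnames(servings)[apply(servings, 1, which.max)]

# A country with no recorded servings has no meaningful favorite.
top_column[rowSums(servings) == 0] <- NA

# Strip the "_servings" suffix to leave "beer", "spirit", or "wine".
sample_df$favorite <- sub("_servings$", "", top_column)
print(sample_df[, c("country", "favorite")])
```

The same steps should apply unchanged to alcohol_DF, since it has the same three serving columns.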

Data Columns

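# Keep only the countries labelled "High", then list them from highest to lowest consumption
high_consumption <- subset(New_alcohol_DF, level_of_comsuption_alcohol == "High")
high_consumption %>%
  select(country, total_litres_of_pure_alcohol, level_of_comsuption_alcohol) %>%
  arrange(desc(total_litres_of_pure_alcohol))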
##               country total_litres_of_pure_alcohol level_of_comsuption_alcohol
## 1             Belarus                         14.4                        High
## 2           Lithuania                         12.9                        High
## 3             Andorra                         12.4                        High
## 4             Grenada                         11.9                        High
## 5      Czech Republic                         11.8                        High
## 6              France                         11.8                        High
## 7  Russian Federation                         11.5                        High
## 8             Ireland                         11.4                        High
## 9          Luxembourg                         11.4                        High
## 10           Slovakia                         11.4                        High
## 11            Germany                         11.3                        High
## 12            Hungary                         11.3                        High
## 13           Portugal                         11.0                        High
## 14             Poland                         10.9                        High
## 15           Slovenia                         10.6                        High
## 16            Belgium                         10.5                        High
## 17             Latvia                         10.5                        High
## 18          Australia                         10.4                        High
## 19            Denmark                         10.4                        High
## 20            Romania                         10.4                        High
## 21     United Kingdom                         10.4                        High
## 22           Bulgaria                         10.3                        High
## 23            Croatia                         10.2                        High
## 24        Switzerland                         10.2                        High
## 25          St. Lucia                         10.1                        High
## 26            Finland                         10.0                        High
## 27              Spain                         10.0                        High
## 28        South Korea                          9.8                        High
## 29            Austria                          9.7                        High
## 30             Serbia                          9.6                        High
## 31            Estonia                          9.5                        High
## 32        Netherlands                          9.4                        High
## 33        New Zealand                          9.3                        High
## 34            Nigeria                          9.1                        High
## 35              Gabon                          8.9                        High
## 36            Ukraine                          8.9                        High
## 37                USA                          8.7                        High
## 38          Argentina                          8.3                        High
## 39             Greece                          8.3                        High
## 40             Uganda                          8.3                        High
## 41             Canada                          8.2                        High
## 42             Cyprus                          8.2                        High
## 43       South Africa                          8.2                        High
## 44  St. Kitts & Nevis                          7.7                        High
## 45          Venezuela                          7.7                        High
## 46              Chile                          7.6                        High
## 47           Paraguay                          7.3                        High
## 48             Brazil                          7.2                        High
## 49             Panama                          7.2                        High
## 50             Sweden                          7.2                        High

Conclusions

Extending Mona’s analysis, I found that a country with high alcohol consumption most likely has its own alcohol production. A country’s drinking culture also appears to be a significant reason for high consumption: all of the countries in the high-consumption group are non-Islamic countries.

Extend and Verify

library(ggplot2)

New_alcohol_DF %>%
  ggplot() +
  geom_bar(aes(y = level_of_comsuption_alcohol, fill = level_of_comsuption_alcohol)) +
  ggtitle("Number of countries by level of alcohol consumption") +
  xlab("Count") +
  ylab("") +
  theme(legend.position = "none")
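
The bar chart shows the counts only as bar lengths. A small sketch of a numeric check with base R's table() follows, using the ten labels printed in the first output table as sample data. The names levels_sample, level_factor, and level_counts are hypothetical.

```r
# Labels of the first ten countries, copied from the printed output above.
levels_sample <- c("Low", "Moderate", "Low", "High", "Moderate",
                   "Moderate", "High", "Moderate", "High", "High")

# Fix the level order so the table reads Low, Moderate, High
# instead of alphabetical order.
level_factor <- factor(levels_sample, levels = c("Low", "Moderate", "High"))

# Count how many countries fall into each level.
level_counts <- table(level_factor)
print(level_counts)
```

On the full data, table(New_alcohol_DF$level_of_comsuption_alcohol) gives the same kind of summary, and the High count should be 50, matching the 50 rows listed in the high-consumption table.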