In the article “Dear Mona Followup: Where Do People Drink The Most Beer, Wine And Spirits?”, Mona Chalabi wants to understand which countries drink the most alcohol and which type of alcohol is the favorite. She divides alcohol into three kinds (beer, spirits, and wine) and analyzes the servings of each consumed in each country.
Article from FiveThirtyEight: http://fivethirtyeight.com/datalab/dear-mona-followup-where-do-people-drink-the-most-beer-wine-and-spirits/
Dataset: https://github.com/fivethirtyeight/data/blob/master/alcohol-consumption/drinks.csv
alcohol_DF <- read.csv(file="https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv",header = TRUE, sep=",")
summary(alcohol_DF)
## country beer_servings spirit_servings wine_servings
## Length:193 Min. : 0.0 Min. : 0.00 Min. : 0.00
## Class :character 1st Qu.: 20.0 1st Qu.: 4.00 1st Qu.: 1.00
## Mode :character Median : 76.0 Median : 56.00 Median : 8.00
## Mean :106.2 Mean : 80.99 Mean : 49.45
## 3rd Qu.:188.0 3rd Qu.:128.00 3rd Qu.: 59.00
## Max. :376.0 Max. :438.00 Max. :370.00
## total_litres_of_pure_alcohol
## Min. : 0.000
## 1st Qu.: 1.300
## Median : 4.200
## Mean : 4.717
## 3rd Qu.: 7.200
## Max. :14.400
# Add a new column that categorizes each country's alcohol consumption as low, moderate, or high, using the 1st and 3rd quartiles of total_litres_of_pure_alcohol as cutoffs
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ stringr 1.4.0
## ✓ tidyr 1.1.4 ✓ forcats 0.5.1
## ✓ readr 2.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
New_alcohol_DF <- alcohol_DF %>%
  mutate(level_of_comsuption_alcohol = case_when(
    total_litres_of_pure_alcohol >= 7.2 ~ "High",      # at or above the 3rd quartile
    total_litres_of_pure_alcohol >= 1.3 ~ "Moderate",  # at or above the 1st quartile
    TRUE ~ "Low"
  ))
New_alcohol_DF %>%
slice(1:10)
## country beer_servings spirit_servings wine_servings
## 1 Afghanistan 0 0 0
## 2 Albania 89 132 54
## 3 Algeria 25 0 14
## 4 Andorra 245 138 312
## 5 Angola 217 57 45
## 6 Antigua & Barbuda 102 128 45
## 7 Argentina 193 25 221
## 8 Armenia 21 179 11
## 9 Australia 261 72 212
## 10 Austria 279 75 191
## total_litres_of_pure_alcohol level_of_comsuption_alcohol
## 1 0.0 Low
## 2 4.9 Moderate
## 3 0.7 Low
## 4 12.4 High
## 5 5.9 Moderate
## 6 4.9 Moderate
## 7 8.3 High
## 8 3.8 Moderate
## 9 10.4 High
## 10 9.7 High
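The cutoffs 1.3 and 7.2 used above were copied by hand from the summary() output. As a minimal sketch, the same labels could be derived from quantile() so the cutoffs follow the data. The helper name label_by_quartile and the data frame sample_DF are hypothetical, the code uses base R only, and the ten rows are just the ones printed above, so their quartiles (and therefore some labels) differ from those of the full dataset.

```r
# Illustrative subset: the ten rows printed above, not the full dataset
sample_DF <- data.frame(
  country = c("Afghanistan", "Albania", "Algeria", "Andorra", "Angola",
              "Antigua & Barbuda", "Argentina", "Armenia", "Australia", "Austria"),
  total_litres_of_pure_alcohol = c(0.0, 4.9, 0.7, 12.4, 5.9, 4.9, 8.3, 3.8, 10.4, 9.7)
)

# Hypothetical helper: label each value of x by the 1st and 3rd quartiles
# of a reference vector (by default, x itself)
label_by_quartile <- function(x, reference = x) {
  q <- quantile(reference, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  ifelse(x >= q[2], "High",
         ifelse(x >= q[1], "Moderate", "Low"))
}

sample_DF$level <- label_by_quartile(sample_DF$total_litres_of_pure_alcohol)
print(sample_DF)
```

On the full data, the same helper could be applied to alcohol_DF$total_litres_of_pure_alcohol, which should reproduce the hard-coded 1.3 and 7.2 cutoffs as long as the dataset is unchanged.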
high_consumption <- subset(New_alcohol_DF, level_of_comsuption_alcohol == "High")
high_consumption %>%
select(c(1,5,6))%>%
arrange(desc(total_litres_of_pure_alcohol))
## country total_litres_of_pure_alcohol level_of_comsuption_alcohol
## 1 Belarus 14.4 High
## 2 Lithuania 12.9 High
## 3 Andorra 12.4 High
## 4 Grenada 11.9 High
## 5 Czech Republic 11.8 High
## 6 France 11.8 High
## 7 Russian Federation 11.5 High
## 8 Ireland 11.4 High
## 9 Luxembourg 11.4 High
## 10 Slovakia 11.4 High
## 11 Germany 11.3 High
## 12 Hungary 11.3 High
## 13 Portugal 11.0 High
## 14 Poland 10.9 High
## 15 Slovenia 10.6 High
## 16 Belgium 10.5 High
## 17 Latvia 10.5 High
## 18 Australia 10.4 High
## 19 Denmark 10.4 High
## 20 Romania 10.4 High
## 21 United Kingdom 10.4 High
## 22 Bulgaria 10.3 High
## 23 Croatia 10.2 High
## 24 Switzerland 10.2 High
## 25 St. Lucia 10.1 High
## 26 Finland 10.0 High
## 27 Spain 10.0 High
## 28 South Korea 9.8 High
## 29 Austria 9.7 High
## 30 Serbia 9.6 High
## 31 Estonia 9.5 High
## 32 Netherlands 9.4 High
## 33 New Zealand 9.3 High
## 34 Nigeria 9.1 High
## 35 Gabon 8.9 High
## 36 Ukraine 8.9 High
## 37 USA 8.7 High
## 38 Argentina 8.3 High
## 39 Greece 8.3 High
## 40 Uganda 8.3 High
## 41 Canada 8.2 High
## 42 Cyprus 8.2 High
## 43 South Africa 8.2 High
## 44 St. Kitts & Nevis 7.7 High
## 45 Venezuela 7.7 High
## 46 Chile 7.6 High
## 47 Paraguay 7.3 High
## 48 Brazil 7.2 High
## 49 Panama 7.2 High
## 50 Sweden 7.2 High
To extend Mona’s analysis, I found that a country with high alcohol consumption most likely has its own alcohol production. A country’s drinking culture also appears to be a significant reason for high consumption: all of the countries in the high-consumption group are non-Islamic countries.
extend and verify
library(ggplot2)
# Horizontal bar chart: number of countries in each consumption level
New_alcohol_DF %>%
  ggplot() +
  geom_bar(aes(y = level_of_comsuption_alcohol, fill = level_of_comsuption_alcohol)) +
  ggtitle("Number of countries by level of alcohol consumption") +
  xlab("Count") +
  ylab("") +
  theme(legend.position = "none")
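To check the bar heights against exact numbers, the levels can also be tabulated. The following is a small sketch in base R; levels_sample is a hypothetical vector holding only the labels of the ten rows printed earlier, so its counts are not those of the full dataset.

```r
# Illustrative subset: the consumption levels of the ten rows printed earlier
levels_sample <- c("Low", "Moderate", "Low", "High", "Moderate",
                   "Moderate", "High", "Moderate", "High", "High")

# Fix the category order so the table reads Low, Moderate, High
level_counts <- table(factor(levels_sample, levels = c("Low", "Moderate", "High")))
print(level_counts)
```

On the full data, table(New_alcohol_DF$level_of_comsuption_alcohol) gives the counts behind each bar; the "High" count should match the 50 countries listed above.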