Assignment 4

Let’s start by loading the libraries:

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──

## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.0

## ── Conflicts ────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(tidyr)
library(dplyr)
library(fivethirtyeight)
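The fivethirtyeight package is not part of the tidyverse, so library(fivethirtyeight) fails on a machine where it has never been installed. A minimal sketch of a guard, assuming the package is installed from CRAN with the default repository settings:

```r
# Install fivethirtyeight only if it is not already available
# (assumption: a CRAN mirror is configured and reachable).
if (!requireNamespace("fivethirtyeight", quietly = TRUE)) {
  install.packages("fivethirtyeight")
}

library(fivethirtyeight)

# TRUE once the package can be loaded
fivethirtyeight_available <- requireNamespace("fivethirtyeight", quietly = TRUE)
fivethirtyeight_available
```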

Instructions: We are going to be using the drinks dataset from the fivethirtyeight package (you will need to install it), as reported in Mona Chalabi’s article “Dear Mona Followup: Where Do People Drink The Most Beer, Wine, and Spirits?”

Replicate, as best you can, the horizontal bar chart for the four countries shown below. Hint: you will need to convert from wide to long. Publish your chart and code as a report to RPubs and link your report in the submission text below.

[Figure: chart example]

Grading criteria:

perfect replication - 100 points

near-perfect replication (i.e., minor differences between the chart shown and your submission) - 90 points

the chart shows correct data, but not a near-perfect replication - 80 points

chart visually represents the data correctly but significantly differs from the one shown - 70 points

chart visually represents the data incorrectly and significantly differs from the one shown - 60 points

there is no chart but a long data frame was created with the proper data - 50 points

Now, let’s explore the dataset:

?drinks

After running the help command above, we see that drinks is a data frame with 193 rows, one per country, and 5 variables:

country: Country

beer_servings: Servings of beer in average serving sizes per person

spirit_servings: Servings of spirits in average serving sizes per person

wine_servings: Servings of wine in average serving sizes per person

total_litres_of_pure_alcohol: Total litres of pure alcohol per person
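The help page describes the data, but it can also be checked directly in the console. A small sketch, assuming the fivethirtyeight and dplyr packages are installed; the object names drinks_dim and drinks_cols are only illustrative:

```r
library(fivethirtyeight)
library(dplyr)

# Rows and columns of the data frame (the help page reports 193 rows and 5 variables)
drinks_dim <- dim(drinks)
drinks_dim

# Column names, to compare against the variable list above
drinks_cols <- names(drinks)
drinks_cols

# Compact overview: one line per column with its type and first values
glimpse(drinks)
```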

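Now, let’s do some data wrangling with the tidyverse: filter the four countries we want, drop the total_litres_of_pure_alcohol column, shorten the remaining column names, and store the result in drinks_Subgroup: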

drinks_Subgroup <- drinks %>% 
  filter(country %in% c("USA", "Seychelles", "Iceland", "Greece")) %>% 
  select(-total_litres_of_pure_alcohol) %>% 
  rename(beer = beer_servings, spirit = spirit_servings, wine = wine_servings)
drinks_Subgroup

## # A tibble: 4 x 4
##   country     beer spirit  wine
##   <chr>      <int>  <int> <int>
## 1 Greece       133    112   218
## 2 Iceland      233     61    78
## 3 Seychelles   157     25    51
## 4 USA          249    158    84
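Because filter() with %in% silently drops any name that does not match exactly (for example "United States" instead of "USA"), it can be worth confirming that every requested country was found. A minimal sketch, assuming the fivethirtyeight package is installed; wanted and missing_countries are illustrative names:

```r
library(fivethirtyeight)

# The four countries requested in the filter step
wanted <- c("USA", "Seychelles", "Iceland", "Greece")

# Names in `wanted` that do not appear in the country column;
# an empty character vector means every name matched
missing_countries <- setdiff(wanted, drinks$country)
missing_countries
```

Here the result is empty, which agrees with the four rows printed above.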

Now, let’s convert our data from wide to long (tidy) format:

drinks_smaller_tidy <- drinks_Subgroup %>% 
  pivot_longer(names_to = "type", 
               values_to = "servings", 
               cols = -country)
drinks_smaller_tidy

## # A tibble: 12 x 3
##    country    type   servings
##    <chr>      <chr>     <int>
##  1 Greece     beer        133
##  2 Greece     spirit      112
##  3 Greece     wine        218
##  4 Iceland    beer        233
##  5 Iceland    spirit       61
##  6 Iceland    wine         78
##  7 Seychelles beer        157
##  8 Seychelles spirit       25
##  9 Seychelles wine         51
## 10 USA        beer        249
## 11 USA        spirit      158
## 12 USA        wine         84
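One way to check the reshaping is to confirm the row count (4 countries times 3 drink types) and to pivot the long table back to wide. The sketch below is self-contained: it rebuilds the four-row table from the values printed above rather than relying on earlier objects, and the names drinks_wide, drinks_long and drinks_back are illustrative:

```r
library(tibble)
library(tidyr)
library(dplyr)

# Wide table, copied from the printed output of drinks_Subgroup
drinks_wide <- tribble(
  ~country,     ~beer, ~spirit, ~wine,
  "Greece",      133L,    112L,  218L,
  "Iceland",     233L,     61L,   78L,
  "Seychelles",  157L,     25L,   51L,
  "USA",         249L,    158L,   84L
)

# Wide to long: one row per country and drink type
drinks_long <- drinks_wide %>%
  pivot_longer(cols = -country, names_to = "type", values_to = "servings")

# Each country contributes one row per drink type
n_long <- nrow(drinks_long)

# Long back to wide, to confirm no values were lost or changed
drinks_back <- drinks_long %>%
  pivot_wider(names_from = type, values_from = servings)

round_trip_ok <- isTRUE(all.equal(as.data.frame(drinks_back),
                                  as.data.frame(drinks_wide)))
round_trip_ok
```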

Now, let’s plot our data:

library(ggplot2)
ggplot(drinks_smaller_tidy, aes(x = country, y = servings, fill = type)) +
  geom_col(position = "dodge") +
  coord_flip()
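To move the chart closer to a target figure, the country order and the labels can be adjusted. The sketch below is only one possibility: the target chart is not reproduced in this report, so the ordering (alphabetical from top to bottom) and the label text are assumptions, not the assignment's actual styling. It rebuilds the long table from the values printed earlier so that it runs on its own:

```r
library(ggplot2)
library(tibble)

# Long table, copied from the printed output of drinks_smaller_tidy
drinks_long <- tribble(
  ~country,     ~type,    ~servings,
  "Greece",     "beer",        133L,
  "Greece",     "spirit",      112L,
  "Greece",     "wine",        218L,
  "Iceland",    "beer",        233L,
  "Iceland",    "spirit",       61L,
  "Iceland",    "wine",         78L,
  "Seychelles", "beer",        157L,
  "Seychelles", "spirit",       25L,
  "Seychelles", "wine",         51L,
  "USA",        "beer",        249L,
  "USA",        "spirit",      158L,
  "USA",        "wine",         84L
)

# After coord_flip() the first factor level is drawn at the bottom,
# so reversing the levels puts Greece at the top and USA at the bottom.
drinks_long$country <- factor(drinks_long$country,
                              levels = rev(unique(drinks_long$country)))

p <- ggplot(drinks_long, aes(x = country, y = servings, fill = type)) +
  geom_col(position = "dodge") +
  coord_flip() +
  # Label text is an assumption; adjust it to match the target chart
  labs(x = NULL, y = "Servings per person", fill = "Type")

p
```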

Richard Bigega

11/1/2020