Instructions:


Prepare a report with an interesting narrative that focuses on a subset of the data you find interesting and that includes both arsenic and fluoride data. Upload your report to RPubs and post a link to it in Piazza. You are required to join the data. It is up to you to determine how to handle missing values. Your document title should be exactly "Assignment 1: [first name] [last name]", where [first name] and [last name] are your actual name. (10 points)

Also, the HTML document you publish to RPubs must have the following elements: at least two level-two headers (##) and at least one bulleted list with at least two items. (5 points)

Setup

My setup chunk looks like this:
{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(scipen = 999)
library(tidyverse)
library(janitor)
library(kableExtra)
library(scales)
library(gapminder)
library(viridis)

Load data

I loaded the data and renamed some of the long columns so they’re easier to work with in R. I also dropped a few columns from the second dataset, because they already exist in the first dataset and would be duplicated when I merge the two.

coast_vs_waste <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-21/coastal-population-vs-mismanaged-plastic.csv") %>% 
  dplyr::rename("Mismanaged_waste" = "Mismanaged plastic waste (tonnes)", "Coastal_pop" = "Coastal population", 
                "Total_pop" = "Total population (Gapminder)", "country" = "Entity")

coast_vs_waste %>% head()
## # A tibble: 6 x 6
##   country     Code   Year Mismanaged_waste Coastal_pop Total_pop
##   <chr>       <chr> <dbl>            <dbl>       <dbl>     <dbl>
## 1 Afghanistan AFG    1800               NA          NA   3280000
## 2 Afghanistan AFG    1820               NA          NA   3280000
## 3 Afghanistan AFG    1870               NA          NA   4207000
## 4 Afghanistan AFG    1913               NA          NA   5730000
## 5 Afghanistan AFG    1950               NA          NA   8151455
## 6 Afghanistan AFG    1951               NA          NA   8276820
mismanaged_vs_gdp <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-21/per-capita-mismanaged-plastic-waste-vs-gdp-per-capita.csv") %>% 
  dplyr::rename("Waste_percapita" = "Per capita mismanaged plastic waste (kilograms per person per day)", 
                "GDP_percapita" = "GDP per capita, PPP (constant 2011 international $) (Rate)",
                "country" = "Entity") %>% 
  dplyr::select(-`Total population (Gapminder)`, -Code)

mismanaged_vs_gdp %>% head()
## # A tibble: 6 x 4
##   country      Year Waste_percapita GDP_percapita
##   <chr>       <dbl>           <dbl>         <dbl>
## 1 Afghanistan  1800              NA            NA
## 2 Afghanistan  1820              NA            NA
## 3 Afghanistan  1870              NA            NA
## 4 Afghanistan  1913              NA            NA
## 5 Afghanistan  1950              NA            NA
## 6 Afghanistan  1951              NA            NA
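As an alternative to renaming each column by hand, janitor (already loaded in the setup chunk) provides clean_names(), which converts every column name to snake_case in one step. The sketch below uses a small made-up tibble, toy_waste, whose column names mimic the raw CSV headers; the object names and values are invented for illustration only.

```r
library(tibble)
library(janitor)

# Hypothetical toy data: column names mimic the raw CSV headers, values are made up
toy_waste <- tibble(
  Entity = c("A", "B"),
  `Coastal population` = c(100, 200),
  `Mismanaged plastic waste (tonnes)` = c(5, NA)
)

# clean_names() rewrites every column name into snake_case
toy_clean <- clean_names(toy_waste)

# Inspect the cleaned names
names(toy_clean)
```

The cleaned names tend to be longer than hand-picked ones, so rename() is still useful when you want short names.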
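# gapminder has one row per country per year, so keep each country/continent pair once
gap_continent <- gapminder %>% 
  dplyr::select(country, continent) %>% 
  dplyr::distinct()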
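
A country-to-continent lookup like gap_continent is typically joined to other data by country. Because gapminder stores one row per country per year, the lookup should contain each country only once; otherwise the join multiplies rows. A minimal sketch with made-up stand-ins (toy_gap and toy_waste are hypothetical, not the real data):

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for gapminder: one row per country per year
toy_gap <- tibble(
  country   = c("A", "A", "B", "B"),
  continent = c("Asia", "Asia", "Europe", "Europe"),
  year      = c(2002, 2007, 2002, 2007)
)

# Hypothetical waste data: one row per country
toy_waste <- tibble(country = c("A", "B", "C"), waste = c(10, 20, 30))

# Keep each country/continent pair once
toy_lookup <- toy_gap %>%
  select(country, continent) %>%
  distinct()

# Every row of toy_waste is kept; country "C" has no match and gets NA
toy_joined <- toy_waste %>%
  left_join(toy_lookup, by = "country")

nrow(toy_joined)
```

Comparing nrow() before and after a join is a quick way to confirm that no rows were duplicated.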


Metadata:

Plastic pollution is a major and growing problem that harms oceans and wildlife. Our World in Data provides plastic-waste data at several levels: globally, per country, and over time.

coast_vs_waste.csv

| variable | class | description | name changed to |
|---|---|---|---|
| Entity | character | Country name | country |
| Code | character | 3-letter country code | Code |
| Year | integer (date) | Year | Year |
| Mismanaged plastic waste (tonnes) | double | Tonnes of mismanaged plastic waste | Mismanaged_waste |
| Coastal population | double | Number of individuals living on/near coast | Coastal_pop |
| Total population (Gapminder) | double | Total population according to Gapminder | Total_pop |


mismanaged_vs_gdp.csv

| variable | class | description | name changed to |
|---|---|---|---|
| Entity | character | Country name | country |
| Code | character | 3-letter country code | (dropped) |
| Year | integer (date) | Year | Year |
| Per capita mismanaged plastic waste (kg per person per day) | double | Amount of mismanaged plastic waste per capita in kg/day | Waste_percapita |
| GDP per capita | double | GDP per capita, PPP, constant 2011 international $ (rate) | GDP_percapita |
| Total population (Gapminder) | double | Total population according to Gapminder | (dropped) |


Join datasets

Here I joined the two datasets with left_join(), matching on country and Year, and then removed rows with NA in the Mismanaged_waste column using filter(!is.na(Mismanaged_waste)).

waste_df <- coast_vs_waste %>% 
  left_join(mismanaged_vs_gdp, by = c("country", "Year")) %>% 
  filter(!is.na(Mismanaged_waste))

waste_df %>% head(10)
## # A tibble: 10 x 8
##    country Code   Year Mismanaged_waste Coastal_pop Total_pop
##    <chr>   <chr> <dbl>            <dbl>       <dbl>     <dbl>
##  1 Albania ALB    2010            29705     2530533   3204284
##  2 Algeria DZA    2010           520555    16556580  35468208
##  3 Angola  AGO    2010            62528     3790041  19081912
##  4 Anguil~ AIA    2010               52       14561     15358
##  5 Antigu~ ATG    2010             1253       66843     88710
##  6 Argent~ ARG    2010           157777    16449245  40412376
##  7 Aruba   ABW    2010              372      137910    107488
##  8 Austra~ AUS    2010            13889    17235954  22268384
##  9 Bahamas BHS    2010             1333      341145    342877
## 10 Bahrain BHR    2010             4376      743574   1261835
## # ... with 2 more variables: Waste_percapita <dbl>, GDP_percapita <dbl>
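
The join and the missing-value filter can be checked on a small example. The sketch below uses made-up tibbles (toy_left and toy_right are hypothetical) to illustrate that left_join() keeps every row of the left table and that filtering on one column removes only the rows missing that column. tidyr::drop_na() is shown as one alternative way to write the same filter.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical left table: one row is missing Mismanaged_waste
toy_left <- tibble(
  country = c("A", "A", "B"),
  Year    = c(2009, 2010, 2010),
  Mismanaged_waste = c(NA, 50, 70)
)

# Hypothetical right table: no row for country "A" in 2009
toy_right <- tibble(
  country = c("A", "B"),
  Year    = c(2010, 2010),
  GDP_percapita = c(1000, NA)
)

# All rows of toy_left are kept; unmatched keys get NA in GDP_percapita
toy_joined <- toy_left %>%
  left_join(toy_right, by = c("country", "Year"))

# Drop only the rows where Mismanaged_waste is missing;
# rows with NA in other columns (here GDP_percapita) are kept
toy_filtered <- toy_joined %>% filter(!is.na(Mismanaged_waste))

# Alternative spelling of the same filter using tidyr
toy_dropped <- toy_joined %>% drop_na(Mismanaged_waste)

nrow(toy_joined)
nrow(toy_filtered)
nrow(toy_dropped)
```

Whether to also drop rows with missing values in other columns is a judgment call that depends on which variables your narrative uses.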


Create a table using kable()


Most mismanaged waste (top 5 producers)

I’ve hidden this code because I’d like you to write your own code for creating a table. The table shows the five countries with the most mismanaged waste, sorted in descending order.

| country | Mismanaged_waste | Total_pop | Waste_percapita |
|---|---:|---:|---:|
| China | 8819717 | 1341335152 | 0.092 |
| Indonesia | 3216856 | 239870937 | 0.047 |
| Philippines | 1883659 | 93260798 | 0.062 |
| Vietnam | 1833819 | 87848445 | 0.090 |
| Sri Lanka | 1591179 | 20859949 | 0.299 |


Create a data visualization


Discuss findings


You should say some things like…
“It was surprising that countries with small populations had some of the largest mismanaged waste per capita because…”
or
“I was surprised that the USA had a pretty low mismanaged waste per capita. It seems like the USA has a lot of mismanaged waste to me!”
or
“It makes sense that small, wealthy nations like Sweden, Canada, and Japan would produce less mismanaged waste per capita. In contrast, developing nations like Sri Lanka would have many waste management challenges because of a lack of infrastructure, which would lead to high levels of mismanaged waste despite small population sizes.”
or
“I struggled to figure out how to remove missing values.”
or
“It was difficult to come up with an interesting and informative figure for these data because…”