This is an extension of the tidytuesday assignment you have already done. Complete the questions below, using the screencast you chose for the tidytuesday assignment.

Import data

library(tidyverse)
coast_vs_waste <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-21/coastal-population-vs-mismanaged-plastic.csv")
## Parsed with column specification:
## cols(
##   Entity = col_character(),
##   Code = col_character(),
##   Year = col_double(),
##   `Mismanaged plastic waste (tonnes)` = col_double(),
##   `Coastal population` = col_double(),
##   `Total population (Gapminder)` = col_double()
## )

mismanaged_vs_gdp <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-21/per-capita-mismanaged-plastic-waste-vs-gdp-per-capita.csv")
## Parsed with column specification:
## cols(
##   Entity = col_character(),
##   Code = col_character(),
##   Year = col_double(),
##   `Per capita mismanaged plastic waste (kilograms per person per day)` = col_double(),
##   `GDP per capita, PPP (constant 2011 international $) (Rate)` = col_double(),
##   `Total population (Gapminder)` = col_double()
## )

waste_vs_gdp <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-21/per-capita-plastic-waste-vs-gdp-per-capita.csv")
## Parsed with column specification:
## cols(
##   Entity = col_character(),
##   Code = col_character(),
##   Year = col_double(),
##   `Per capita plastic waste (kilograms per person per day)` = col_double(),
##   `GDP per capita, PPP (constant 2011 international $) (constant 2011 international $)` = col_double(),
##   `Total population (Gapminder)` = col_double()
## )
coast_vs_waste
## # A tibble: 20,093 x 6
##    Entity  Code   Year `Mismanaged plastic… `Coastal populat… `Total population…
##    <chr>   <chr> <dbl>                <dbl>             <dbl>              <dbl>
##  1 Afghan… AFG    1800                   NA                NA            3280000
##  2 Afghan… AFG    1820                   NA                NA            3280000
##  3 Afghan… AFG    1870                   NA                NA            4207000
##  4 Afghan… AFG    1913                   NA                NA            5730000
##  5 Afghan… AFG    1950                   NA                NA            8151455
##  6 Afghan… AFG    1951                   NA                NA            8276820
##  7 Afghan… AFG    1952                   NA                NA            8407148
##  8 Afghan… AFG    1953                   NA                NA            8542906
##  9 Afghan… AFG    1954                   NA                NA            8684494
## 10 Afghan… AFG    1955                   NA                NA            8832253
## # … with 20,083 more rows
library(janitor)

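# Data cleaning: standardize the column names, then keep only the 2010 rows
clean_dataset <- function(tbl) {
  tbl %>%
    clean_names() %>%                  # snake_case column names (janitor)
    rename(country = entity,
           country_code = code) %>%
    filter(year == 2010) %>%
    select(-year)                      # year is constant after the filter
}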
plastic_waste <- coast_vs_waste %>%
  clean_dataset() %>%
  select(-total_population_gapminder) %>%
  inner_join(clean_dataset(mismanaged_vs_gdp) %>%
               select(-total_population_gapminder), by = c("country", "country_code")) %>%
  inner_join(clean_dataset(waste_vs_gdp), by = c("country", "country_code")) %>%
  select(country,
         country_code,
         mismanaged_waste = mismanaged_plastic_waste_tonnes,
         coastal_population,
         total_population = total_population_gapminder,
         mismanaged_per_capita = per_capita_mismanaged_plastic_waste_kilograms_per_person_per_day,
         gdp_per_capita = gdp_per_capita_ppp_constant_2011_international_rate) %>%
  filter(!is.na(mismanaged_waste))
mismanaged_vs_gdp
## # A tibble: 22,204 x 6
##    Entity  Code   Year `Per capita mismana… `GDP per capita, P… `Total populati…
##    <chr>   <chr> <dbl>                <dbl>               <dbl>            <dbl>
##  1 Afghan… AFG    1800                   NA                  NA          3280000
##  2 Afghan… AFG    1820                   NA                  NA          3280000
##  3 Afghan… AFG    1870                   NA                  NA          4207000
##  4 Afghan… AFG    1913                   NA                  NA          5730000
##  5 Afghan… AFG    1950                   NA                  NA          8151455
##  6 Afghan… AFG    1951                   NA                  NA          8276820
##  7 Afghan… AFG    1952                   NA                  NA          8407148
##  8 Afghan… AFG    1953                   NA                  NA          8542906
##  9 Afghan… AFG    1954                   NA                  NA          8684494
## 10 Afghan… AFG    1955                   NA                  NA          8832253
## # … with 22,194 more rows
waste_vs_gdp
## # A tibble: 22,204 x 6
##    Entity  Code   Year `Per capita plasti… `GDP per capita, PP… `Total populati…
##    <chr>   <chr> <dbl>               <dbl>                <dbl>            <dbl>
##  1 Afghan… AFG    1800                  NA                   NA          3280000
##  2 Afghan… AFG    1820                  NA                   NA          3280000
##  3 Afghan… AFG    1870                  NA                   NA          4207000
##  4 Afghan… AFG    1913                  NA                   NA          5730000
##  5 Afghan… AFG    1950                  NA                   NA          8151455
##  6 Afghan… AFG    1951                  NA                   NA          8276820
##  7 Afghan… AFG    1952                  NA                   NA          8407148
##  8 Afghan… AFG    1953                  NA                   NA          8542906
##  9 Afghan… AFG    1954                  NA                   NA          8684494
## 10 Afghan… AFG    1955                  NA                   NA          8832253
## # … with 22,194 more rows

Description of the data and definition of variables

The data set records plastic waste amounts, measured in tonnes, for countries identified by three-letter codes. It also offers other variables such as population (more specifically, coastal population) and the year each value was recorded. The waste values turned out to be NA for almost every year, so he filtered the data down to 2010 to narrow his results. Seeing how he cleaned his data gives me techniques I can apply when cleaning my own data. The coast_vs_waste data in particular focuses on waste in relation to each country's coastal population rather than its total population.
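One way to check the point about missing values is to count, for each year, how many rows actually have a value. The sketch below is only an illustration: the helper name count_non_missing_by_year and the toy tibble are made up for this example, and the toy numbers are not taken from the real data.

```r
library(dplyr)

# Hypothetical helper (not from the screencast): for one column, count how
# many rows per Year hold a non-missing value.
count_non_missing_by_year <- function(tbl, col) {
  tbl %>%
    filter(!is.na({{ col }})) %>%
    count(Year, name = "n_non_missing")
}

# Toy data invented for illustration only; values are not real measurements.
toy <- tibble(
  Entity = c("A", "A", "B", "B"),
  Year   = c(2009, 2010, 2009, 2010),
  waste  = c(NA, 10, NA, 20)
)

result <- count_non_missing_by_year(toy, waste)
print(result)
```

On the real data, the same helper could be called on coast_vs_waste with the column Mismanaged plastic waste (tonnes), wrapped in backticks, to see which years have any values to work with.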

Visualize data

Hint: One graph of your choice.

ggplot(plastic_waste, aes(gdp_per_capita, mismanaged_per_capita)) + 
  geom_point() +
  scale_x_log10() +
  scale_y_log10()

What is the story behind the graph?

The graph shows that very rich countries have a relatively low rate of mismanaged plastic waste per person, while countries with low wealth tend to have a higher rate. The correlation is not tight: the highest rates of mismanaged waste, in kilograms per person per day, appear mostly among middle-income countries. Overall, though, the graph shows a trend in which a lower GDP per capita goes along with a higher rate of mismanaged waste per capita.
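To put a number on how tight the relationship is, one option is a correlation on the same log scales the graph uses. This is a rough sketch, not part of the screencast: the helper name log_log_correlation is hypothetical, and the toy tibble is invented so the example runs on its own. The column names match those in plastic_waste.

```r
library(dplyr)

# Hypothetical helper: Pearson correlation between log10(GDP per capita)
# and log10(mismanaged waste per capita). Rows with missing or non-positive
# values are dropped first because log10() is undefined for them.
log_log_correlation <- function(tbl) {
  positive <- tbl %>%
    filter(gdp_per_capita > 0, mismanaged_per_capita > 0)
  cor(log10(positive$gdp_per_capita),
      log10(positive$mismanaged_per_capita))
}

# Toy data invented for illustration; it is perfectly log-linear, so the
# correlation comes out very close to -1.
toy <- tibble(
  gdp_per_capita        = c(100, 1000, 10000, NA),
  mismanaged_per_capita = c(0.1, 0.01, 0.001, 0.05)
)

r <- log_log_correlation(toy)
print(r)
```

Calling log_log_correlation(plastic_waste) would give the figure for the real data. A value near 0 would support the "not tight" reading, while a value near -1 would support a strong negative trend.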

Hide the messages, but display the code and its results on the webpage.
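A minimal sketch of one way to do this with knitr, assuming the document is knit from R Markdown and this code runs in a setup chunk near the top. Package startup notes and the readr column specifications are emitted as messages, so turning messages off should hide them while the code and printed results stay visible.

```r
library(knitr)

# Set defaults for every chunk that follows:
#   echo = TRUE      keeps the code visible
#   message = FALSE  hides messages such as package startup notes
opts_chunk$set(
  echo = TRUE,
  message = FALSE
)
```

The same options can also be set on a single chunk header instead of globally if only some chunks are noisy.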

Write your name for the author at the top.

Use the correct slug.